Hi! I keep digging into this problem... the high write rates continue.
Searching in the logs I see this:

2017-07-10 08:46:18.888 INFO (commitScheduler-11-thread-1) [c:ads s:shard2 r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStream [DWPT][commitScheduler-11-thread-1]: flushed: segment=_mb7 ramUsed=7.531 MB newFlushedSize=2.472 MB docs/MB=334.132

2017-07-10 08:46:29.336 INFO (commitScheduler-11-thread-1) [c:ads s:shard2 r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStream [DWPT][commitScheduler-11-thread-1]: flushed: segment=_mba ramUsed=8.079 MB newFlushedSize=1.784 MB docs/MB=244.978

A flush happens every 10 seconds (my autoSoftCommit time is 10 seconds and the hard commit is 5 minutes). Is this the expected behaviour? I thought soft commits did not write to disk...
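For reference, the relevant commit settings in our solrconfig.xml look roughly like this (paraphrased; the exact openSearcher flag may differ from what is actually deployed):

<!-- hard commit every 5 minutes: flushes and fsyncs index files, does not open a new searcher -->
<autoCommit>
  <maxTime>300000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- soft commit every 10 seconds: only makes new documents visible to searches -->
<autoSoftCommit>
  <maxTime>10000</maxTime>
</autoSoftCommit>

At the bottom of this mail I have also sketched the tlog and merge settings we plan to test, following the suggestions earlier in the thread.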
2017-07-06 0:02 GMT+02:00 Antonio De Miguel <deveto...@gmail.com>:

> Hi Erick.
>
> What I wanted to say is that we have enough memory to store the shards and,
> furthermore, the JVM heap spaces.
>
> The machine has 400 GB of RAM. I think we have enough.
>
> We have 10 JVMs running on the machine, each of them using 16 GB.
>
> Shard size is about 8 GB.
>
> When we have query or indexing peaks our problem is the CPU usage and
> the disk I/O, but we have a lot of unused memory.
>
> On 5/7/2017 19:04, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>> bq: We have enough physical RAM to store full collection and 16Gb for
>> each JVM.
>>
>> That's not quite what I was asking for. Lucene uses MMapDirectory to
>> map part of the index into the OS memory space. If you've
>> over-allocated the JVM space relative to your physical memory that
>> space can start swapping. Frankly I'd expect your query performance to
>> die if that was happening so this is a sanity check.
>>
>> How much physical memory does the machine have and how much memory is
>> allocated to _all_ of the JVMs running on that machine?
>>
>> see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>>
>> Best,
>> Erick
>>
>> On Wed, Jul 5, 2017 at 9:41 AM, Antonio De Miguel <deveto...@gmail.com> wrote:
>> > Hi Erick! Thanks for your response!
>> >
>> > Our soft commit is 5 seconds. Why does a soft commit generate I/O? That's
>> > news to me.
>> >
>> > We have enough physical RAM to store the full collection and 16 GB for each
>> > JVM. The collection is relatively small.
>> >
>> > I've tried (for testing purposes) disabling the transaction log (commenting
>> > out <updateLog>)... but the cluster does not come up. I'll try writing it to a
>> > separate drive, nice idea...
>> >
>> > 2017-07-05 18:04 GMT+02:00 Erick Erickson <erickerick...@gmail.com>:
>> >
>> >> What is your soft commit interval? That'll cause I/O as well.
>> >>
>> >> How much physical RAM and how much is dedicated to _all_ the JVMs on a
>> >> machine? One cause here is that Lucene uses MMapDirectory which can be
>> >> starved for OS memory if you use too much JVM, my rule of thumb is
>> >> that _at least_ half of the physical memory should be reserved for the
>> >> OS.
>> >>
>> >> Your transaction logs should fluctuate but even out. By that I mean
>> >> they should increase in size but every hard commit should truncate
>> >> some of them so I wouldn't expect them to grow indefinitely.
>> >>
>> >> One strategy is to put your tlogs on a separate drive exactly to
>> >> reduce contention. You could disable them too at a cost of risking
>> >> your data. That might be a quick experiment you could run though,
>> >> disable tlogs and see what that changes. Of course I'd do this on my
>> >> test system ;).
>> >>
>> >> But yeah, Solr will use a lot of I/O in the scenario you are outlining
>> >> I'm afraid.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Wed, Jul 5, 2017 at 8:08 AM, Antonio De Miguel <deveto...@gmail.com> wrote:
>> >> > Thanks Markus!
>> >> >
>> >> > We already have SSDs.
>> >> >
>> >> > About changing the topology... we tried 10 shards yesterday, but the
>> >> > system became more inconsistent than with the current topology (5x10).
>> >> > I don't know why... too much traffic perhaps?
>> >> >
>> >> > About the merge factor... we ran the default configuration for some days,
>> >> > but when a merge occurs the system overloads. We tried a mergeFactor of 4
>> >> > to improve query times and to have smaller merges.
>> >> >
>> >> > 2017-07-05 16:51 GMT+02:00 Markus Jelsma <markus.jel...@openindex.io>:
>> >> >
>> >> >> Try mergeFactor of 10 (default) which should be fine in most cases. If you
>> >> >> have an extreme case, either create more shards or consider better hardware
>> >> >> (SSDs).
>> >> >>
>> >> >> -----Original message-----
>> >> >> > From: Antonio De Miguel <deveto...@gmail.com>
>> >> >> > Sent: Wednesday 5th July 2017 16:48
>> >> >> > To: solr-user@lucene.apache.org
>> >> >> > Subject: Re: High disk write usage
>> >> >> >
>> >> >> > Thanks a lot Alessandro!
>> >> >> >
>> >> >> > Yes, we have very big physical dedicated machines, with a topology of
>> >> >> > 5 shards and 10 replicas per shard.
>> >> >> >
>> >> >> > 1. transaction log files are increasing, but not at this rate
>> >> >> >
>> >> >> > 2. we've tried values between 300 and 2000 MB... without any
>> >> >> > visible results
>> >> >> >
>> >> >> > 3. We don't use those features
>> >> >> >
>> >> >> > 4. No.
>> >> >> >
>> >> >> > 5. I've tried low and high merge factors and I think that is the key point.
>> >> >> >
>> >> >> > With a low merge factor (around 4) we have a high disk write rate, as I
>> >> >> > said previously.
>> >> >> >
>> >> >> > With a merge factor of 20 the disk write rate decreases, but now, with
>> >> >> > high qps rates (over 1000 qps), the system gets overloaded.
>> >> >> >
>> >> >> > I think that's the expected behaviour :(
>> >> >> >
>> >> >> > 2017-07-05 15:49 GMT+02:00 alessandro.benedetti <a.benede...@sease.io>:
>> >> >> >
>> >> >> > > Point 2 was the RAM buffer size:
>> >> >> > >
>> >> >> > > *ramBufferSizeMB* sets the amount of RAM that may be used by Lucene
>> >> >> > > indexing for buffering added documents and deletions before they
>> >> >> > > are flushed to the Directory.
>> >> >> > > maxBufferedDocs sets a limit on the number of documents buffered
>> >> >> > > before flushing.
>> >> >> > > If both ramBufferSizeMB and maxBufferedDocs is set, then
>> >> >> > > Lucene will flush based on whichever limit is hit first.
>> >> >> > >
>> >> >> > > <ramBufferSizeMB>100</ramBufferSizeMB>
>> >> >> > > <maxBufferedDocs>1000</maxBufferedDocs>
>> >> >> > >
>> >> >> > > -----
>> >> >> > > ---------------
>> >> >> > > Alessandro Benedetti
>> >> >> > > Search Consultant, R&D Software Engineer, Director
>> >> >> > > Sease Ltd. - www.sease.io
>> >> >> > > --
>> >> >> > > View this message in context:
>> >> >> > > http://lucene.472066.n3.nabble.com/High-disk-write-usage-tp4344356p4344386.html
>> >> >> > > Sent from the Solr - User mailing list archive at Nabble.com.
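PS: regarding the suggestion above of putting the tlogs on a separate drive, this is roughly what I plan to try in solrconfig.xml (the path is only an example, not our real mount point):

<updateLog>
  <!-- write the transaction logs to a dedicated disk to reduce contention with index writes -->
  <str name="dir">/mnt/tlog-disk/solr/ulog</str>
</updateLog>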
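PS2: and, for completeness, a sketch of the merge settings we have been experimenting with, expressed with the TieredMergePolicy elements (on older Solr versions the single <mergeFactor> element is the rough equivalent):

<indexConfig>
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <!-- roughly equivalent to mergeFactor=10, as suggested above; we previously tried 4 and 20 -->
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
  </mergePolicyFactory>
</indexConfig>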