From that short description, you should not be running optimize at all. Just stop doing it. It doesn’t make that big a difference.
It may take your indexes a few weeks to get back to a normal state after the forced merges.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Jun 17, 2020, at 4:12 AM, Raveendra Yerraguntla <raveend...@yahoo.com.INVALID> wrote:
>
> Thank you David, Walter, Erick.
>
> 1. The first time the bloated index was generated, there was no disk space issue. One copy of the index is 1/6 of disk capacity. We only ran into disk capacity problems after more than two bloated copies accumulated.
> 2. Solr was upgraded from 5.x. In 5.x, more than 5 segments caused a performance issue. Performance in 7.x has not been measured for increasing segment counts. I will plan a performance test to find the optimum number.
>
> The application does incremental indexing multiple times in a work week.
> I will keep you updated on the resolution.
>
> Thanks again
>
> On Tuesday, June 16, 2020, 07:34:26 PM EDT, Erick Erickson <erickerick...@gmail.com> wrote:
>
> It Depends (tm).
>
> As of Solr 7.5, optimize is different. See:
> https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
>
> So, assuming you have _not_ specified maxSegments=1, any very large segment (near 5G) that has _zero_ deleted documents won't be merged.
>
> So there are two scenarios:
>
> 1> What Walter mentioned: the optimize process runs out of disk space and leaves lots of crud around.
>
> 2> Your "older segments" are just max-sized segments with zero deletions.
>
> All that said… do you have demonstrable performance improvements after optimizing? The very name "optimize" is misleading; of course, who wouldn't want an optimized index? In earlier versions of Solr (i.e. 4.x) it made quite a difference. In more recent Solr releases, it's not as clear-cut. So before worrying about making optimize work, I'd recommend that you do some performance tests on optimized and un-optimized indexes. If there are significant improvements, that's one thing. Otherwise, it's a waste.
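Erick's rule above can be sketched as a toy model: without maxSegments=1, a segment at the ~5 GB max size with zero deleted documents is left alone by optimize, while everything else is eligible for rewriting. This is only an illustration of that rule of thumb, not Lucene's actual TieredMergePolicy logic, and the field names (`size_bytes`, `deleted_docs`) are invented for the sketch.

```python
MAX_SEGMENT_BYTES = 5 * 1024 ** 3  # default maxMergedSegmentMB is 5 GB

def eligible_for_optimize(segments, max_segments=None):
    """Return the segments an optimize would rewrite, per Erick's description.

    `segments` is a list of dicts with hypothetical 'size_bytes' and
    'deleted_docs' keys. With maxSegments=1, everything is rewritten;
    otherwise, max-sized segments with zero deletions are skipped.
    """
    if max_segments == 1:
        return list(segments)  # forced down to one segment: all rewritten
    return [s for s in segments
            if s["deleted_docs"] > 0 or s["size_bytes"] < MAX_SEGMENT_BYTES]

segments = [
    {"name": "_a", "size_bytes": 5 * 1024 ** 3, "deleted_docs": 0},     # max-sized, clean: skipped
    {"name": "_b", "size_bytes": 5 * 1024 ** 3, "deleted_docs": 1200},  # has deletes: merged
    {"name": "_c", "size_bytes": 512 * 1024 ** 2, "deleted_docs": 0},   # small: merged
]

print([s["name"] for s in eligible_for_optimize(segments)])     # ['_b', '_c']
print([s["name"] for s in eligible_for_optimize(segments, 1)])  # ['_a', '_b', '_c']
```

This is why "older segments" can legitimately survive an optimize on 7.x: they may simply be max-sized segments with no deletions.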
> Best,
> Erick
>
>> On Jun 16, 2020, at 5:36 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>
>> For a full forced merge (mistakenly named "optimize"), the worst-case disk space is 3X the size of the index. It is common to need 2X the size of the index.
>>
>> When I worked on Ultraseek Server 20+ years ago, it had the same merge behavior. I implemented a disk space check that would refuse to merge if there wasn't enough free space. It would log an error and send an email to the admin.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On Jun 16, 2020, at 1:58 PM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>>
>>> I can't give you a 100% true answer, but I've experienced this. What "seemed" to happen to me was that the optimize would start, driving the size up threefold, and if you run out of disk space in the process the optimize quits, since it can't finish, and leaves the live index pieces intact. So now you have the "current" index as well as the "optimized" fragments.
>>>
>>> I can't say for certain that's what you ran into, but we found that with an expanding disk it keeps growing and prevents this from happening; afterwards the index contracts and the disk shrinks back to only what it needs. That saved me a lot of headaches by never having to worry about disk space.
>>>
>>> On Tue, Jun 16, 2020 at 4:43 PM Raveendra Yerraguntla <raveend...@yahoo.com.invalid> wrote:
>>>
>>>> When the optimize command is issued, the expectation after the optimization process completes is that the index size either decreases or at most remains the same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command is issued, some of the shards' transient or older segment files are not deleted. This happens randomly across all shards. When unnoticed, these transient files fill the disk.
>>>> Currently it is handled through monitors, but the question is what causes the transient/older files to remain. Are there any specific race conditions that leave the older files undeleted?
>>>> Any pointers around this would be helpful.
>>>> TIA
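The disk-space guard Walter describes from Ultraseek can be sketched in a few lines: before issuing a forced merge, check that free space covers the worst case of 3x the current index size, and refuse otherwise. This is a sketch only; the function and parameter names are invented here, and in practice you would wire it in front of whatever triggers the optimize call and alert the admin on refusal, as Walter did.

```python
import shutil

def safe_to_optimize(index_size_bytes, volume="/", headroom=3.0):
    """Refuse a forced merge unless free disk covers the worst case.

    Rule of thumb from the thread: a full forced merge can need up to
    3x the index size in free space (headroom=3.0).
    """
    free = shutil.disk_usage(volume).free
    if free < headroom * index_size_bytes:
        print(f"refusing merge: need {headroom * index_size_bytes:,.0f} "
              f"bytes free, only {free:,} available")
        return False
    return True
```

A guard like this would also have prevented the partially-written "optimized" fragments described above, since the merge never starts when it cannot finish.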