Re: Pathological index condition

2017-08-28 Thread Erick Erickson
bq: I guess the alternative would be to occasionally roll the dice and decide to merge that kind of segment. That's what I was getting to with the "autoCompact" idea in a more deterministic manner. On Mon, Aug 28, 2017 at 1:32 PM, Walter Underwood wrote: > That makes

Re: Pathological index condition

2017-08-28 Thread Walter Underwood
That makes sense. I guess the alternative would be to occasionally roll the dice and decide to merge that kind of segment. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Aug 28, 2017, at 1:28 PM, Erick Erickson wrote: >

Re: Pathological index condition

2017-08-28 Thread Erick Erickson
I don't think jitter would help. As long as a segment's "live" docs add up to more than 50% of the max segment size, it's forever ineligible for merging (outside of optimize or expungeDeletes commands). So the "zone" is anything over 50%. Or I missed your point. Erick On Mon, Aug 28, 2017 at 12:50 PM, Walter Underwood
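
The eligibility rule in this message can be sketched in a few lines. This is a simplified illustration only, not Lucene's TieredMergePolicy code: the helper name `is_merge_eligible` and its parameters are hypothetical, and the 50% cutoff is taken from the message above.

```python
GB = 1024 ** 3

def is_merge_eligible(live_bytes, max_segment_bytes):
    """Hypothetical helper: a segment whose live docs exceed half the
    max segment size is skipped by natural merging."""
    return live_bytes <= 0.5 * max_segment_bytes

# Any segment above the 50% line is skipped, however many deletes it carries.
print(is_merge_eligible(4.75 * GB, 5 * GB))
print(is_merge_eligible(2 * GB, 5 * GB))
```

Because the rule is a one-sided cutoff rather than a narrow band, a small random shift of the threshold would still leave most of these segments above it, which appears to be the point being made about jitter.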

Re: Pathological index condition

2017-08-28 Thread Walter Underwood
If this happens in a precise zone, how about adding some random jitter to the threshold? That tends to get this kind of lock-up unstuck. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Aug 28, 2017, at 12:44 PM, Erick Erickson

Re: Pathological index condition

2017-08-28 Thread Erick Erickson
And one more thought (not very well thought out). A parameter on TMP (or whatever) that did <3> something like: > a parameter > a parameter > On startup TMP takes the current timestamp *> Every minute (or whatever) it checks the current timestamp and if it is in between the last check time and

Re: Pathological index condition

2017-08-27 Thread Erick Erickson
I've been thinking about this a little more. Since this is an outlier, I'm loath to change the core TMP merge selection process. Say the max segment size is 5G. You'd be doing an awful lot of I/O to merge a segment with 4.75G "live" docs with one with 0.25G. Plus that doesn't really allow users
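
To put a rough number on the I/O concern, here is a small worked example using the 4.75G and 0.25G figures from the message. It assumes that a merge copies all live data from its inputs into a new segment; the function name `merge_rewrite_cost` and the "ratio" metric are hypothetical and only meant as illustration.

```python
def merge_rewrite_cost(live_sizes_gb):
    """Return (GB written by the merge, GB written per GB of the smallest input).

    Assumes the merge rewrites every live document from every input segment.
    """
    written = sum(live_sizes_gb)
    return written, written / min(live_sizes_gb)

written_gb, ratio = merge_rewrite_cost([4.75, 0.25])
print(written_gb, ratio)
```

Under these assumptions the merge rewrites about 5G of data to absorb a 0.25G segment, which is the kind of cost the message is wary of.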

Re: Pathological index condition

2017-08-09 Thread Erick Erickson
Thanks Mike: bq: Or are you saying that each segment's 20% of not-deleted docs is still greater than 1/2 of the max segment size, and so TMP considers them ineligible? Exactly. Hadn't seen the blog, thanks for that. Added to my list of things to refer to. The problem we're seeing is that "in
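
The condition confirmed here implies a minimum total size for a segment that is both mostly deleted and still ineligible. A minimal sketch of that arithmetic follows; the function name `min_stuck_segment_gb` is hypothetical, and the 5G max segment size is an assumed figure used only for illustration.

```python
def min_stuck_segment_gb(max_segment_gb, deleted_fraction):
    """Smallest total segment size (GB) at which the remaining live docs
    still exceed half the max segment size, so the segment stays ineligible."""
    live_fraction = 1.0 - deleted_fraction
    return 0.5 * max_segment_gb / live_fraction

# With an assumed 5G cap and 80% deletes, the live 20% must exceed 2.5G.
print(min_stuck_segment_gb(5.0, 0.8))
```

With those assumed numbers, a segment would need to be well above the max segment size itself before its 20% of live docs crosses the 50% line.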

Re: Pathological index condition

2017-08-08 Thread Michael McCandless
Hi Erick, Some questions/answers below: On Sun, Aug 6, 2017 at 8:22 PM, Erick Erickson wrote: > Particularly interested if Mr. McCandless has any opinions here. > > I admit it took some work, but I can create an index that never merges > and is 80% deleted documents

Pathological index condition

2017-08-06 Thread Erick Erickson
Particularly interested if Mr. McCandless has any opinions here. I admit it took some work, but I can create an index that never merges and is 80% deleted documents using TieredMergePolicy. I'm trying to understand how indexes "in the wild" can have > 30% deleted documents. I think the root