Darn... I swear sometimes, when I try hard enough, I can hear my brain cells giving up to atrophy... Sigh.
D.

On Wed, Jan 27, 2021 at 4:44 AM David Smiley <dsmi...@apache.org> wrote:
>
> LOL and it was Dawid :-) Having amnesia Dawid?
> I think I've re-explored my own ideas before too.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless
> <luc...@mikemccandless.com> wrote:
>>
>> Oh I found this long ago (well, ~2 years) issue exploring this:
>> https://issues.apache.org/jira/browse/LUCENE-8580
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>
>>> > +1 to make a single merge concurrent! It is horribly frustrating to
>>> > watch that last merge running on a single core :) I have lost many hours
>>> > of my life to this frustration.
>>>
>>> > Yeah... it is, isn't it? Especially on new machines where you have
>>> > super-fast SSDs, countless cores, etc... That last merge consumes so few
>>> > resources that the computer feels practically idle... it's hard to
>>> > explain to people using our software (who invested in hardware) why we're
>>> > basically doing nothing... :)
>>>
>>> > I do think we need to explore concurrency within terms/postings across
>>> > fields in one segment to really see gains in the common case where merge
>>> > time is dominated by postings.
>>>
>>> Yeah, probably.
>>>
>>> > if you want to experiment with something like that, you can hackishly
>>> > simulate it today to quickly see the overhead, correct? its a small hack
>>> > to PerFieldPostingsFormat to force it to emit "files-per-field" and then
>>> > CFS will combine it all together.
>>>
>>> Good idea, Robert. I'll try this.
>>>
>>> > By default merging stored fields is super fast because Lucene can copy
>>> > compressed data directly, but if there are deletes or index sorting is
>>> > enabled this optimization is not applicable anymore and I wouldn't be
>>> > surprised if stored fields started taking non negligible time.
>>>
>>> In this case these segments are essentially made from scratch but with
>>> lots and lots of term vectors and postings... But the more parallel
>>> stages we can introduce, the better.
>>>
>>> I have some other stuff on my plate before I can dive deep into this
>>> but I eventually will. Thanks for the pointers, everyone - helpful!
>>>
>>> D.