LOL and it was Dawid :-) Having amnesia, Dawid? I think I've re-explored my own ideas before too.
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless <luc...@mikemccandless.com> wrote:

> Oh I found this long ago (well, ~2 years) issue exploring this:
> https://issues.apache.org/jira/browse/LUCENE-8580
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
>> > +1 to make a single merge concurrent! It is horribly frustrating to
>> watch that last merge running on a single core :) I have lost many hours
>> of my life to this frustration.
>>
>> > Yeah... it is, isn't it? Especially on new machines where you have
>> super-fast SSDs, countless cores, etc... That last merge consumes so few
>> resources that the computer feels practically idle... it's hard to explain
>> to people using our software (who invested in hardware) why we're basically
>> doing nothing... :)
>>
>> > I do think we need to explore concurrency within terms/postings across
>> fields in one segment to really see gains in the common case where merge
>> time is dominated by postings.
>>
>> Yeah, probably.
>>
>> > if you want to experiment with something like that, you can hackishly
>> simulate it today to quickly see the overhead, correct? its a small hack to
>> PerFieldPostingsFormat to force it to emit "files-per-field" and then CFS
>> will combine it all together.
>>
>> Good idea, Robert. I'll try this.
>>
>> > By default merging stored fields is super fast because Lucene can copy
>> compressed data directly, but if there are deletes or index sorting is
>> enabled this optimization is not applicable anymore and I wouldn't be
>> surprised if stored fields started taking non negligible time.
>>
>> In this case these segments are essentially made from scratch but with
>> lots and lots of term vectors and postings... But the more parallel
>> stages we can introduce, the better.
>>
>> I have some other stuff on my plate before I can dive deep into this
>> but I eventually will. Thanks for the pointers, everyone - helpful!
>>
>> D.
>>
>