LOL and it was Dawid :-)  Having amnesia, Dawid?
I think I've re-explored my own ideas before too.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Oh, I found this issue from long ago (well, ~2 years) exploring this:
> https://issues.apache.org/jira/browse/LUCENE-8580
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
>> > +1 to make a single merge concurrent!  It is horribly frustrating to
>> > watch that last merge running on a single core :)  I have lost many
>> > hours of my life to this frustration.
>>
>> > Yeah... it is, isn't it? Especially on new machines where you have
>> > super-fast SSDs, countless cores, etc... That last merge consumes so few
>> > resources that the computer feels practically idle... it's hard to
>> > explain to people using our software (who invested in hardware) why
>> > we're basically doing nothing... :)
>>
>> > I do think we need to explore concurrency within terms/postings across
>> > fields in one segment to really see gains in the common case where merge
>> > time is dominated by postings.
>>
>> Yeah, probably.
>>
>> > If you want to experiment with something like that, you can hackishly
>> > simulate it today to quickly see the overhead, correct? It's a small hack
>> > to PerFieldPostingsFormat to force it to emit "files-per-field" and then
>> > CFS will combine it all together.
>>
>> Good idea, Robert. I'll try this.
>>
>> > By default merging stored fields is super fast because Lucene can copy
>> > compressed data directly, but if there are deletes or index sorting is
>> > enabled this optimization is not applicable anymore and I wouldn't be
>> > surprised if stored fields started taking non-negligible time.
>>
>> In this case these segments are essentially made from scratch but with
>> lots and lots of term vectors and postings... But the more parallel
>> stages we can introduce, the better.
>>
>> I have some other stuff on my plate before I can dive deep into this
>> but I eventually will. Thanks for the pointers, everyone - helpful!
>>
>> D.
>>
>
