[ https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830173#comment-16830173 ]
Michael McCandless commented on LUCENE-8757:
--------------------------------------------

Thanks [~atris] – I agree it's important to have better defaults for how we coalesce segments into per-query-per-thread work units.

A few small comments:

  * Can you insert {{_}} in the big number constants (e.g. {{25000000}})? Makes it easier to read, and open-source code is written for reading :)

  * I think something is wrong with {{docSum}} – you only set it, and never add to it? I think the intention is to sum up docs in multiple adjacent (sorted by {{maxDoc}}) segments until that count exceeds {{25000000}}?

  * How did you pick {{25000000}} and {{100}} as good constants? We are using much smaller values in our production infrastructure – {{250_000}} and {{5}}, admittedly after only a little experimentation.

  * Can you add some tests? You can maybe make the slice method a package private static method and then create test cases with "interesting" {{LeafReaderContext}} combinations? In particular, a test case exposing the {{docSum}} bug would be great, then fix that bug, then see the test case pass.

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>         Attachments: LUCENE-8757.patch
>
>
> The current segments to threads allocation algorithm always allocates one
> thread per segment. This is detrimental to performance in case of skew in
> segment sizes since small segments also get their dedicated thread. This can
> lead to performance degradation due to context switching overheads.
>
> A better algorithm which is cognizant of size skew would have better
> performance for realistic scenarios
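
The grouping that the {{docSum}} comment describes could be sketched roughly as below. This is a minimal standalone illustration, not the code in LUCENE-8757.patch. The names {{SliceSketch}} and {{slices}} are hypothetical, plain {{int}} sizes stand in for {{LeafReaderContext}}, the descending sort order is an assumption (the comment only says "sorted by {{maxDoc}}"), and reading the second constant as a cap on segments per slice is also an assumption. The thresholds are the {{250_000}} and {{5}} values quoted in the comment, written with underscores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SliceSketch {
    // Underscores in numeric literals (Java 7+) make large constants easier to read.
    static final int MAX_DOCS_PER_SLICE = 250_000;
    // Assumption: the second constant caps how many segments share one slice.
    static final int MAX_SEGMENTS_PER_SLICE = 5;

    /**
     * Groups segment sizes (maxDoc values) into slices, each slice being one
     * unit of work. Package-private and static so a test can call it directly.
     */
    static List<List<Integer>> slices(int[] maxDocs, int maxDocsPerSlice, int maxSegmentsPerSlice) {
        int[] sorted = maxDocs.clone();
        Arrays.sort(sorted); // ascending; iterated in reverse below

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long docSum = 0;

        // Assumption: walk from the largest segment to the smallest.
        for (int i = sorted.length - 1; i >= 0; i--) {
            current.add(sorted[i]);
            docSum += sorted[i]; // accumulate across segments instead of overwriting

            // Close the slice once it holds enough docs or enough segments.
            if (docSum > maxDocsPerSlice || current.size() >= maxSegmentsPerSlice) {
                result.add(current);
                current = new ArrayList<>();
                docSum = 0;
            }
        }
        if (!current.isEmpty()) {
            result.add(current); // leftover small segments form the last slice
        }
        return result;
    }

    public static void main(String[] args) {
        int[] maxDocs = {300_000, 200_000, 100_000, 10, 20, 30, 40, 50, 60};
        // prints [[300000], [200000, 100000], [60, 50, 40, 30, 20], [10]]
        System.out.println(slices(maxDocs, MAX_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE));
    }
}
```

A test in the spirit of the last bullet could pass two mid-sized segments (here 200_000 and 100_000) and check that they land in the same slice. A version that only sets {{docSum}} and never adds to it would presumably never cross the threshold for them.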