[ https://issues.apache.org/jira/browse/LUCENE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298343#comment-15298343 ]
Yonik Seeley commented on LUCENE-7299:
--------------------------------------

bq. So this is just for integers.

No, MSB variations are fine for strings... you go byte-by-byte, just as you would for integers, with each byte determining the appropriate bucket.
I imagine this would have a weakness for large common prefixes. Random strings may be the best case, since they are equally distributed between buckets (so each pass does the maximum amount of work).

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>
>                 Key: LUCENE-7299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7299.patch
>
>
> Switching DocIdSetBuilder to radix sort helped make things significantly
> faster. We should be able to do the same with BytesRefHash.sort()?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
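
For illustration, the byte-by-byte bucketing described in the comment above could look roughly like the sketch below. This is not the attached patch and not Lucene code: the class name MsbRadixSketch and its helper methods are hypothetical, keys are plain byte[] values rather than BytesRef, and the recursion goes all the way down with no fallback to a comparison sort for small buckets.

```java
public class MsbRadixSketch {

    /** Sorts keys in unsigned byte order; a key sorts before any longer key it is a prefix of. */
    public static void sort(byte[][] keys) {
        sort(keys, new byte[keys.length][], 0, keys.length, 0);
    }

    // Bucket 0 holds keys that have no byte at this depth (they end here);
    // buckets 1..256 hold keys whose byte at this depth is 0..255 (unsigned).
    private static int bucket(byte[] key, int depth) {
        return depth < key.length ? (key[depth] & 0xFF) + 1 : 0;
    }

    private static void sort(byte[][] keys, byte[][] scratch, int from, int to, int depth) {
        if (to - from < 2) {
            return; // zero or one key: nothing to order
        }

        // Pass 1: count how many keys fall into each bucket at this depth.
        int[] counts = new int[257];
        for (int i = from; i < to; i++) {
            counts[bucket(keys[i], depth)]++;
        }

        // Prefix sums give each bucket's start offset, relative to 'from'.
        int[] starts = new int[258];
        for (int b = 0; b < 257; b++) {
            starts[b + 1] = starts[b] + counts[b];
        }

        // Pass 2: distribute keys into their buckets via the scratch array.
        int[] next = new int[257];
        System.arraycopy(starts, 0, next, 0, 257);
        for (int i = from; i < to; i++) {
            scratch[from + next[bucket(keys[i], depth)]++] = keys[i];
        }
        System.arraycopy(scratch, from, keys, from, to - from);

        // Recurse on the next byte inside each bucket. Bucket 0 is skipped:
        // all of its keys ended at this depth and share the same prefix,
        // so they are equal.
        for (int b = 1; b < 257; b++) {
            sort(keys, scratch, from + starts[b], from + starts[b + 1], depth + 1);
        }
    }
}
```

When many keys share a long prefix, every level of the recursion over that prefix puts all of them into a single bucket, so each of those passes scans and copies the whole range without separating anything. That is the weakness for large common prefixes mentioned in the comment.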