[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298281#comment-15298281 ]

Yonik Seeley commented on LUCENE-7261:
--------------------------------------

Thanks Adrien, interesting. Definitely not the results I expected (but I never tested against anything else). IIRC, reorder looked a bit different in my implementation. When I get some time, I'll take a crack at it just for the fun of it :-)

> Speed up LSBRadixSorter
> -----------------------
>
>          Key: LUCENE-7261
>          URL: https://issues.apache.org/jira/browse/LUCENE-7261
>      Project: Lucene - Core
>   Issue Type: Improvement
>     Reporter: Adrien Grand
>     Assignee: Adrien Grand
>     Priority: Minor
>      Fix For: 6.1, master (7.0)
>
>  Attachments: LUCENE-7261.patch, MSBRadixSorter.java
>
>
> Currently it always does 4 passes over the data (one per byte, since ints
> have 4 bytes). However, most of the time, we know {{maxDoc}}, so we can use
> this information to do fewer passes when they are not necessary. For
> instance, if maxDoc is less than or equal to 2^24, we only need 3 passes, and
> if maxDoc is less than or equal to 2^16, we only need two passes.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
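The pass-count idea in the issue description can be illustrated with a small sketch. This is not the committed patch: the class name LsbRadixSketch and the helper passesFor are hypothetical, and the code assumes every value is a non-negative doc ID strictly smaller than maxDoc. It derives the number of 8-bit passes from maxDoc and then runs a plain stable counting sort per byte.

```java
final class LsbRadixSketch {

    // Number of 8-bit passes needed to sort values in [0, maxDoc).
    // Hypothetical helper, not a Lucene API.
    static int passesFor(int maxDoc) {
        if (maxDoc <= 1) {
            return 0; // at most one distinct value (0), nothing to sort
        }
        // Bits needed to represent the largest possible value, maxDoc - 1.
        int bitsRequired = 32 - Integer.numberOfLeadingZeros(maxDoc - 1);
        return (bitsRequired + 7) / 8; // round up to whole bytes
    }

    // LSB radix sort that skips the high-order passes maxDoc makes unnecessary.
    // Assumes 0 <= docs[i] < maxDoc for every element.
    static void sort(int[] docs, int maxDoc) {
        int[] src = docs;
        int[] dst = new int[docs.length];
        int passes = passesFor(maxDoc);
        for (int pass = 0; pass < passes; pass++) {
            int shift = pass * 8;
            // Histogram of the current byte, offset by one for the prefix sum.
            int[] counts = new int[257];
            for (int v : src) {
                counts[((v >>> shift) & 0xFF) + 1]++;
            }
            // After the prefix sum, counts[b] is the start offset of bucket b.
            for (int i = 0; i < 256; i++) {
                counts[i + 1] += counts[i];
            }
            // Stable scatter into the destination buffer.
            for (int v : src) {
                dst[counts[(v >>> shift) & 0xFF]++] = v;
            }
            int[] tmp = src;
            src = dst;
            dst = tmp;
        }
        // With an odd number of passes the result sits in the scratch buffer.
        if (src != docs) {
            System.arraycopy(src, 0, docs, 0, docs.length);
        }
    }
}
```

Under these assumptions, passesFor gives 2 for maxDoc = 2^16 and 3 for maxDoc = 2^24, matching the thresholds in the description, and falls back to 4 passes for anything larger.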
[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262446#comment-15262446 ]

ASF subversion and git services commented on LUCENE-7261:
---------------------------------------------------------

Commit 8ca6f6651ede19bfaee9051e9b87927685cb9be0 in lucene-solr's branch refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8ca6f66 ]

LUCENE-7261: Speed up LSBRadixSorter.
[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262447#comment-15262447 ]

ASF subversion and git services commented on LUCENE-7261:
---------------------------------------------------------

Commit ef45d4b2e1f9c967b62340acb027f50888a00ba2 in lucene-solr's branch refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ef45d4b ]

LUCENE-7261: Speed up LSBRadixSorter.
[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262411#comment-15262411 ]

Yonik Seeley commented on LUCENE-7261:
--------------------------------------

I had previously implemented MSD for integers and was grepping for that code this morning... can't seem to find it :-(
[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262369#comment-15262369 ]

Adrien Grand commented on LUCENE-7261:
--------------------------------------

I implemented LSB because it is very easy to implement. But +1 to explore whether we can make things faster or generate less garbage with MSB sort first.
[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter
[ https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262186#comment-15262186 ]

Yonik Seeley commented on LUCENE-7261:
--------------------------------------

+1

As an aside, I was looking at this stuff a while ago and had decided on trying an MSB sort first (before you added LSB).
- can be in-place
- since all buckets are sorted relative to each other, can delegate to a different sorting algorithm when buckets become small
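The two MSB properties mentioned in this comment (in-place operation, and delegating to another algorithm once buckets become small) can be sketched roughly as below. This is only an illustration, not Yonik's implementation or the attached MSBRadixSorter.java: the class name MsbRadixSketch, the cutoff SMALL = 32, and the choice of insertion sort as the fallback are all assumptions, and the code assumes non-negative values strictly smaller than maxDoc.

```java
final class MsbRadixSketch {

    // Hypothetical cutoff below which we stop partitioning by byte.
    private static final int SMALL = 32;

    // Sorts doc IDs in place, assuming 0 <= a[i] < maxDoc.
    static void sort(int[] a, int maxDoc) {
        if (maxDoc <= 1) {
            return; // every value is 0
        }
        int bits = 32 - Integer.numberOfLeadingZeros(maxDoc - 1);
        int topShift = ((bits + 7) / 8 - 1) * 8; // shift of the most significant used byte
        sort(a, 0, a.length, topShift);
    }

    private static void sort(int[] a, int from, int to, int shift) {
        if (to - from <= SMALL) {
            // Buckets are already ordered relative to each other, so a small
            // bucket can be finished by any comparison sort.
            insertionSort(a, from, to);
            return;
        }
        int[] end = new int[256];
        for (int i = from; i < to; i++) {
            end[(a[i] >>> shift) & 0xFF]++;
        }
        int[] next = new int[256]; // next free slot in each bucket
        int pos = from;
        for (int b = 0; b < 256; b++) {
            next[b] = pos;
            pos += end[b];
            end[b] = pos; // exclusive end of bucket b
        }
        // In-place permutation: swap each value into its bucket's next free slot.
        for (int b = 0; b < 256; b++) {
            while (next[b] < end[b]) {
                int v = a[next[b]];
                int vb = (v >>> shift) & 0xFF;
                if (vb == b) {
                    next[b]++;
                } else {
                    int target = next[vb]++;
                    a[next[b]] = a[target];
                    a[target] = v;
                }
            }
        }
        if (shift > 0) {
            // Recurse into each bucket on the next less significant byte.
            for (int b = 0; b < 256; b++) {
                int start = (b == 0) ? from : end[b - 1];
                if (end[b] - start > 1) {
                    sort(a, start, end[b], shift - 8);
                }
            }
        }
    }

    private static void insertionSort(int[] a, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            int v = a[i];
            int j = i - 1;
            while (j >= from && a[j] > v) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
    }
}
```

Unlike an LSB sort, this variant needs no second array of the same length, though each recursion level allocates two 256-entry arrays; whether that is actually faster was the open question in this thread.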