[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-05-24 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298281#comment-15298281
 ] 

Yonik Seeley commented on LUCENE-7261:
--

Thanks Adrien, interesting. Definitely not the results I expected (but I never 
tested against anything else). IIRC, reorder looked a bit different in 
my implementation. When I get some time, I'll take a crack at it just for the 
fun of it :-)

> Speed up LSBRadixSorter
> ---
>
> Key: LUCENE-7261
> URL: https://issues.apache.org/jira/browse/LUCENE-7261
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Fix For: 6.1, master (7.0)
>
> Attachments: LUCENE-7261.patch, MSBRadixSorter.java
>
>
> Currently it always does 4 passes over the data (one per byte, since ints 
> have 4 bytes). However, most of the time, we know {{maxDoc}}, so we can use 
> this information to do fewer passes when they are not necessary. For 
> instance, if maxDoc is less than or equal to 2^24, we only need 3 passes, and 
> if maxDoc is less than or equal to 2^16, we only need two passes.
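
The pass-count reasoning in the description above can be sketched as follows. This is a minimal illustration, not the code from LUCENE-7261.patch: the class name LsbRadixSketch and the helper numPasses are hypothetical, and it assumes doc IDs lie in [0, maxDoc), so the largest value to sort is maxDoc - 1.

```java
final class LsbRadixSketch {

  /** Number of 8-bit passes needed to sort values in [0, maxDoc). Hypothetical helper. */
  static int numPasses(int maxDoc) {
    if (maxDoc <= 1) {
      return 0; // at most one distinct value, nothing to sort
    }
    // Bits required to represent the largest possible value, maxDoc - 1.
    int bitsRequired = 32 - Integer.numberOfLeadingZeros(maxDoc - 1);
    return (bitsRequired + 7) / 8; // round up to whole bytes
  }

  /** Stable LSB radix sort of non-negative doc IDs, skipping the high-order passes. */
  static void sort(int[] docs, int maxDoc) {
    int[] src = docs;
    int[] dst = new int[docs.length];
    int passes = numPasses(maxDoc);
    for (int pass = 0; pass < passes; pass++) {
      int shift = pass * 8;
      // Histogram of the current byte, offset by one so the prefix sum yields start offsets.
      int[] counts = new int[257];
      for (int v : src) {
        counts[((v >>> shift) & 0xFF) + 1]++;
      }
      for (int i = 0; i < 256; i++) {
        counts[i + 1] += counts[i];
      }
      // Stable scatter into the destination buffer.
      for (int v : src) {
        dst[counts[(v >>> shift) & 0xFF]++] = v;
      }
      int[] tmp = src;
      src = dst;
      dst = tmp;
    }
    // After an odd number of passes the sorted data sits in the scratch buffer.
    if (src != docs) {
      System.arraycopy(src, 0, docs, 0, docs.length);
    }
  }

  public static void main(String[] args) {
    int[] docs = {70000, 3, 65536, 12, 255, 256};
    sort(docs, 1 << 17);
    System.out.println(java.util.Arrays.toString(docs));
  }
}
```

With these assumptions, numPasses returns 2 for maxDoc = 2^16, 3 for maxDoc = 2^24, and 4 only for larger values, which matches the thresholds given in the description.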



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262446#comment-15262446
 ] 

ASF subversion and git services commented on LUCENE-7261:
-

Commit 8ca6f6651ede19bfaee9051e9b87927685cb9be0 in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8ca6f66 ]

LUCENE-7261: Speed up LSBRadixSorter.








[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-04-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262447#comment-15262447
 ] 

ASF subversion and git services commented on LUCENE-7261:
-

Commit ef45d4b2e1f9c967b62340acb027f50888a00ba2 in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ef45d4b ]

LUCENE-7261: Speed up LSBRadixSorter.








[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-04-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262411#comment-15262411
 ] 

Yonik Seeley commented on LUCENE-7261:
--

I had previously implemented an MSD radix sort for integers and I was grepping 
for that code this morning... can't seem to find it :-(







[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-04-28 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262369#comment-15262369
 ] 

Adrien Grand commented on LUCENE-7261:
--

I implemented LSB because it is very easy to implement. But +1 to exploring 
whether an MSB sort can make things faster or generate less garbage.







[jira] [Commented] (LUCENE-7261) Speed up LSBRadixSorter

2016-04-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262186#comment-15262186
 ] 

Yonik Seeley commented on LUCENE-7261:
--

+1

As an aside, I was looking at this stuff a while ago and had decided to try 
an MSB sort first (before you added LSB). Two advantages:
- it can be in-place
- since all buckets are sorted relative to each other, it can delegate to a 
different sorting algorithm when buckets become small
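
The two points above can be illustrated with a rough sketch. This is not the implementation mentioned in this thread (that code is not shown here) and not the attached MSBRadixSorter.java; the class name MsbRadixSketch and the threshold SMALL_BUCKET are made up for illustration, and the values are assumed to be non-negative ints such as doc IDs.

```java
final class MsbRadixSketch {

  // Hypothetical cutoff below which a bucket is handed to insertion sort.
  private static final int SMALL_BUCKET = 32;

  /** Sorts non-negative ints in place, most significant byte first. */
  static void sort(int[] a) {
    sort(a, 0, a.length, 24);
  }

  private static void sort(int[] a, int from, int to, int shift) {
    if (to - from <= SMALL_BUCKET) {
      // Buckets are already ordered relative to each other, so a small
      // bucket can be finished by any comparison sort.
      insertionSort(a, from, to);
      return;
    }
    int[] ends = new int[256];
    int[] next = new int[256];
    for (int i = from; i < to; i++) {
      ends[(a[i] >>> shift) & 0xFF]++; // histogram first
    }
    int sum = from;
    for (int b = 0; b < 256; b++) {
      next[b] = sum;    // first unfilled slot of bucket b
      sum += ends[b];
      ends[b] = sum;    // exclusive end of bucket b
    }
    int[] starts = next.clone();
    // In-place permutation: swap each misplaced value into its own bucket.
    for (int b = 0; b < 256; b++) {
      while (next[b] < ends[b]) {
        int v = a[next[b]];
        int vb = (v >>> shift) & 0xFF;
        if (vb == b) {
          next[b]++;
        } else {
          int target = next[vb]++;
          a[next[b]] = a[target];
          a[target] = v;
        }
      }
    }
    if (shift > 0) {
      for (int b = 0; b < 256; b++) {
        sort(a, starts[b], ends[b], shift - 8);
      }
    }
  }

  private static void insertionSort(int[] a, int from, int to) {
    for (int i = from + 1; i < to; i++) {
      int v = a[i];
      int j = i - 1;
      while (j >= from && a[j] > v) {
        a[j + 1] = a[j];
        j--;
      }
      a[j + 1] = v;
    }
  }

  public static void main(String[] args) {
    int[] docs = {70000, 3, 65536, 12, 255, 256};
    sort(docs);
    System.out.println(java.util.Arrays.toString(docs));
  }
}
```

Unlike the LSB variant, this sketch needs no second array of the same size, only small per-call count arrays; the trade-off is that the in-place permutation is not stable, which does not matter when sorting plain ints.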





