pitrou commented on PR #13334:
URL: https://github.com/apache/arrow/pull/13334#issuecomment-1191713393

   Ok, so I implemented the algorithm which was originally envisioned in 
https://issues.apache.org/jira/browse/ARROW-14314?focusedCommentId=17445377&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17445377
   
   The performance numbers are indeed quite better now:
   * on integers:
   ```
   ArraySortIndicesInt64Narrow/32768/10000      12096 ns        12093 ns        
58332 bytes_per_second=2.52347G/s items_per_second=338.694M/s null_percent=0.01 
size=32.768k
   ArraySortIndicesInt64Narrow/32768/100        12100 ns        12098 ns        
57680 bytes_per_second=2.5226G/s items_per_second=338.578M/s null_percent=1 
size=32.768k
   ArraySortIndicesInt64Narrow/32768/10         13387 ns        13384 ns        
50920 bytes_per_second=2.28012G/s items_per_second=306.032M/s null_percent=10 
size=32.768k
   ArraySortIndicesInt64Narrow/32768/2          23478 ns        23473 ns        
28378 bytes_per_second=1.3001G/s items_per_second=174.496M/s null_percent=50 
size=32.768k
   ArraySortIndicesInt64Narrow/32768/1           6422 ns         6420 ns       
109333 bytes_per_second=4.75339G/s items_per_second=637.989M/s null_percent=100 
size=32.768k
   ArraySortIndicesInt64Narrow/32768/0          12042 ns        12040 ns        
57830 bytes_per_second=2.53471G/s items_per_second=340.203M/s null_percent=0 
size=32.768k
   ArraySortIndicesInt64Narrow/1048576/100     400335 ns       400263 ns        
 1721 bytes_per_second=2.4398G/s items_per_second=327.465M/s null_percent=1 
size=1048.58k
   ArraySortIndicesInt64Narrow/8388608/100    5230098 ns      5226130 ns        
  134 bytes_per_second=1.49489G/s items_per_second=200.641M/s null_percent=1 
size=8.38861M
   ArraySortIndicesInt64Wide/32768/10000       196031 ns       195992 ns        
 3553 bytes_per_second=159.446M/s items_per_second=20.8989M/s null_percent=0.01 
size=32.768k
   ArraySortIndicesInt64Wide/32768/100         199523 ns       199488 ns        
 3493 bytes_per_second=156.651M/s items_per_second=20.5325M/s null_percent=1 
size=32.768k
   ArraySortIndicesInt64Wide/32768/10          189211 ns       189180 ns        
 3692 bytes_per_second=165.187M/s items_per_second=21.6513M/s null_percent=10 
size=32.768k
   ArraySortIndicesInt64Wide/32768/2           117105 ns       117085 ns        
 5907 bytes_per_second=266.899M/s items_per_second=34.983M/s null_percent=50 
size=32.768k
   ArraySortIndicesInt64Wide/32768/1             6399 ns         6398 ns       
109441 bytes_per_second=4.76962G/s items_per_second=640.168M/s null_percent=100 
size=32.768k
   ArraySortIndicesInt64Wide/32768/0           195661 ns       195629 ns        
 3571 bytes_per_second=159.741M/s items_per_second=20.9376M/s null_percent=0 
size=32.768k
   ArraySortIndicesInt64Wide/1048576/100      9867426 ns      9865950 ns        
   71 bytes_per_second=101.359M/s items_per_second=13.2853M/s null_percent=1 
size=1048.58k
   ArraySortIndicesInt64Wide/8388608/100    104777802 ns    104736080 ns        
    7 bytes_per_second=76.3825M/s items_per_second=10.0116M/s null_percent=1 
size=8.38861M
   ArraySortIndicesInt64Dict/32768/10000        21406 ns        21401 ns        
32998 bytes_per_second=1.42602G/s items_per_second=191.397M/s null_percent=0.01 
size=32.768k
   ArraySortIndicesInt64Dict/32768/100          23328 ns        23320 ns        
29746 bytes_per_second=1.30864G/s items_per_second=175.643M/s null_percent=1 
size=32.768k
   ArraySortIndicesInt64Dict/32768/10           32835 ns        32828 ns        
20994 bytes_per_second=951.943M/s items_per_second=124.773M/s null_percent=10 
size=32.768k
   ArraySortIndicesInt64Dict/32768/2            63866 ns        63851 ns        
10790 bytes_per_second=489.417M/s items_per_second=64.1489M/s null_percent=50 
size=32.768k
   ArraySortIndicesInt64Dict/32768/1            51498 ns        51488 ns        
13377 bytes_per_second=606.943M/s items_per_second=79.5532M/s null_percent=100 
size=32.768k
   ArraySortIndicesInt64Dict/32768/0            20983 ns        20978 ns        
33428 bytes_per_second=1.45471G/s items_per_second=195.248M/s null_percent=0 
size=32.768k
   ArraySortIndicesInt64Dict/1048576/100       618425 ns       618258 ns        
 1150 bytes_per_second=1.57954G/s items_per_second=212.002M/s null_percent=1 
size=1048.58k
   ArraySortIndicesInt64Dict/8388608/100      9134358 ns      9127689 ns        
   77 bytes_per_second=876.454M/s items_per_second=114.879M/s null_percent=1 
size=8.38861M
   ```
   * on strings:
   ```
   ArraySortIndicesStrings/32768/10000         380932 ns       380866 ns        
 1833 items_per_second=10.7544M/s null_percent=0.01 size=4.096k
   ArraySortIndicesStrings/32768/100           381913 ns       381849 ns        
 1831 items_per_second=10.7267M/s null_percent=1 size=4.096k
   ArraySortIndicesStrings/32768/10            353224 ns       353159 ns        
 1978 items_per_second=11.5982M/s null_percent=10 size=4.096k
   ArraySortIndicesStrings/32768/2             198191 ns       198155 ns        
 3485 items_per_second=20.6707M/s null_percent=50 size=4.096k
   ArraySortIndicesStrings/32768/1               6516 ns         6514 ns       
107935 items_per_second=628.767M/s null_percent=100 size=4.096k
   ArraySortIndicesStrings/32768/0             376185 ns       376122 ns        
 1866 items_per_second=10.8901M/s null_percent=0 size=4.096k
   ArraySortIndicesStrings/1048576/100       17554424 ns     17551747 ns        
   40 items_per_second=7.46775M/s null_percent=1 size=131.072k
   ArraySortIndicesStrings/8388608/100      218960374 ns    218848644 ns        
    3 items_per_second=4.79133M/s null_percent=1 size=1048.58k
   ArraySortIndicesStringsDict/32768/10000      21847 ns        21842 ns        
32054 items_per_second=187.525M/s null_percent=0.01 size=4.096k
   ArraySortIndicesStringsDict/32768/100        23700 ns        23695 ns        
29360 items_per_second=172.865M/s null_percent=1 size=4.096k
   ArraySortIndicesStringsDict/32768/10         33740 ns        33732 ns        
20298 items_per_second=121.428M/s null_percent=10 size=4.096k
   ArraySortIndicesStringsDict/32768/2          64616 ns        64602 ns        
10581 items_per_second=63.4031M/s null_percent=50 size=4.096k
   ArraySortIndicesStringsDict/32768/1          53657 ns        53645 ns        
12715 items_per_second=76.3539M/s null_percent=100 size=4.096k
   ArraySortIndicesStringsDict/32768/0          21505 ns        21500 ns        
32619 items_per_second=190.512M/s null_percent=0 size=4.096k
   ArraySortIndicesStringsDict/1048576/100     629995 ns       629868 ns        
 1113 items_per_second=208.094M/s null_percent=1 size=131.072k
   ArraySortIndicesStringsDict/8388608/100    9055816 ns      9045824 ns        
   77 items_per_second=115.918M/s null_percent=1 size=1048.58k
   ```
   
   We see that sorting a dictionary array is now much faster (potentially 10x 
faster) than sorting an equivalent dense array, if the dictionary size is small 
(which is a very common case).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to