gf2121 commented on PR #12800:
URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820461507

   I did some more work to find out the balance between memory / performance in 
various data distribution. The way i'm thinking now is that we keep the 
timsorter here, but make the run length larger and use LSB radix sorter to sort 
these runs.
   
   * From the memory perspective, we limit the memory usage to a single run. So 
the memory usage should be generally similar to before. Here is the tmp slot 
required in a single shot:
   
   <!--StartFragment--><byte-sheet-html-origin data-id="1700554434864" 
data-version="4" data-is-embed="false" data-grid-line-hidden="false" 
data-importRangeRawData-spreadSource="https://bytedance.larkoffice.com/sheets/KyGbsHJslhzExktvfS7cWTdznpd";
 data-importRangeRawData-range="&#39;Sheet1&#39;!A1:D7">
     | tim sorter | lsb sorter | off sorter
   -- | -- | -- | --
   natural | 0 | 112500 | 0
   reverse | 0 | 112500 | 0
   random | 56052 | 112500 | 56248
   partial | 52506 | 112500 | 56248
   natural_exception | 32780 | 112500 | 20356
   reverse_exception | 45724 | 112500 | 56250
   
   </byte-sheet-html-origin><!--EndFragment-->
   
   &nbsp;
   * From the performance perspective, we have great performance for sorted 
data as before, and 4x faster on random data. But we will lose some performance 
for "sorted exception" cases as we are using larger run length.
   
   ```
   Benchmark                     (bit)            (order)  (size)   Mode  Cnt   
Score   Error   Units
   DocSorterBenchmark.lsbSorter     24            natural  100000  thrpt    5   
2.067 ± 0.037  ops/ms
   DocSorterBenchmark.lsbSorter     24            reverse  100000  thrpt    5   
2.067 ± 0.047  ops/ms
   DocSorterBenchmark.lsbSorter     24             random  100000  thrpt    5   
2.045 ± 0.031  ops/ms
   DocSorterBenchmark.lsbSorter     24            partial  100000  thrpt    5   
2.049 ± 0.040  ops/ms
   DocSorterBenchmark.lsbSorter     24  natural_exception  100000  thrpt    5   
2.074 ± 0.020  ops/ms
   DocSorterBenchmark.lsbSorter     24  reverse_exception  100000  thrpt    5   
2.063 ± 0.070  ops/ms
   DocSorterBenchmark.lsbSorter     31            natural  100000  thrpt    5   
1.593 ± 0.020  ops/ms
   DocSorterBenchmark.lsbSorter     31            reverse  100000  thrpt    5   
1.577 ± 0.037  ops/ms
   DocSorterBenchmark.lsbSorter     31             random  100000  thrpt    5   
1.586 ± 0.017  ops/ms
   DocSorterBenchmark.lsbSorter     31            partial  100000  thrpt    5   
1.591 ± 0.018  ops/ms
   DocSorterBenchmark.lsbSorter     31  natural_exception  100000  thrpt    5   
1.584 ± 0.032  ops/ms
   DocSorterBenchmark.lsbSorter     31  reverse_exception  100000  thrpt    5   
1.563 ± 0.033  ops/ms
   DocSorterBenchmark.offSorter     24            natural  100000  thrpt    5  
48.221 ± 2.295  ops/ms
   DocSorterBenchmark.offSorter     24            reverse  100000  thrpt    5  
16.023 ± 0.833  ops/ms
   DocSorterBenchmark.offSorter     24             random  100000  thrpt    5   
0.375 ± 0.166  ops/ms
   DocSorterBenchmark.offSorter     24            partial  100000  thrpt    5   
0.487 ± 0.233  ops/ms
   DocSorterBenchmark.offSorter     24  natural_exception  100000  thrpt    5   
1.304 ± 0.055  ops/ms
   DocSorterBenchmark.offSorter     24  reverse_exception  100000  thrpt    5   
0.943 ± 0.603  ops/ms
   DocSorterBenchmark.offSorter     31            natural  100000  thrpt    5  
48.030 ± 2.616  ops/ms
   DocSorterBenchmark.offSorter     31            reverse  100000  thrpt    5  
15.940 ± 0.961  ops/ms
   DocSorterBenchmark.offSorter     31             random  100000  thrpt    5   
0.363 ± 0.094  ops/ms
   DocSorterBenchmark.offSorter     31            partial  100000  thrpt    5   
0.486 ± 0.093  ops/ms
   DocSorterBenchmark.offSorter     31  natural_exception  100000  thrpt    5   
0.999 ± 0.691  ops/ms
   DocSorterBenchmark.offSorter     31  reverse_exception  100000  thrpt    5   
0.942 ± 0.058  ops/ms
   DocSorterBenchmark.timSorter     24            natural  100000  thrpt    5  
46.111 ± 4.226  ops/ms
   DocSorterBenchmark.timSorter     24            reverse  100000  thrpt    5  
15.828 ± 0.571  ops/ms
   DocSorterBenchmark.timSorter     24             random  100000  thrpt    5   
0.082 ± 0.015  ops/ms
   DocSorterBenchmark.timSorter     24            partial  100000  thrpt    5   
0.515 ± 0.260  ops/ms
   DocSorterBenchmark.timSorter     24  natural_exception  100000  thrpt    5   
2.681 ± 0.164  ops/ms
   DocSorterBenchmark.timSorter     24  reverse_exception  100000  thrpt    5   
1.940 ± 0.138  ops/ms
   DocSorterBenchmark.timSorter     31            natural  100000  thrpt    5  
46.916 ± 4.504  ops/ms
   DocSorterBenchmark.timSorter     31            reverse  100000  thrpt    5  
15.718 ± 0.865  ops/ms
   DocSorterBenchmark.timSorter     31             random  100000  thrpt    5   
0.085 ± 0.001  ops/ms
   DocSorterBenchmark.timSorter     31            partial  100000  thrpt    5   
0.502 ± 0.036  ops/ms
   DocSorterBenchmark.timSorter     31  natural_exception  100000  thrpt    5   
2.711 ± 0.134  ops/ms
   DocSorterBenchmark.timSorter     31  reverse_exception  100000  thrpt    5   
1.944 ± 0.149  ops/ms
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to