autumnust commented on pull request #651:
URL: https://github.com/apache/orc/pull/651#issuecomment-828170668


   Here is the new jmh result: 
   
   ```
   # Run complete. Total time: 00:10:35
   Benchmark                                    (dictImpl)  (upperBound)  Mode  Cnt       Score      Error  Units
   ORCWriterBenchMark.dictBench                     RBTREE         10000  avgt    5   28939.068 ± 3080.947  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord      RBTREE         10000  avgt    5      49.963                 #
   ORCWriterBenchMark.dictBench:ops                 RBTREE         10000  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord           RBTREE         10000  avgt    5       0.883 ±    0.094  us/op
   ORCWriterBenchMark.dictBench:records             RBTREE         10000  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                     RBTREE          2500  avgt    5   21998.781 ± 1448.300  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord      RBTREE          2500  avgt    5      23.532                 #
   ORCWriterBenchMark.dictBench:ops                 RBTREE          2500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord           RBTREE          2500  avgt    5       0.671 ±    0.044  us/op
   ORCWriterBenchMark.dictBench:records             RBTREE          2500  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                     RBTREE           500  avgt    5   17730.281 ± 4574.132  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord      RBTREE           500  avgt    5      13.156                 #
   ORCWriterBenchMark.dictBench:ops                 RBTREE           500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord           RBTREE           500  avgt    5       0.541 ±    0.140  us/op
   ORCWriterBenchMark.dictBench:records             RBTREE           500  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       HASH         10000  avgt    5   21269.613 ± 4137.763  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        HASH         10000  avgt    5      42.268                 #
   ORCWriterBenchMark.dictBench:ops                   HASH         10000  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             HASH         10000  avgt    5       0.649 ±    0.126  us/op
   ORCWriterBenchMark.dictBench:records               HASH         10000  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       HASH          2500  avgt    5   11586.898 ± 4075.783  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        HASH          2500  avgt    5      17.692                 #
   ORCWriterBenchMark.dictBench:ops                   HASH          2500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             HASH          2500  avgt    5       0.354 ±    0.124  us/op
   ORCWriterBenchMark.dictBench:records               HASH          2500  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       HASH           500  avgt    5    9646.080 ± 2279.530  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        HASH           500  avgt    5      11.613                 #
   ORCWriterBenchMark.dictBench:ops                   HASH           500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             HASH           500  avgt    5       0.294 ±    0.070  us/op
   ORCWriterBenchMark.dictBench:records               HASH           500  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       NONE         10000  avgt    5    4077.675 ±  117.606  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        NONE         10000  avgt    5      50.146                 #
   ORCWriterBenchMark.dictBench:ops                   NONE         10000  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             NONE         10000  avgt    5       0.124 ±    0.004  us/op
   ORCWriterBenchMark.dictBench:records               NONE         10000  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       NONE          2500  avgt    5    4607.634 ± 1163.084  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        NONE          2500  avgt    5      50.146                 #
   ORCWriterBenchMark.dictBench:ops                   NONE          2500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             NONE          2500  avgt    5       0.141 ±    0.035  us/op
   ORCWriterBenchMark.dictBench:records               NONE          2500  avgt    5  163840.000                 #
   ORCWriterBenchMark.dictBench                       NONE           500  avgt    5    3783.059 ±  367.511  us/op
   ORCWriterBenchMark.dictBench:bytesPerRecord        NONE           500  avgt    5      50.146                 #
   ORCWriterBenchMark.dictBench:ops                   NONE           500  avgt    5         ≈ 0                 #
   ORCWriterBenchMark.dictBench:perRecord             NONE           500  avgt    5       0.115 ±    0.011  us/op
   ORCWriterBenchMark.dictBench:records               NONE           500  avgt    5  163840.000                 #
   ```
   
   Unfortunately, the previous implementation had a bug that gave it great
locality but made it incorrect. HASH is still much better than RB-Tree, but
we clearly need to iterate further to improve it.
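
For a rough sense of the gap, the `perRecord` means from the table above can be compared directly. The snippet below is only an illustrative sketch: `DictBenchRatios` and `ratio` are hypothetical names, not part of the ORC benchmark module, and the comparison uses the mean scores only, ignoring the error bars, which are large for several HASH rows.

```java
// Hypothetical helper for reading the JMH table; not part of the ORC code base.
class DictBenchRatios {

    // How many times slower the first per-record time is than the second.
    static double ratio(double slowerUsPerRecord, double fasterUsPerRecord) {
        return slowerUsPerRecord / fasterUsPerRecord;
    }

    public static void main(String[] args) {
        // perRecord mean scores (us/op) copied from the result table above.
        int[] upperBounds = {10000, 2500, 500};
        double[] rbtree = {0.883, 0.671, 0.541};
        double[] hash = {0.649, 0.354, 0.294};
        double[] none = {0.124, 0.141, 0.115};

        for (int i = 0; i < upperBounds.length; i++) {
            System.out.printf(
                "upperBound=%d  RBTREE/HASH=%.2f  HASH/NONE=%.2f%n",
                upperBounds[i],
                ratio(rbtree[i], hash[i]),
                ratio(hash[i], none[i]));
        }
    }
}
```

The first ratio shows how far HASH is ahead of RB-Tree at each `upperBound`; the second shows how much per-record overhead HASH still carries relative to writing with no dictionary (NONE).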


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

