mapleFU commented on PR #34323:
URL: https://github.com/apache/arrow/pull/34323#issuecomment-1447528488

   @wgtmac @rok 
   
   1. Dict decoding for ByteArray is added, but the current benchmark has no 
way to set the NDV (number of distinct values). I can add an NDV parameter to 
the dict benchmark in the future.
   2. Different batch sizes are added.
   
   The benchmark data is listed below:
   
   ```
   -------------------------------------------------------------------------------------------------------------
   Benchmark                                                   Time             CPU   Iterations UserCounters...
   -------------------------------------------------------------------------------------------------------------
   BM_PlainEncodingByteArray/8/8                             235 ns          228 ns      3065389 byte_array_bytes=104.223M items_per_second=35.074M/s
   BM_PlainEncodingByteArray/64/8                            396 ns          370 ns      2015966 byte_array_bytes=637.045M items_per_second=21.6312M/s
   BM_PlainEncodingByteArray/512/8                           371 ns          365 ns      1874429 byte_array_bytes=3.37022G items_per_second=21.9365M/s
   BM_PlainEncodingByteArray/1024/8                          401 ns          390 ns      1819042 byte_array_bytes=4.31113G items_per_second=20.5191M/s
   BM_PlainEncodingByteArray/8/64                            580 ns          566 ns      1230099 byte_array_bytes=334.587M items_per_second=113.158M/s
   BM_PlainEncodingByteArray/64/64                           829 ns          816 ns       857780 byte_array_bytes=1.65037G items_per_second=78.449M/s
   BM_PlainEncodingByteArray/512/64                         1881 ns         1856 ns       393902 byte_array_bytes=6.6148G items_per_second=34.48M/s
   BM_PlainEncodingByteArray/1024/64                        9410 ns         8131 ns        84531 byte_array_bytes=2.7596G items_per_second=7.87144M/s
   BM_PlainEncodingByteArray/8/512                          2214 ns         2208 ns       317557 byte_array_bytes=644.641M items_per_second=231.851M/s
   BM_PlainEncodingByteArray/64/512                         4019 ns         4016 ns       168215 byte_array_bytes=2.7811G items_per_second=127.486M/s
   BM_PlainEncodingByteArray/512/512                       19707 ns        19346 ns        36416 byte_array_bytes=4.69737G items_per_second=26.4651M/s
   BM_PlainEncodingByteArray/1024/512                      63092 ns        62419 ns        11561 byte_array_bytes=3.12039G items_per_second=8.20265M/s
   BM_PlainEncodingByteArray/8/1024                         4484 ns         4484 ns       155323 byte_array_bytes=637.446M items_per_second=228.367M/s
   BM_PlainEncodingByteArray/64/1024                       19180 ns        18981 ns        37781 byte_array_bytes=1.22762G items_per_second=53.9499M/s
   BM_PlainEncodingByteArray/512/1024                      86915 ns        80674 ns         8801 byte_array_bytes=2.32632G items_per_second=12.6931M/s
   BM_PlainEncodingByteArray/1024/1024                     92043 ns        78708 ns         9747 byte_array_bytes=5.16017G items_per_second=13.0102M/s
   BM_DeltaBitLengthEncodingByteArray/8/8                    747 ns          733 ns       960272 byte_array_bytes=32.6492M items_per_second=10.9149M/s
   BM_DeltaBitLengthEncodingByteArray/64/8                   775 ns          760 ns       905996 byte_array_bytes=286.295M items_per_second=10.5287M/s
   BM_DeltaBitLengthEncodingByteArray/512/8                  796 ns          780 ns       905621 byte_array_bytes=1.62831G items_per_second=10.2597M/s
   BM_DeltaBitLengthEncodingByteArray/1024/8                 794 ns          787 ns       885661 byte_array_bytes=2.09902G items_per_second=10.1652M/s
   BM_DeltaBitLengthEncodingByteArray/8/64                  1261 ns         1231 ns       578441 byte_array_bytes=157.336M items_per_second=51.9844M/s
   BM_DeltaBitLengthEncodingByteArray/64/64                 1374 ns         1364 ns       510900 byte_array_bytes=982.972M items_per_second=46.9351M/s
   BM_DeltaBitLengthEncodingByteArray/512/64                1650 ns         1643 ns       437511 byte_array_bytes=7.34712G items_per_second=38.9574M/s
   BM_DeltaBitLengthEncodingByteArray/1024/64              11026 ns         9007 ns        73656 byte_array_bytes=2.40457G items_per_second=7.10521M/s
   BM_DeltaBitLengthEncodingByteArray/8/512                 6608 ns         6269 ns       116291 byte_array_bytes=236.071M items_per_second=81.6661M/s
   BM_DeltaBitLengthEncodingByteArray/64/512                7989 ns         7394 ns        92218 byte_array_bytes=1.52464G items_per_second=69.2451M/s
   BM_DeltaBitLengthEncodingByteArray/512/512              58210 ns        41833 ns        17754 byte_array_bytes=2.29012G items_per_second=12.239M/s
   BM_DeltaBitLengthEncodingByteArray/1024/512             81322 ns        67902 ns        10659 byte_array_bytes=2.87694G items_per_second=7.54026M/s
   BM_DeltaBitLengthEncodingByteArray/8/1024               12956 ns        12113 ns        60335 byte_array_bytes=247.615M items_per_second=84.536M/s
   BM_DeltaBitLengthEncodingByteArray/64/1024              26775 ns        22107 ns        30578 byte_array_bytes=993.571M items_per_second=46.3211M/s
   BM_DeltaBitLengthEncodingByteArray/512/1024             88313 ns        72388 ns         9790 byte_array_bytes=2.58773G items_per_second=14.146M/s
   BM_DeltaBitLengthEncodingByteArray/1024/1024           141532 ns       122762 ns         5944 byte_array_bytes=3.14682G items_per_second=8.34132M/s
   BM_PlainDecodingByteArray/8/8                             137 ns          125 ns      5639113 byte_array_bytes=191.73M items_per_second=64.0507M/s
   BM_PlainDecodingByteArray/64/8                            127 ns          123 ns      5670358 byte_array_bytes=1.79183G items_per_second=64.9014M/s
   BM_PlainDecodingByteArray/512/8                           124 ns          122 ns      5829301 byte_array_bytes=10.4811G items_per_second=65.7388M/s
   BM_PlainDecodingByteArray/1024/8                          143 ns          125 ns      5804552 byte_array_bytes=13.7568G items_per_second=64.0992M/s
   BM_PlainDecodingByteArray/8/64                            311 ns          246 ns      2895170 byte_array_bytes=787.486M items_per_second=259.683M/s
   BM_PlainDecodingByteArray/64/64                           227 ns          226 ns      2851486 byte_array_bytes=5.48626G items_per_second=283.404M/s
   BM_PlainDecodingByteArray/512/64                          231 ns          229 ns      3030998 byte_array_bytes=50.8995G items_per_second=279.919M/s
   BM_PlainDecodingByteArray/1024/64                         224 ns          223 ns      3125963 byte_array_bytes=102.05G items_per_second=286.763M/s
   BM_PlainDecodingByteArray/8/512                          1098 ns         1097 ns       630619 byte_array_bytes=1.28016G items_per_second=466.662M/s
   BM_PlainDecodingByteArray/64/512                         1112 ns         1102 ns       645733 byte_array_bytes=10.6759G items_per_second=464.412M/s
   BM_PlainDecodingByteArray/512/512                        1200 ns         1161 ns       614634 byte_array_bytes=79.2829G items_per_second=441.183M/s
   BM_PlainDecodingByteArray/1024/512                       1229 ns         1174 ns       603589 byte_array_bytes=162.913G items_per_second=436.226M/s
   BM_PlainDecodingByteArray/8/1024                         2290 ns         2186 ns       312444 byte_array_bytes=1.28227G items_per_second=468.407M/s
   BM_PlainDecodingByteArray/64/1024                        2632 ns         2300 ns       303235 byte_array_bytes=9.85301G items_per_second=445.132M/s
   BM_PlainDecodingByteArray/512/1024                       3790 ns         3536 ns       200536 byte_array_bytes=53.0065G items_per_second=289.603M/s
   BM_PlainDecodingByteArray/1024/1024                      3198 ns         3178 ns       222884 byte_array_bytes=117.997G items_per_second=322.171M/s
   BM_DeltaBitLengthDecodingByteArray/8/8                    689 ns          672 ns      1045229 byte_array_bytes=35.5378M items_per_second=11.9118M/s
   BM_DeltaBitLengthDecodingByteArray/64/8                   751 ns          743 ns       903342 byte_array_bytes=285.456M items_per_second=10.7662M/s
   BM_DeltaBitLengthDecodingByteArray/512/8                 1055 ns         1051 ns       655879 byte_array_bytes=1.17927G items_per_second=7.61373M/s
   BM_DeltaBitLengthDecodingByteArray/1024/8                1179 ns         1170 ns       603204 byte_array_bytes=1.42959G items_per_second=6.83965M/s
   BM_DeltaBitLengthDecodingByteArray/8/64                   829 ns          819 ns       855254 byte_array_bytes=232.629M items_per_second=78.1025M/s
   BM_DeltaBitLengthDecodingByteArray/64/64                 1184 ns         1176 ns       602213 byte_array_bytes=1.15866G items_per_second=54.4343M/s
   BM_DeltaBitLengthDecodingByteArray/512/64                4279 ns         4276 ns       163854 byte_array_bytes=2.7516G items_per_second=14.9689M/s
   BM_DeltaBitLengthDecodingByteArray/1024/64               7645 ns         7615 ns        92953 byte_array_bytes=3.03454G items_per_second=8.40392M/s
   BM_DeltaBitLengthDecodingByteArray/8/512                 1925 ns         1922 ns       363082 byte_array_bytes=737.056M items_per_second=266.345M/s
   BM_DeltaBitLengthDecodingByteArray/64/512                5109 ns         5044 ns       139495 byte_array_bytes=2.30627G items_per_second=101.512M/s
   BM_DeltaBitLengthDecodingByteArray/512/512              41306 ns        40867 ns        16609 byte_array_bytes=2.14243G items_per_second=12.5285M/s
   BM_DeltaBitLengthDecodingByteArray/1024/512             80874 ns        79691 ns         8941 byte_array_bytes=2.41324G items_per_second=6.42482M/s
   BM_DeltaBitLengthDecodingByteArray/8/1024                3208 ns         3206 ns       219540 byte_array_bytes=900.992M items_per_second=319.394M/s
   BM_DeltaBitLengthDecodingByteArray/64/1024               9308 ns         9303 ns        74957 byte_array_bytes=2.43558G items_per_second=110.077M/s
   BM_DeltaBitLengthDecodingByteArray/512/1024             80116 ns        80005 ns         8506 byte_array_bytes=2.24834G items_per_second=12.7992M/s
   BM_DeltaBitLengthDecodingByteArray/1024/1024           154870 ns       154196 ns         4420 byte_array_bytes=2.34G items_per_second=6.64089M/s
   BM_PlainDecodingSpacedByteArray/8/8                      10.4 ns         10.4 ns     67218499 byte_array_bytes=2.28543G items_per_second=772.044M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/64/8                     10.5 ns         10.5 ns     66603235 byte_array_bytes=21.0466G items_per_second=762.433M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/512/8                    10.4 ns         10.4 ns     68236097 byte_array_bytes=122.689G items_per_second=770.475M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/1024/8                   10.8 ns         10.8 ns     67406209 byte_array_bytes=159.753G items_per_second=743.078M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/8/64                      144 ns          144 ns      4810600 byte_array_bytes=1.29405G items_per_second=444.822M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/64/64                     144 ns          144 ns      4845364 byte_array_bytes=9.13836G items_per_second=444.619M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/512/64                    152 ns          148 ns      4719684 byte_array_bytes=77.7615G items_per_second=431.605M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/1024/64                   159 ns          151 ns      4773628 byte_array_bytes=152.04G items_per_second=424.058M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/8/512                    1216 ns         1180 ns       583946 byte_array_bytes=1.17023G items_per_second=433.89M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/64/512                   1173 ns         1164 ns       611551 byte_array_bytes=9.93403G items_per_second=439.96M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/512/512                  1205 ns         1193 ns       598792 byte_array_bytes=75.7496G items_per_second=429.141M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/1024/512                 1201 ns         1181 ns       594233 byte_array_bytes=157.627G items_per_second=433.348M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/8/1024                   2344 ns         2330 ns       280204 byte_array_bytes=1.12922G items_per_second=439.487M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/64/1024                  2323 ns         2312 ns       304577 byte_array_bytes=9.66727G items_per_second=442.964M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/512/1024                 4764 ns         4707 ns       154352 byte_array_bytes=39.936G items_per_second=217.534M/s null_percent=2
   BM_PlainDecodingSpacedByteArray/1024/1024                3067 ns         3061 ns       217958 byte_array_bytes=113.274G items_per_second=334.54M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/8/8              287 ns          271 ns      2666616 byte_array_bytes=90.6649M items_per_second=29.4838M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/64/8             324 ns          324 ns      2148043 byte_array_bytes=678.782M items_per_second=24.7125M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/512/8            630 ns          629 ns      1120197 byte_array_bytes=2.01411G items_per_second=12.7162M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/1024/8           761 ns          753 ns       947816 byte_array_bytes=2.24632G items_per_second=10.6185M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/8/64             444 ns          444 ns      1604485 byte_array_bytes=431.606M items_per_second=144.24M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/64/64            816 ns          811 ns       870041 byte_array_bytes=1.6409G items_per_second=78.9493M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/512/64          3914 ns         3903 ns       177444 byte_array_bytes=2.92357G items_per_second=16.3984M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/1024/64         8412 ns         7518 ns        97220 byte_array_bytes=3.09646G items_per_second=8.51239M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/8/512           2265 ns         1855 ns       391519 byte_array_bytes=784.604M items_per_second=276.042M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/64/512          5374 ns         5112 ns       100000 byte_array_bytes=1.6244G items_per_second=100.165M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/512/512        32772 ns        30121 ns        23796 byte_array_bytes=3.01029G items_per_second=16.9983M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/1024/512       58609 ns        58345 ns        11690 byte_array_bytes=3.10091G items_per_second=8.77535M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/8/1024          3191 ns         3178 ns       222474 byte_array_bytes=896.57M items_per_second=322.247M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/64/1024         9198 ns         9125 ns        75708 byte_array_bytes=2.40297G items_per_second=112.224M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/512/1024       58358 ns        58147 ns        12110 byte_array_bytes=3.13326G items_per_second=17.6106M/s null_percent=2
   BM_DeltaBitLengthDecodingSpacedByteArray/1024/1024     120708 ns       115508 ns         6136 byte_array_bytes=3.18891G items_per_second=8.86519M/s null_percent=2
   BM_DictDecodingByteArray/8/8                              904 ns          893 ns       750035 bytes_per_second=136.719M/s
   BM_DictDecodingByteArray/64/8                             931 ns          923 ns       768133 bytes_per_second=1057.74M/s
   BM_DictDecodingByteArray/512/8                            907 ns          905 ns       758651 bytes_per_second=8.42918G/s
   BM_DictDecodingByteArray/1024/8                           950 ns          930 ns       763309 bytes_per_second=16.4053G/s
   BM_DictDecodingByteArray/8/64                            1155 ns         1152 ns       603532 bytes_per_second=105.942M/s
   BM_DictDecodingByteArray/64/64                           1302 ns         1300 ns       532097 bytes_per_second=751.171M/s
   BM_DictDecodingByteArray/512/64                          1377 ns         1370 ns       514895 bytes_per_second=5.57052G/s
   BM_DictDecodingByteArray/1024/64                         1592 ns         1588 ns       440038 bytes_per_second=9.60928G/s
   BM_DictDecodingByteArray/8/512                           3454 ns         3445 ns       198841 bytes_per_second=35.4355M/s
   BM_DictDecodingByteArray/64/512                          4939 ns         4922 ns       141697 bytes_per_second=198.411M/s
   BM_DictDecodingByteArray/512/512                        21046 ns        20583 ns        34146 bytes_per_second=379.56M/s
   BM_DictDecodingByteArray/1024/512                       40554 ns        37768 ns        17072 bytes_per_second=413.711M/s
   BM_DictDecodingByteArray/8/1024                          5964 ns         5929 ns       118177 bytes_per_second=20.587M/s
   BM_DictDecodingByteArray/64/1024                         9671 ns         9652 ns        72344 bytes_per_second=101.177M/s
   BM_DictDecodingByteArray/512/1024                       39855 ns        38685 ns        17524 bytes_per_second=201.954M/s
   BM_DictDecodingByteArray/1024/1024                      70775 ns        70380 ns        10262 bytes_per_second=222.009M/s
   ```
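
One way to read the two numeric suffixes in the benchmark names: multiplying `items_per_second` by the CPU time of one iteration gives the number of items handled per iteration. The sketch below does this for a few `BM_PlainEncodingByteArray` rows copied from the table. The helper name `items_per_iteration` is hypothetical, and because the reported times are rounded, the result is rounded to the nearest integer.

```python
def items_per_iteration(cpu_ns: float, items_per_second: float) -> int:
    """Estimate items processed in one iteration from two reported columns."""
    # CPU time is reported in nanoseconds, throughput in items per second.
    return round(cpu_ns * 1e-9 * items_per_second)


# (benchmark suffix, CPU time in ns, items_per_second), copied from the table.
rows = [
    ("/8/8", 228, 35.074e6),
    ("/64/8", 370, 21.6312e6),
    ("/8/64", 566, 113.158e6),
    ("/512/512", 19346, 26.4651e6),
    ("/1024/1024", 78708, 13.0102e6),
]

for suffix, cpu_ns, ips in rows:
    print(suffix, items_per_iteration(cpu_ns, ips))
```

For these rows the estimate matches the second suffix (8, 8, 64, 512, 1024), which suggests the second argument is the batch size. What the first argument controls is not stated in this comment.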
   
   Running on x86 may give somewhat different results, because x86 can make 
full use of SIMD unpacking, which could make dict decoding a bit faster.
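
To show where unpacking sits in dictionary decoding, here is a minimal sketch: the encoded data is a dictionary of distinct values (its size is the NDV) plus bit-packed indices, and decoding means unpacking the indices and then looking each one up. This is not Arrow's implementation. Parquet stores dictionary indices in an RLE/bit-packed hybrid, while this sketch assumes plain LSB-first bit-packing only, and all function names are hypothetical.

```python
from typing import List, Sequence


def pack_indices(indices: Sequence[int], bit_width: int) -> bytes:
    """Pack indices LSB-first, using bit_width bits per index."""
    acc = 0
    for i, idx in enumerate(indices):
        if idx < 0 or idx >> bit_width:
            raise ValueError(f"index {idx} does not fit in {bit_width} bits")
        acc |= idx << (i * bit_width)
    n_bytes = (len(indices) * bit_width + 7) // 8
    return acc.to_bytes(n_bytes, "little")


def unpack_indices(data: bytes, bit_width: int, count: int) -> List[int]:
    """Inverse of pack_indices; this is the step SIMD can accelerate."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(acc >> (i * bit_width)) & mask for i in range(count)]


def dict_decode(dictionary: Sequence[bytes], data: bytes,
                bit_width: int, count: int) -> List[bytes]:
    """Unpack the indices, then gather the values from the dictionary."""
    return [dictionary[i] for i in unpack_indices(data, bit_width, count)]


dictionary = [b"apple", b"banana", b"cherry"]  # NDV = 3, so 2 bits per index
indices = [0, 2, 1, 1, 0, 2, 2, 0]
packed = pack_indices(indices, bit_width=2)
decoded = dict_decode(dictionary, packed, bit_width=2, count=len(indices))
print(len(packed), decoded)
```

In this sketch eight values are stored in two bytes of indices, so with a small dictionary the unpack step accounts for a larger share of the decode time.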

