mapleFU commented on PR #34323: URL: https://github.com/apache/arrow/pull/34323#issuecomment-1447528488
@wgtmac @rok

1. Dict decoding for ByteArray is added, but our current benchmark has no NDV (number of distinct values) parameter. I can add an NDV parameter to the dict benchmark in the future.
2. Different batch sizes are added.

The benchmark data is listed below:

```
-------------------------------------------------------------------------------------------------------------
Benchmark                                                Time         CPU  Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------
BM_PlainEncodingByteArray/8/8                          235 ns     228 ns     3065389 byte_array_bytes=104.223M items_per_second=35.074M/s
BM_PlainEncodingByteArray/64/8                         396 ns     370 ns     2015966 byte_array_bytes=637.045M items_per_second=21.6312M/s
BM_PlainEncodingByteArray/512/8                        371 ns     365 ns     1874429 byte_array_bytes=3.37022G items_per_second=21.9365M/s
BM_PlainEncodingByteArray/1024/8                       401 ns     390 ns     1819042 byte_array_bytes=4.31113G items_per_second=20.5191M/s
BM_PlainEncodingByteArray/8/64                         580 ns     566 ns     1230099 byte_array_bytes=334.587M items_per_second=113.158M/s
BM_PlainEncodingByteArray/64/64                        829 ns     816 ns      857780 byte_array_bytes=1.65037G items_per_second=78.449M/s
BM_PlainEncodingByteArray/512/64                      1881 ns    1856 ns      393902 byte_array_bytes=6.6148G items_per_second=34.48M/s
BM_PlainEncodingByteArray/1024/64                     9410 ns    8131 ns       84531 byte_array_bytes=2.7596G items_per_second=7.87144M/s
BM_PlainEncodingByteArray/8/512                       2214 ns    2208 ns      317557 byte_array_bytes=644.641M items_per_second=231.851M/s
BM_PlainEncodingByteArray/64/512                      4019 ns    4016 ns      168215 byte_array_bytes=2.7811G items_per_second=127.486M/s
BM_PlainEncodingByteArray/512/512                    19707 ns   19346 ns       36416 byte_array_bytes=4.69737G items_per_second=26.4651M/s
BM_PlainEncodingByteArray/1024/512                   63092 ns   62419 ns       11561 byte_array_bytes=3.12039G items_per_second=8.20265M/s
BM_PlainEncodingByteArray/8/1024                      4484 ns    4484 ns      155323 byte_array_bytes=637.446M items_per_second=228.367M/s
BM_PlainEncodingByteArray/64/1024                    19180 ns   18981 ns       37781 byte_array_bytes=1.22762G items_per_second=53.9499M/s
BM_PlainEncodingByteArray/512/1024                   86915 ns   80674 ns        8801 byte_array_bytes=2.32632G items_per_second=12.6931M/s
BM_PlainEncodingByteArray/1024/1024                  92043 ns   78708 ns        9747 byte_array_bytes=5.16017G items_per_second=13.0102M/s
BM_DeltaBitLengthEncodingByteArray/8/8                 747 ns     733 ns      960272 byte_array_bytes=32.6492M items_per_second=10.9149M/s
BM_DeltaBitLengthEncodingByteArray/64/8                775 ns     760 ns      905996 byte_array_bytes=286.295M items_per_second=10.5287M/s
BM_DeltaBitLengthEncodingByteArray/512/8               796 ns     780 ns      905621 byte_array_bytes=1.62831G items_per_second=10.2597M/s
BM_DeltaBitLengthEncodingByteArray/1024/8              794 ns     787 ns      885661 byte_array_bytes=2.09902G items_per_second=10.1652M/s
BM_DeltaBitLengthEncodingByteArray/8/64               1261 ns    1231 ns      578441 byte_array_bytes=157.336M items_per_second=51.9844M/s
BM_DeltaBitLengthEncodingByteArray/64/64              1374 ns    1364 ns      510900 byte_array_bytes=982.972M items_per_second=46.9351M/s
BM_DeltaBitLengthEncodingByteArray/512/64             1650 ns    1643 ns      437511 byte_array_bytes=7.34712G items_per_second=38.9574M/s
BM_DeltaBitLengthEncodingByteArray/1024/64           11026 ns    9007 ns       73656 byte_array_bytes=2.40457G items_per_second=7.10521M/s
BM_DeltaBitLengthEncodingByteArray/8/512              6608 ns    6269 ns      116291 byte_array_bytes=236.071M items_per_second=81.6661M/s
BM_DeltaBitLengthEncodingByteArray/64/512             7989 ns    7394 ns       92218 byte_array_bytes=1.52464G items_per_second=69.2451M/s
BM_DeltaBitLengthEncodingByteArray/512/512           58210 ns   41833 ns       17754 byte_array_bytes=2.29012G items_per_second=12.239M/s
BM_DeltaBitLengthEncodingByteArray/1024/512          81322 ns   67902 ns       10659 byte_array_bytes=2.87694G items_per_second=7.54026M/s
BM_DeltaBitLengthEncodingByteArray/8/1024            12956 ns   12113 ns       60335 byte_array_bytes=247.615M items_per_second=84.536M/s
BM_DeltaBitLengthEncodingByteArray/64/1024           26775 ns   22107 ns       30578 byte_array_bytes=993.571M items_per_second=46.3211M/s
BM_DeltaBitLengthEncodingByteArray/512/1024          88313 ns   72388 ns        9790 byte_array_bytes=2.58773G items_per_second=14.146M/s
BM_DeltaBitLengthEncodingByteArray/1024/1024        141532 ns  122762 ns        5944 byte_array_bytes=3.14682G items_per_second=8.34132M/s
BM_PlainDecodingByteArray/8/8                          137 ns     125 ns     5639113 byte_array_bytes=191.73M items_per_second=64.0507M/s
BM_PlainDecodingByteArray/64/8                         127 ns     123 ns     5670358 byte_array_bytes=1.79183G items_per_second=64.9014M/s
BM_PlainDecodingByteArray/512/8                        124 ns     122 ns     5829301 byte_array_bytes=10.4811G items_per_second=65.7388M/s
BM_PlainDecodingByteArray/1024/8                       143 ns     125 ns     5804552 byte_array_bytes=13.7568G items_per_second=64.0992M/s
BM_PlainDecodingByteArray/8/64                         311 ns     246 ns     2895170 byte_array_bytes=787.486M items_per_second=259.683M/s
BM_PlainDecodingByteArray/64/64                        227 ns     226 ns     2851486 byte_array_bytes=5.48626G items_per_second=283.404M/s
BM_PlainDecodingByteArray/512/64                       231 ns     229 ns     3030998 byte_array_bytes=50.8995G items_per_second=279.919M/s
BM_PlainDecodingByteArray/1024/64                      224 ns     223 ns     3125963 byte_array_bytes=102.05G items_per_second=286.763M/s
BM_PlainDecodingByteArray/8/512                       1098 ns    1097 ns      630619 byte_array_bytes=1.28016G items_per_second=466.662M/s
BM_PlainDecodingByteArray/64/512                      1112 ns    1102 ns      645733 byte_array_bytes=10.6759G items_per_second=464.412M/s
BM_PlainDecodingByteArray/512/512                     1200 ns    1161 ns      614634 byte_array_bytes=79.2829G items_per_second=441.183M/s
BM_PlainDecodingByteArray/1024/512                    1229 ns    1174 ns      603589 byte_array_bytes=162.913G items_per_second=436.226M/s
BM_PlainDecodingByteArray/8/1024                      2290 ns    2186 ns      312444 byte_array_bytes=1.28227G items_per_second=468.407M/s
BM_PlainDecodingByteArray/64/1024                     2632 ns    2300 ns      303235 byte_array_bytes=9.85301G items_per_second=445.132M/s
BM_PlainDecodingByteArray/512/1024                    3790 ns    3536 ns      200536 byte_array_bytes=53.0065G items_per_second=289.603M/s
BM_PlainDecodingByteArray/1024/1024                   3198 ns    3178 ns      222884 byte_array_bytes=117.997G items_per_second=322.171M/s
BM_DeltaBitLengthDecodingByteArray/8/8                 689 ns     672 ns     1045229 byte_array_bytes=35.5378M items_per_second=11.9118M/s
BM_DeltaBitLengthDecodingByteArray/64/8                751 ns     743 ns      903342 byte_array_bytes=285.456M items_per_second=10.7662M/s
BM_DeltaBitLengthDecodingByteArray/512/8              1055 ns    1051 ns      655879 byte_array_bytes=1.17927G items_per_second=7.61373M/s
BM_DeltaBitLengthDecodingByteArray/1024/8             1179 ns    1170 ns      603204 byte_array_bytes=1.42959G items_per_second=6.83965M/s
BM_DeltaBitLengthDecodingByteArray/8/64                829 ns     819 ns      855254 byte_array_bytes=232.629M items_per_second=78.1025M/s
BM_DeltaBitLengthDecodingByteArray/64/64              1184 ns    1176 ns      602213 byte_array_bytes=1.15866G items_per_second=54.4343M/s
BM_DeltaBitLengthDecodingByteArray/512/64             4279 ns    4276 ns      163854 byte_array_bytes=2.7516G items_per_second=14.9689M/s
BM_DeltaBitLengthDecodingByteArray/1024/64            7645 ns    7615 ns       92953 byte_array_bytes=3.03454G items_per_second=8.40392M/s
BM_DeltaBitLengthDecodingByteArray/8/512              1925 ns    1922 ns      363082 byte_array_bytes=737.056M items_per_second=266.345M/s
BM_DeltaBitLengthDecodingByteArray/64/512             5109 ns    5044 ns      139495 byte_array_bytes=2.30627G items_per_second=101.512M/s
BM_DeltaBitLengthDecodingByteArray/512/512           41306 ns   40867 ns       16609 byte_array_bytes=2.14243G items_per_second=12.5285M/s
BM_DeltaBitLengthDecodingByteArray/1024/512          80874 ns   79691 ns        8941 byte_array_bytes=2.41324G items_per_second=6.42482M/s
BM_DeltaBitLengthDecodingByteArray/8/1024             3208 ns    3206 ns      219540 byte_array_bytes=900.992M items_per_second=319.394M/s
BM_DeltaBitLengthDecodingByteArray/64/1024            9308 ns    9303 ns       74957 byte_array_bytes=2.43558G items_per_second=110.077M/s
BM_DeltaBitLengthDecodingByteArray/512/1024          80116 ns   80005 ns        8506 byte_array_bytes=2.24834G items_per_second=12.7992M/s
BM_DeltaBitLengthDecodingByteArray/1024/1024        154870 ns  154196 ns        4420 byte_array_bytes=2.34G items_per_second=6.64089M/s
BM_PlainDecodingSpacedByteArray/8/8                   10.4 ns    10.4 ns    67218499 byte_array_bytes=2.28543G items_per_second=772.044M/s null_percent=2
BM_PlainDecodingSpacedByteArray/64/8                  10.5 ns    10.5 ns    66603235 byte_array_bytes=21.0466G items_per_second=762.433M/s null_percent=2
BM_PlainDecodingSpacedByteArray/512/8                 10.4 ns    10.4 ns    68236097 byte_array_bytes=122.689G items_per_second=770.475M/s null_percent=2
BM_PlainDecodingSpacedByteArray/1024/8                10.8 ns    10.8 ns    67406209 byte_array_bytes=159.753G items_per_second=743.078M/s null_percent=2
BM_PlainDecodingSpacedByteArray/8/64                   144 ns     144 ns     4810600 byte_array_bytes=1.29405G items_per_second=444.822M/s null_percent=2
BM_PlainDecodingSpacedByteArray/64/64                  144 ns     144 ns     4845364 byte_array_bytes=9.13836G items_per_second=444.619M/s null_percent=2
BM_PlainDecodingSpacedByteArray/512/64                 152 ns     148 ns     4719684 byte_array_bytes=77.7615G items_per_second=431.605M/s null_percent=2
BM_PlainDecodingSpacedByteArray/1024/64                159 ns     151 ns     4773628 byte_array_bytes=152.04G items_per_second=424.058M/s null_percent=2
BM_PlainDecodingSpacedByteArray/8/512                 1216 ns    1180 ns      583946 byte_array_bytes=1.17023G items_per_second=433.89M/s null_percent=2
BM_PlainDecodingSpacedByteArray/64/512                1173 ns    1164 ns      611551 byte_array_bytes=9.93403G items_per_second=439.96M/s null_percent=2
BM_PlainDecodingSpacedByteArray/512/512               1205 ns    1193 ns      598792 byte_array_bytes=75.7496G items_per_second=429.141M/s null_percent=2
BM_PlainDecodingSpacedByteArray/1024/512              1201 ns    1181 ns      594233 byte_array_bytes=157.627G items_per_second=433.348M/s null_percent=2
BM_PlainDecodingSpacedByteArray/8/1024                2344 ns    2330 ns      280204 byte_array_bytes=1.12922G items_per_second=439.487M/s null_percent=2
BM_PlainDecodingSpacedByteArray/64/1024               2323 ns    2312 ns      304577 byte_array_bytes=9.66727G items_per_second=442.964M/s null_percent=2
BM_PlainDecodingSpacedByteArray/512/1024              4764 ns    4707 ns      154352 byte_array_bytes=39.936G items_per_second=217.534M/s null_percent=2
BM_PlainDecodingSpacedByteArray/1024/1024             3067 ns    3061 ns      217958 byte_array_bytes=113.274G items_per_second=334.54M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/8/8           287 ns     271 ns     2666616 byte_array_bytes=90.6649M items_per_second=29.4838M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/64/8          324 ns     324 ns     2148043 byte_array_bytes=678.782M items_per_second=24.7125M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/512/8         630 ns     629 ns     1120197 byte_array_bytes=2.01411G items_per_second=12.7162M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/1024/8        761 ns     753 ns      947816 byte_array_bytes=2.24632G items_per_second=10.6185M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/8/64          444 ns     444 ns     1604485 byte_array_bytes=431.606M items_per_second=144.24M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/64/64         816 ns     811 ns      870041 byte_array_bytes=1.6409G items_per_second=78.9493M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/512/64       3914 ns    3903 ns      177444 byte_array_bytes=2.92357G items_per_second=16.3984M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/1024/64      8412 ns    7518 ns       97220 byte_array_bytes=3.09646G items_per_second=8.51239M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/8/512        2265 ns    1855 ns      391519 byte_array_bytes=784.604M items_per_second=276.042M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/64/512       5374 ns    5112 ns      100000 byte_array_bytes=1.6244G items_per_second=100.165M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/512/512     32772 ns   30121 ns       23796 byte_array_bytes=3.01029G items_per_second=16.9983M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/1024/512    58609 ns   58345 ns       11690 byte_array_bytes=3.10091G items_per_second=8.77535M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/8/1024       3191 ns    3178 ns      222474 byte_array_bytes=896.57M items_per_second=322.247M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/64/1024      9198 ns    9125 ns       75708 byte_array_bytes=2.40297G items_per_second=112.224M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/512/1024    58358 ns   58147 ns       12110 byte_array_bytes=3.13326G items_per_second=17.6106M/s null_percent=2
BM_DeltaBitLengthDecodingSpacedByteArray/1024/1024  120708 ns  115508 ns        6136 byte_array_bytes=3.18891G items_per_second=8.86519M/s null_percent=2
BM_DictDecodingByteArray/8/8                           904 ns     893 ns      750035 bytes_per_second=136.719M/s
BM_DictDecodingByteArray/64/8                          931 ns     923 ns      768133 bytes_per_second=1057.74M/s
BM_DictDecodingByteArray/512/8                         907 ns     905 ns      758651 bytes_per_second=8.42918G/s
BM_DictDecodingByteArray/1024/8                        950 ns     930 ns      763309 bytes_per_second=16.4053G/s
BM_DictDecodingByteArray/8/64                         1155 ns    1152 ns      603532 bytes_per_second=105.942M/s
BM_DictDecodingByteArray/64/64                        1302 ns    1300 ns      532097 bytes_per_second=751.171M/s
BM_DictDecodingByteArray/512/64                       1377 ns    1370 ns      514895 bytes_per_second=5.57052G/s
BM_DictDecodingByteArray/1024/64                      1592 ns    1588 ns      440038 bytes_per_second=9.60928G/s
BM_DictDecodingByteArray/8/512                        3454 ns    3445 ns      198841 bytes_per_second=35.4355M/s
BM_DictDecodingByteArray/64/512                       4939 ns    4922 ns      141697 bytes_per_second=198.411M/s
BM_DictDecodingByteArray/512/512                     21046 ns   20583 ns       34146 bytes_per_second=379.56M/s
BM_DictDecodingByteArray/1024/512                    40554 ns   37768 ns       17072 bytes_per_second=413.711M/s
BM_DictDecodingByteArray/8/1024                       5964 ns    5929 ns      118177 bytes_per_second=20.587M/s
BM_DictDecodingByteArray/64/1024                      9671 ns    9652 ns       72344 bytes_per_second=101.177M/s
BM_DictDecodingByteArray/512/1024                    39855 ns   38685 ns       17524 bytes_per_second=201.954M/s
BM_DictDecodingByteArray/1024/1024                   70775 ns   70380 ns       10262 bytes_per_second=222.009M/s
```

Running on x86 may give somewhat different results, because x86 can make full use of SIMD unpack, which could make dict decoding a bit faster.
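
The comment does not say what the two benchmark arguments mean. As a rough sanity check, the sketch below assumes the second argument is the number of items processed per iteration (the batch size). This is an inference from the numbers, not something stated in the PR, and `items_per_second` is a hypothetical helper rather than part of the Arrow benchmark code. Under that assumption, the reported `items_per_second` counter should be close to the batch size divided by the CPU time:

```python
def items_per_second(batch_size: int, cpu_time_ns: float) -> float:
    """Throughput implied by one iteration that handles `batch_size` items."""
    return batch_size / (cpu_time_ns * 1e-9)


# (benchmark name, assumed batch size, CPU time in ns, reported items_per_second)
# The CPU times and reported rates are copied from the output above.
rows = [
    ("BM_PlainEncodingByteArray/8/8", 8, 228, 35.074e6),
    ("BM_PlainDecodingByteArray/1024/1024", 1024, 3178, 322.171e6),
    ("BM_DeltaBitLengthDecodingByteArray/512/64", 64, 4276, 14.9689e6),
]

for name, batch_size, cpu_ns, reported in rows:
    derived = items_per_second(batch_size, cpu_ns)
    # Printed CPU times are rounded, so only approximate agreement is expected.
    rel_err = abs(derived - reported) / reported
    print(f"{name}: derived {derived / 1e6:.2f}M/s, "
          f"reported {reported / 1e6:.2f}M/s, rel. error {rel_err:.2%}")
```

If the derived and reported rates agree to within rounding, the rows can be compared by per-item cost (CPU time divided by the second argument) instead of raw time per iteration.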