wesm edited a comment on pull request #7143: URL: https://github.com/apache/arrow/pull/7143#issuecomment-638534173
OK, I adapted the benchmark here to use the `BitmapScanner` from #7346:

https://github.com/wesm/arrow/tree/bit-runner

```
------------------------------------------------------------------
Benchmark                          Time        CPU  Iterations
------------------------------------------------------------------
BitRunReader/-1                 9890 ns    9890 ns       70683 49.3705MB/s
BitRunReader/0                   108 ns     108 ns     6250693 4.423GB/s
BitRunReader/10                 2101 ns    2101 ns      334686 232.36MB/s
BitRunReader/25                 4072 ns    4072 ns      173114 119.915MB/s
BitRunReader/50                 5221 ns    5221 ns      133040 93.5178MB/s
BitRunReader/60                 5042 ns    5042 ns      138099 96.8386MB/s
BitRunReader/75                 3933 ns    3933 ns      179857 124.152MB/s
BitRunReader/99                  291 ns     291 ns     2412105 1.6376GB/s
BitRunReaderWithScanner/-1        47 ns      47 ns    15059881 10.2331GB/s
BitRunReaderWithScanner/0         46 ns      46 ns    15078363 10.2704GB/s
BitRunReaderWithScanner/10        47 ns      47 ns    15118172 10.2299GB/s
BitRunReaderWithScanner/25        47 ns      47 ns    15033144 10.2528GB/s
BitRunReaderWithScanner/50        47 ns      47 ns    14947443 10.1964GB/s
BitRunReaderWithScanner/60        47 ns      47 ns    14668505 10.0837GB/s
BitRunReaderWithScanner/75        46 ns      46 ns    15045334 10.2918GB/s
BitRunReaderWithScanner/99        46 ns      46 ns    14961067 10.2813GB/s
BitRunReaderScalar/-1          13089 ns   13088 ns       51449 37.3063MB/s
BitRunReaderScalar/0            3844 ns    3844 ns      176221 127.024MB/s
BitRunReaderScalar/10           6621 ns    6621 ns      104648 73.7517MB/s
BitRunReaderScalar/25          12397 ns   12397 ns       55998 39.388MB/s
BitRunReaderScalar/50          17099 ns   17099 ns       41378 28.556MB/s
BitRunReaderScalar/60          16606 ns   16606 ns       42580 29.4046MB/s
BitRunReaderScalar/75          11431 ns   11431 ns       61744 42.7165MB/s
BitRunReaderScalar/99           4265 ns    4265 ns      167402 114.484MB/s
```

This isn't an apples-to-apples comparison at all, because the scanner just popcounts 256 bits at a time; it doesn't segment null runs from non-null runs. If the goal is to accelerate the writing of mostly-non-null data, is it worth going to all the trouble of exactly delimiting the start and end point of each run?
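To make the distinction concrete, here is a minimal sketch of the two approaches being compared. It is not the `BitmapScanner` or `BitRunReader` code from either PR; all names (`ScanBlock`, `DelimitRuns`, `Classify`, and the structs) are hypothetical, and the bitmap is assumed to be stored as 64-bit words read least-significant bit first.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; not the Arrow API.

struct BlockCount {
  int length;    // number of bits covered by the block (up to 256)
  int popcount;  // how many of those bits are set
};

// Popcount up to four 64-bit words (256 bits) starting at `word_index`.
// Only a count comes back; the positions of the set bits stay unknown.
BlockCount ScanBlock(const std::vector<uint64_t>& words, size_t word_index) {
  BlockCount out{0, 0};
  for (size_t i = word_index; i < words.size() && i < word_index + 4; ++i) {
    out.length += 64;
    out.popcount += static_cast<int>(std::bitset<64>(words[i]).count());
  }
  return out;
}

enum class BlockKind { NoneSet, AllSet, Mixed };

// A writer can pick a fast path per block from the count alone.
BlockKind Classify(const BlockCount& block) {
  if (block.popcount == 0) return BlockKind::NoneSet;
  if (block.popcount == block.length) return BlockKind::AllSet;
  return BlockKind::Mixed;
}

struct BitRun {
  int64_t length;  // number of consecutive equal bits
  bool set;        // value shared by every bit in the run
};

// Delimit every run exactly by walking the bitmap one bit at a time.
std::vector<BitRun> DelimitRuns(const std::vector<uint64_t>& words) {
  std::vector<BitRun> runs;
  for (uint64_t word : words) {
    for (int bit = 0; bit < 64; ++bit) {
      bool value = (word >> bit) & 1;
      if (!runs.empty() && runs.back().set == value) {
        ++runs.back().length;  // extend the current run
      } else {
        runs.push_back({1, value});  // start a new run
      }
    }
  }
  return runs;
}
```

With mostly-non-null data, most blocks would classify as `AllSet` and could be written in bulk, with only the `Mixed` blocks needing a per-bit fallback. Whether that beats exact run delimiting in practice is the open question above, not something this sketch measures.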
