emkornfield commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-639219042


   > I could try to adapt the Parquet code to use BitBlockCounter and see what
   > the benchmarks look like?
   
   @wesm If you can take a look at, and potentially revise, the benchmarks at
   https://github.com/apache/arrow/blob/7ad49eeca5215d9b2a56b6439f1bd6ea38888ea9/cpp/src/parquet/arrow/reader_writer_benchmark.cc#L238
   to make sure we are aligned on what we are trying to improve, I can update
   this PR accordingly.  I think there are really two options:
   1.  Remove BitRunReader entirely and use BitBlockCounter
   2.  Use BitBlockCounter in addition to BitRunReader
   
   The way to go really depends on what percentage of values we expect to be 
null.  My intuition is that very high rates and very low rates are likely, but 
I think you probably have a better intuition as to the exact definition of high 
or low.
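
   To make the trade-off concrete, here is a rough sketch of the two scanning
   styles over a validity bitmap. Everything in it (GetBit, CountBlocks,
   FindRuns, and the two structs) is a hypothetical stand-in written for
   illustration; it is not the actual BitBlockCounter or BitRunReader code, and
   it assumes an LSB-first bitmap and a fixed 64-bit block size.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): read bit i of an LSB-first bitmap.
inline bool GetBit(const std::vector<uint8_t>& bitmap, int64_t i) {
  return (bitmap[i / 8] >> (i % 8)) & 1;
}

// Block-style result: how many bits the block covers and how many are set.
struct BlockCount {
  int64_t length;
  int64_t popcount;
};

// Run-style result: a maximal stretch of identical bits.
struct BitRun {
  int64_t length;
  bool set;
};

// Block-style scan: summarize the bitmap in blocks of up to 64 bits.
// A caller can take a fast path when popcount == length (no nulls) or
// popcount == 0 (all nulls) and fall back to per-bit work otherwise.
// The per-bit loop here is for clarity only; a tuned version would
// popcount whole words.
std::vector<BlockCount> CountBlocks(const std::vector<uint8_t>& bitmap,
                                    int64_t num_bits) {
  std::vector<BlockCount> out;
  for (int64_t start = 0; start < num_bits; start += 64) {
    const int64_t len = std::min<int64_t>(64, num_bits - start);
    int64_t pop = 0;
    for (int64_t i = 0; i < len; ++i) {
      pop += GetBit(bitmap, start + i) ? 1 : 0;
    }
    out.push_back({len, pop});
  }
  return out;
}

// Run-style scan: emit one entry per maximal run of equal bits.
// Cheap when runs are long, but degrades toward one entry per bit
// when nulls and non-nulls alternate frequently.
std::vector<BitRun> FindRuns(const std::vector<uint8_t>& bitmap,
                             int64_t num_bits) {
  std::vector<BitRun> out;
  for (int64_t i = 0; i < num_bits; ++i) {
    const bool bit = GetBit(bitmap, i);
    if (out.empty() || out.back().set != bit) {
      out.push_back({1, bit});
    } else {
      ++out.back().length;
    }
  }
  return out;
}
```

   With very low or very high null rates, most blocks are all-set or all-unset
   and runs are long, so either style does little work per value. The two
   diverge mainly at intermediate null rates, which is why the expected null
   percentage matters for picking between the options.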



