emkornfield opened a new pull request #8156: URL: https://github.com/apache/arrow/pull/8156
This adds helper methods for reconstructing all necessary metadata for Arrow types. For now this doesn't handle null_slot_usage (i.e. children of FixedSizeList); it throws an exception when nulls are encountered in that case. The unit tests demonstrate how to use the helper methods in combination with LevelInfo (generated from parquet/arrow/schema.h) to reconstruct the metadata.

- Refactors the necessary APIs to use LevelInfo and makes use of them in column_reader.
- Adds implementations for reconstructing list validity bitmaps (one uses rep/def levels directly; the other uses a greater-than bitmap generated from rep/def levels).
- Adds implementations for reconstructing list lengths (one uses rep/def levels directly; the other uses a greater-than bitmap generated from rep/def levels).
- Adds dynamic dispatch for the level comparison algorithms for AVX2 and BMI2.
- Adds a pextract alternative that uses BitRunReader and can be used as a fallback.
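To illustrate the rep/def-level variant of list reconstruction described above, here is a minimal, self-contained sketch. It is not the code from this pull request and does not use the Arrow or Parquet APIs: `SketchLevelInfo`, `ListLayout`, and `ReconstructList` are hypothetical names, and the level semantics are an assumption (a single, non-nested list whose `def_level` is the smallest definition level at which an element slot exists, with `def_level - 1` meaning an empty list and anything lower meaning a null list). It also assumes well-formed input whose first repetition level is 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the per-field level description.
struct SketchLevelInfo {
  int16_t rep_level;  // repetition level that means "continue the current list"
  int16_t def_level;  // smallest definition level at which an element slot exists
};

// Reconstructed list metadata: offsets (lengths are adjacent differences)
// and one validity flag per list.
struct ListLayout {
  std::vector<int32_t> offsets;
  std::vector<bool> validity;
};

// Walks the rep/def levels once and rebuilds offsets and validity for a
// single, non-nested list column. Deeper nesting and repeated ancestors
// are deliberately out of scope for this sketch.
ListLayout ReconstructList(const std::vector<int16_t>& def_levels,
                           const std::vector<int16_t>& rep_levels,
                           SketchLevelInfo info) {
  ListLayout out;
  out.offsets.push_back(0);
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    if (rep_levels[i] < info.rep_level) {
      // A lower repetition level starts a new list, which begins where the
      // previous one ended.
      out.offsets.push_back(out.offsets.back());
      // The list is non-null if it is at least "defined but empty".
      out.validity.push_back(def_levels[i] >= info.def_level - 1);
    }
    if (def_levels[i] >= info.def_level) {
      // An element slot (possibly a null element) extends the current list.
      ++out.offsets.back();
    }
  }
  return out;
}
```

Under these assumptions, the lists `[[1, 2], null, [], [null, 3]]` would be encoded with definition levels `{3, 3, 0, 1, 2, 3}` and repetition levels `{0, 1, 0, 0, 0, 1}`. Calling `ReconstructList` with `SketchLevelInfo{1, 2}` yields offsets `{0, 2, 2, 2, 4}` and validity `{true, false, true, true}`. The bitmap-based variant mentioned in the description would first compare the levels against a threshold to form a bitmap and then derive the same outputs from that bitmap.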