I am using Apache Arrow to write Parquet files. I am writing an uncompressed,
non-dictionary-encoded Parquet file with pyarrow.parquet, but the offsets do
not line up when I inspect the file with parquet-tools. For example, adding a
row group's offset to its size does not give the offset of the next row
group. Can anyone tell me why this is happening? The discrepancy also varies
from one row group to the next. I have older Parquet files written with
BIT-PACKED encoding, and in those files the offset/size arithmetic works out
exactly. How can I write Parquet files with a similar layout now that
BIT-PACKED encoding is deprecated?
Thanks a lot

