GitHub user emkornfield added a comment to the discussion: Dictionary page offset logic

> I'm not sure why there is a check that the dictionary page offset is greater
> than 0? If this isn't a dictionary page, should it be not set (first
> condition)?

Looking at the code flow, if the file is malformed (with a negative dictionary_page_offset) we potentially fail to catch the negative offset on line 189.

> Is it possible for the data page offset to equal 0 (when we don't have data
> pages)?

Technically, for a valid Parquet file neither offset should be zero, because Parquet has a magic number as its first four bytes. Without data pages it means the row group is empty, and in theory readers should skip it (e.g. this [PR](https://github.com/apache/parquet-java/pull/1018) does this).

GitHub link: https://github.com/apache/arrow/discussions/48184#discussioncomment-15018665
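
To make the offset reasoning above concrete, here is a minimal sketch of the kind of validation being discussed. It is not the Arrow implementation: `column_chunk_start` and its parameters are hypothetical names, `None` stands in for an unset dictionary_page_offset, and it assumes the row group is non-empty and that a dictionary page, when present, precedes the data pages.

```python
PARQUET_MAGIC = b"PAR1"  # a Parquet file begins with these four bytes


def column_chunk_start(data_page_offset, dictionary_page_offset=None):
    """Return the byte offset where a column chunk starts (hypothetical helper).

    Offsets are absolute file positions. Because the file starts with the
    4-byte magic number, no page can begin before byte 4, so zero and
    negative offsets are both rejected as malformed.
    """
    min_valid = len(PARQUET_MAGIC)

    if data_page_offset < min_valid:
        raise ValueError(f"invalid data_page_offset: {data_page_offset}")

    # Assumption: None means the metadata field was not set at all.
    if dictionary_page_offset is None:
        return data_page_offset

    # Reject the offset explicitly instead of silently ignoring it, so a
    # negative value in a malformed file is caught here.
    if dictionary_page_offset < min_valid:
        raise ValueError(
            f"invalid dictionary_page_offset: {dictionary_page_offset}"
        )

    # Assumption: the dictionary page comes first when it is present.
    if dictionary_page_offset < data_page_offset:
        return dictionary_page_offset
    return data_page_offset
```

With this shape, a negative dictionary_page_offset raises an error rather than falling through to the data page offset, which is the failure mode described above.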
