mapleFU commented on issue #40981:
URL: https://github.com/apache/arrow/issues/40981#issuecomment-2038130354

   Aha, no. Whether or not there is a page index, "a", "b", "c", and "d" are 
stored in different pages. There are four levels:
   * File: a whole Parquet file with a single schema, containing zero or more 
row groups.
   * RowGroup: a set of rows that follow the file's schema.
   * Column Chunk: one "leaf" column within one row group. Each Column Chunk 
is made up of pages. In your JSON there are 4 column chunks.
   * Page: a run of values within a Column Chunk, so it always belongs to 
exactly one column. When the Page Index is not enabled and the record is 
nested, some legacy files might have a "cross-page row", e.g. 
"list: [[1, 1, 1], [1]]" stores "[1, 1" and "1]" in different pages. When the 
Page Index is enabled, this does not happen: each page starts on a row boundary.
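
   To make the "cross-page row" case concrete, here is a minimal pure-Python 
sketch. It is a toy model, not the actual Parquet or Arrow writer: the names 
`flatten` and `split_pages` and the `max_values` limit are hypothetical, and 
empty or null lists (which real Parquet handles through definition levels) are 
ignored. It only shows how pages of one list column can be cut either anywhere 
or only at row boundaries.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (repetition_level, value)


def flatten(rows: List[List[int]]) -> List[Entry]:
    """Flatten a list column into (repetition_level, value) pairs.

    Repetition level 0 marks the first value of a new row; level 1 marks a
    value that continues the current row. Empty rows are skipped in this toy.
    """
    entries = []
    for row in rows:
        for i, value in enumerate(row):
            entries.append((0 if i == 0 else 1, value))
    return entries


def split_pages(entries: List[Entry], max_values: int,
                align_to_rows: bool) -> List[List[Entry]]:
    """Cut entries into pages of roughly max_values values.

    align_to_rows=False: cut as soon as the page is full (legacy behaviour,
    a row may straddle two pages).
    align_to_rows=True: cut only where a new row starts, so a page may grow
    past max_values to keep a row intact.
    """
    pages, current = [], []
    for rep, value in entries:
        full = len(current) >= max_values
        if current and full and (rep == 0 or not align_to_rows):
            pages.append(current)
            current = []
        current.append((rep, value))
    if current:
        pages.append(current)
    return pages


rows = [[1, 1, 1], [1]]
legacy = split_pages(flatten(rows), max_values=2, align_to_rows=False)
aligned = split_pages(flatten(rows), max_values=2, align_to_rows=True)

# A page whose first repetition level is not 0 starts in the middle of a row.
print([page[0][0] for page in legacy])   # first rep level of each legacy page
print([page[0][0] for page in aligned])  # first rep level of each aligned page
```

   In the legacy split the second page begins with repetition level 1, i.e. in 
the middle of the row "[1, 1, 1]". In the aligned split every page begins with 
repetition level 0, which matches the behaviour described above for files 
written with the Page Index enabled.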
   

