Ted-Jiang opened a new pull request, #1762:
URL: https://github.com/apache/arrow-rs/pull/1762

   # Which issue does this PR close?
   
   Closes #1761.
   
   # Rationale for this change
   Reading the page index into memory is a prerequisite for applying page-level filtering in the future.
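
As a rough illustration of what page-level filtering could look like once the index is in memory, here is a minimal sketch. It is not the code in this PR: `PageEntry` and `pages_to_read` are hypothetical names, and the struct is a simplified stand-in that combines a page's min/max statistics (from the Parquet column index) with its location (from the offset index), assuming an `i64` column.

```rust
/// Hypothetical, simplified view of one data page: min/max statistics as
/// stored in the column index, plus the page location from the offset index.
#[derive(Debug, Clone, PartialEq)]
pub struct PageEntry {
    /// Smallest value in the page (assumed i64 column).
    pub min: i64,
    /// Largest value in the page.
    pub max: i64,
    /// Index of the page's first row within the row group.
    pub first_row_index: usize,
    /// Byte offset of the page in the file.
    pub offset: u64,
}

/// Return the pages whose [min, max] range overlaps the predicate range
/// [lo, hi]. Pages that cannot contain a match are skipped entirely.
pub fn pages_to_read(pages: &[PageEntry], lo: i64, hi: i64) -> Vec<&PageEntry> {
    pages
        .iter()
        .filter(|p| p.max >= lo && p.min <= hi)
        .collect()
}
```

With three pages covering 0..=9, 10..=19 and 20..=29, a predicate of `12 <= x <= 15` would select only the second page, so the other two would not need to be read or decoded.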
   
   # What changes are included in this PR?
   Add an option to read the page index in `parquet/src/file/serialized_reader.rs`.
   
   # Are there any user-facing changes?
   
   In `parquet-testing`, only `data_index_bloom_encoding_stats.parquet` has a page index, and it covers just one row group.
   I will generate a test file based on `alltypes_plain.parquet` (which does not contain any page index) in the `parquet-testing` repo, and add support for multiple row groups in the next PR.
   
   ```
    parquet-tools column-index ./alltypes_plain.parquet
   row group 0:
   column index for column id:
   NONE
   offset index for column id:
   NONE
   .
   .
   .
   column index for column timestamp_col:
   NONE
   offset index for column timestamp_col:
   ```
   
   

