Julien Le Dem created PARQUET-922:
-------------------------------------

             Summary: Define Index pages when a Parquet file is sorted
                 Key: PARQUET-922
                 URL: https://issues.apache.org/jira/browse/PARQUET-922
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-format
            Reporter: Julien Le Dem
            Assignee: Marcel Kornacker


When a Parquet file is sorted, we can define an index consisting of the boundary 
values for the pages of the columns it is sorted on, together with the offset 
and length of each of those pages in the file.
The goal is to optimize lookup and range-scan queries by using this index to 
read only the pages that contain data matching the filter.
We'd require the pages to be aligned across columns.
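
To make the idea concrete, here is a minimal sketch of how such an index could be
used for a range scan. It is not the proposed spec: the names PageIndexEntry,
pages_for_range and byte_ranges are hypothetical, values are plain integers, and
it assumes ascending sort order with one index entry per page.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PageIndexEntry:
    """Hypothetical index entry: boundary values plus page location."""
    min_value: int   # smallest value in the page
    max_value: int   # largest value in the page
    offset: int      # byte offset of the page in the file
    length: int      # byte length of the page


def pages_for_range(index: List[PageIndexEntry], lo: int, hi: int) -> range:
    """Return ordinals of pages that may hold values in [lo, hi].

    Assumes the column is sorted ascending, so both the page minimums
    and the page maximums are non-decreasing.
    """
    maxes = [e.max_value for e in index]
    mins = [e.min_value for e in index]
    first = bisect_left(maxes, lo)   # first page whose max >= lo
    last = bisect_right(mins, hi)    # one past the last page whose min <= hi
    return range(first, last)


def byte_ranges(index: List[PageIndexEntry], ordinals: range) -> List[Tuple[int, int]]:
    """Map page ordinals to (offset, length) pairs to read from the file."""
    return [(index[i].offset, index[i].length) for i in ordinals]


# Index for the sorted column, and for a second column whose pages are
# aligned with it (same row boundaries, different byte locations).
sorted_col = [
    PageIndexEntry(0, 9, 100, 50),
    PageIndexEntry(10, 19, 150, 60),
    PageIndexEntry(20, 29, 210, 40),
]
other_col = [
    PageIndexEntry(0, 0, 1000, 80),   # boundary values unused for this column
    PageIndexEntry(0, 0, 1080, 90),
    PageIndexEntry(0, 0, 1170, 70),
]

hits = pages_for_range(sorted_col, 12, 22)
print(byte_ranges(sorted_col, hits))
print(byte_ranges(other_col, hits))
```

Because the pages are aligned across columns in this sketch, the page ordinals
selected on the sorted column can be reused directly to locate the matching
pages of every other column, which is one reason the alignment requirement is
convenient.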

[~marcelk] will add a link to the Google Doc for discussing the spec.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
