Xinli Shang created PARQUET-2093:
------------------------------------

             Summary: Add rewriter version to Parquet footer 
                 Key: PARQUET-2093
                 URL: https://issues.apache.org/jira/browse/PARQUET-2093
             Project: Parquet
          Issue Type: Improvement
    Affects Versions: 1.13.0
            Reporter: Xinli Shang
            Assignee: Xinli Shang


The Parquet footer records the writer's version in the 'created_by' field. As we 
introduce several rewriters, a new file may be written partially by a rewriter. 
In that case, we need to record the rewriter's version as well. 

Some questions (about a common rewriter) we need to answer before moving forward:

- Where should the rewriter versions be stored? (A new dedicated field or 
  key-value metadata? If key-value metadata, which key shall we use?)
- Shall we also record what the rewriter has done? If so, how?
- At what level of rewriting shall we copy the original created_by field, and 
  at what level shall we write the rewriter's version to that field instead? 
  (What different levels are possible?)
- Once this rewriter(s) field is introduced, any fix that depends on the writer 
  version needs to check this field as well, not only created_by.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
