[ https://issues.apache.org/jira/browse/PARQUET-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Szadovszky reassigned PARQUET-2034:
-----------------------------------------

    Assignee: Gabor Szadovszky

> Document dictionary page position
> ---------------------------------
>
>                 Key: PARQUET-2034
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2034
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-format
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>            Priority: Major
>
> The dictionary page shall always be written to the first position of the
> column chunk. Unfortunately, we only have one statement about this, "hidden"
> in the [encodings
> doc|https://github.com/apache/parquet-format/blob/master/Encodings.md#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8]:
> {quote}The dictionary page is written first, before the data pages of the
> column chunk.{quote}
> This statement is not emphasized enough and is not prepared for a potential
> extension of the available page types. It should also be stated in a more
> central place in the specification and in the thrift file.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)