[
https://issues.apache.org/jira/browse/PARQUET-188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335625#comment-14335625
]
Ryan Blue commented on PARQUET-188:
-----------------------------------
I don't think there is a requirement in the format spec ([docs
here|https://parquet.incubator.apache.org/documentation/latest/]) to write
columns in a specific order. I agree that the order would ideally match the
schema, but I'm surprised that this is causing problems, because the column
chunk metadata contains the offset where each column chunk starts.
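
To illustrate the point about offsets, here is a minimal sketch of a reader that locates a column chunk through its metadata rather than by its position in the row group. It is not parquet-mr or Impala code: `ChunkMeta`, `start_offset`, `total_size`, and `read_chunk` are hypothetical names made up for this example, and the "file" is an in-memory buffer standing in for real column data.

```python
from dataclasses import dataclass
from io import BytesIO


@dataclass
class ChunkMeta:
    """Hypothetical stand-in for a column chunk's metadata entry."""
    path_in_schema: str  # which schema column this chunk belongs to
    start_offset: int    # byte offset where the chunk starts in the file
    total_size: int      # number of bytes the chunk occupies


def read_chunk(f, row_group, column):
    """Return the raw bytes of `column`, located via metadata only."""
    # Index chunks by column path, so physical order does not matter.
    by_path = {c.path_in_schema: c for c in row_group}
    meta = by_path[column]
    f.seek(meta.start_offset)
    return f.read(meta.total_size)


# The schema order is (a, b), but chunk "b" was physically written first.
data = BytesIO(b"BBBBAAA")
row_group = [ChunkMeta("a", 4, 3), ChunkMeta("b", 0, 4)]

print(read_chunk(data, row_group, "a"))
print(read_chunk(data, row_group, "b"))
```

A reader built this way never relies on the order in which chunks were written; it only relies on the recorded offsets being correct.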
> Parquet writes columns out of order (compared to the schema)
> ------------------------------------------------------------
>
> Key: PARQUET-188
> URL: https://issues.apache.org/jira/browse/PARQUET-188
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Reporter: Colin Marc
>
> When building from master, Parquet seems to write row groups with the columns
> in an arbitrary order, not in the same order as the schema. This appears to
> happen regardless of the OutputFormat or WriteSupport used.
> This breaks implementations that assume the columns will be in a specific
> order, in particular Impala.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)