[ https://issues.apache.org/jira/browse/ARROW-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376723#comment-17376723 ]
Joris Van den Bossche commented on ARROW-13269:
-----------------------------------------------
Sidenote: there is also the question of whether we should drop partition
columns from the written files of a partitioned dataset at all. Based on a
previous conversation on the mailing list, it seems that other systems don't
exclude those columns. At the time, I opened ARROW-10347 to check that we can
_read_ such datasets (with information duplicated between the partitioning and
the file columns). But maybe we should also consider whether we want to be
able to _write_ such datasets.
> [C++] [Dataset] pyarrow.parquet.write_to_dataset does not send full schema to
> metadata_collector
> ------------------------------------------------------------------------------------------------
>
> Key: ARROW-13269
> URL: https://issues.apache.org/jira/browse/ARROW-13269
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Affects Versions: 4.0.0
> Reporter: Weston Pace
> Priority: Major
>
> If partition columns are specified, the writers only write the non-partition
> columns, so the written files (and the metadata collected from them) do not
> contain the fields used for partitioning.
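
The behaviour described above can be illustrated with a minimal pure-Python
sketch. This is not pyarrow's implementation: `write_partitioned`, the
dict-based "files", the `key=value` directory layout and the shape of the
collected metadata are all hypothetical stand-ins, chosen only to show why a
collector that sees per-file schemas ends up without the partition fields.

```python
from collections import defaultdict


def write_partitioned(rows, partition_cols, metadata_collector):
    """Group rows by partition values and "write" one file per group.

    Hypothetical stand-in for a dataset writer: a file is a list of dicts,
    and a file's schema is the list of column names actually written to it.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in partition_cols)
        # The partition values are encoded in the directory path, so they
        # are dropped from the row that goes into the file.
        groups[key].append(
            {k: v for k, v in row.items() if k not in partition_cols}
        )

    files = {}
    for key, file_rows in groups.items():
        # Assumed key=value directory naming, purely for illustration.
        directory = "/".join(f"{c}={v}" for c, v in zip(partition_cols, key))
        path = directory + "/part-0"
        files[path] = file_rows
        # The collector only ever sees the columns present in the file.
        metadata_collector.append(
            {"path": path, "schema": list(file_rows[0].keys())}
        )
    return files


rows = [
    {"year": 2020, "value": 1.5},
    {"year": 2020, "value": 2.5},
    {"year": 2021, "value": 3.5},
]
collector = []
files = write_partitioned(rows, ["year"], collector)

# Compare the schema of the input with what the collector received.
full_schema = list(rows[0].keys())
missing = [c for c in full_schema if c not in collector[0]["schema"]]
print(missing)  # ['year']
```

In this sketch, sending the full schema to the collector would mean either
keeping the partition columns in the written rows or adding the partition
fields back to the collected schema; which of the two is preferable is the
open question in this issue.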