[jira] [Commented] (ARROW-13269) [C++] [Dataset] pyarrow.parquet.write_to_dataset does not send full schema to metadata_collector

Weston Pace (Jira) Tue, 06 Jul 2021 15:26:05 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376073#comment-17376073
 ]


Weston Pace commented on ARROW-13269:
-------------------------------------

Another potential fix would be to modify pyarrow.parquet.write_metadata.  The 
function currently takes the table schema (which includes the partition 
columns) and the collected metadata (which does not).  It could therefore add 
the missing columns from the table schema to the collected metadata.
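
To make the idea concrete, here is a minimal, library-independent sketch.  It 
models a schema as an ordered list of (name, type) pairs; it does not use 
pyarrow and does not reflect how write_metadata is implemented.  The helper 
names (missing_fields, add_partition_fields) and the example columns are 
hypothetical, chosen only for illustration.

```python
# Model only: a schema is an ordered list of (name, type) pairs.
# This is NOT pyarrow's API; it just illustrates the proposed merge step.

def missing_fields(table_schema, collected_schema):
    """Return fields that are in table_schema but not in collected_schema."""
    collected_names = {name for name, _ in collected_schema}
    return [(name, typ) for name, typ in table_schema
            if name not in collected_names]


def add_partition_fields(table_schema, collected_schema):
    """Append the fields the writers dropped to the collected schema."""
    return list(collected_schema) + missing_fields(table_schema,
                                                   collected_schema)


# Hypothetical table partitioned on "year".
table_schema = [("year", "int32"), ("value", "float64")]
# The writers only saw the non-partition columns.
collected_schema = [("value", "float64")]

merged = add_partition_fields(table_schema, collected_schema)
print(missing_fields(table_schema, collected_schema))
print(merged)
```

In this model the partition columns end up appended after the data columns.  
Whether a real fix should preserve the original column order is an open 
design question that this sketch does not settle.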

> [C++] [Dataset] pyarrow.parquet.write_to_dataset does not send full schema to 
> metadata_collector
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-13269
>                 URL: https://issues.apache.org/jira/browse/ARROW-13269
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 4.0.0
>            Reporter: Weston Pace
>            Priority: Major
>
> If partition columns are specified, the writers write only the 
> non-partition columns, so the metadata they pass to metadata_collector does 
> not contain the fields used for partitioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
