[ 
https://issues.apache.org/jira/browse/ARROW-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425933#comment-17425933
 ] 

Micah Kornfield commented on ARROW-14196:
-----------------------------------------

Yeah, I think it is really only the name change that I'm concerned about.  
https://issues.apache.org/jira/browse/ARROW-13151 has another example where 
people were trying to reference things by path, which was broken for other 
reasons.

A few items that don't solve every case but would make things better, or at 
least adaptable in the long term:

1.  Add an option that translates the "compliant nested type" names back to 
Arrow's naming scheme.

2.  Make it possible to select columns by eliding the list name components.
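
To make the two items above concrete, here is a minimal sketch in plain Python 
that operates on dotted Parquet column paths. It assumes the non-compliant 
writer names the list element "item" and the compliant writer names it 
"element", both under a "list" group. The function names to_arrow_path and 
elide_list_components are hypothetical and are not part of the Arrow API.

```python
from typing import List

# Assumption: the legacy Arrow layout names the list element "item", while the
# Parquet-compliant layout names it "element"; both sit under a "list" group.
ARROW_ELEMENT = "item"
COMPLIANT_ELEMENT = "element"


def to_arrow_path(path: str) -> str:
    """Item 1: rewrite 'list.element' pairs to 'list.item' in a dotted path."""
    parts = path.split(".")
    out: List[str] = []
    for i, part in enumerate(parts):
        # Only rename "element" when it directly follows a "list" group.
        if part == COMPLIANT_ELEMENT and i > 0 and parts[i - 1] == "list":
            out.append(ARROW_ELEMENT)
        else:
            out.append(part)
    return ".".join(out)


def elide_list_components(path: str) -> str:
    """Item 2: drop 'list.<element name>' pairs so both layouts share a path."""
    parts = path.split(".")
    out: List[str] = []
    i = 0
    while i < len(parts):
        is_list_pair = (
            i > 0  # a top-level column called "list" is kept as-is
            and parts[i] == "list"
            and i + 1 < len(parts)
            and parts[i + 1] in (ARROW_ELEMENT, COMPLIANT_ELEMENT)
        )
        if is_list_pair:
            i += 2  # skip both the "list" group and its element name
        else:
            out.append(parts[i])
            i += 1
    return ".".join(out)
```

With this sketch, "a.list.element.b" and "a.list.item.b" both reduce to "a.b", 
so a selection written against the elided form would not depend on which 
writer setting produced the file. A struct field that is literally named 
"list" with a child named "item" or "element" would be misread, which is one 
of the cases this does not solve.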

Another question, which is dataset specific: if one file was written with 
compliant nested types and one was not, and both were read in the same 
dataset, are the results sane (do the schemas get unified)?

> [C++][Parquet] Default to compliant nested types in Parquet writer
> ------------------------------------------------------------------
>
>                 Key: ARROW-14196
>                 URL: https://issues.apache.org/jira/browse/ARROW-14196
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Parquet
>            Reporter: Joris Van den Bossche
>            Priority: Major
>
> In C++ there is already an option to get the "compliant_nested_types" (to 
> have the list columns follow the Parquet specification), and ARROW-11497 
> exposed this option in Python.
> This is still set to False by default, but in the source it says "TODO: At 
> some point we should flip this.", and in ARROW-11497 there was also some 
> discussion about what it would take to change the default.
> cc [~emkornfield] [~apitrou]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
