[ 
https://issues.apache.org/jira/browse/ARROW-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321148#comment-17321148
 ] 

Joris Van den Bossche commented on ARROW-12203:
-----------------------------------------------

Regarding RLE_DICTIONARY: once it is enabled (currently by specifying 
{{format="2.0"}}), is it actually used for many types? I am not very familiar 
with the logic that decides which encoding is used when writing, but it would 
help to have an idea of the impact of enabling it on compatibility with other 
readers such as fastparquet.

On another note, this is still tagged for 4.0, but it might not be the best 
change to make just before the release. It might be safer to switch right 
after the 4.0 release, so that we have some time to gather feedback (although 
that depends on how many people use the dev version, of course).

> [C++][Python] Switch default Parquet version to 2.0
> ---------------------------------------------------
>
>                 Key: ARROW-12203
>                 URL: https://issues.apache.org/jira/browse/ARROW-12203
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++, Python
>            Reporter: Antoine Pitrou
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Currently, Parquet write APIs default to maximum-compatibility Parquet 
> version "1.0", which disables some logical types such as UINT32. We may want 
> to switch the default to "2.0" instead, to allow faithful representation of 
> more types.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
