[ 
https://issues.apache.org/jira/browse/ARROW-17352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oliver Klein updated ARROW-17352:
---------------------------------
    Description: 
Parquet files cannot be opened in Windows Parquet Viewer when stored with Arrow 
Version 9.0.0. It worked when stored with version 8 and earlier.

Windows Parquet Viewer: 2.3.5 and 2.3.6

pyarrow version: 9.0.0

Error: System.AggregateException: One or more errors occurred. ---> 
Parquet.ParquetException: encoding RLE_DICTIONARY is not supported. 

at Parquet.File.DataColumnReader.ReadColumn(BinaryReader reader ... in 
DataColumnReader.cs: line 259

 

After further checking, the problem seems to be related to a change in the 
default Parquet format version.

With pyarrow 9, a file written with version='1.0' opens again in the Windows 
tool; a file written with version='2.4' does not (the tool apparently does not 
support it).

df.to_parquet(r'C:\temp\test_10.parquet', version='1.0')
df.to_parquet(r'C:\temp\test_24.parquet', version='2.4')

The question is whether this default change is a bug or a feature.

 

  was:
Parquet files cannot be opened in Windows Parquet Viewer when stored with Arrow 
Version 9.0.0. It worked when stored with version 8 and earlier.

Windows Parquet Viewer: 2.3.5 and 2.3.6

pyarrow version: 9.0.0

Error: System.AggregateException: One or more errors occured. ---> 
Parquet.ParquetException: encoding RLE_DICTIONARY is not supported. 

at Parquet.File.DataColumnReader.ReadColumn(BinaryReader reader ... in 
DataColumnReader.cs: line 259

 

 


> Parquet files cannot be opened in Windows Parquet Viewer when stored with 
> Arrow Version 9.0.0
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17352
>                 URL: https://issues.apache.org/jira/browse/ARROW-17352
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Parquet
>    Affects Versions: 9.0.0
>         Environment: Windows10
>            Reporter: Oliver Klein
>            Priority: Critical
>         Attachments: arrow9error.PNG
>
>
> Parquet files cannot be opened in Windows Parquet Viewer when stored with 
> Arrow Version 9.0.0. It worked when stored with version 8 and earlier.
> Windows Parquet Viewer: 2.3.5 and 2.3.6
> pyarrow version: 9.0.0
> Error: System.AggregateException: One or more errors occurred. ---> 
> Parquet.ParquetException: encoding RLE_DICTIONARY is not supported. 
> at Parquet.File.DataColumnReader.ReadColumn(BinaryReader reader ... in 
> DataColumnReader.cs: line 259
>  
> After further checking, the problem seems to be related to a change in the 
> default Parquet format version.
> With pyarrow 9, a file written with version='1.0' opens again in the Windows 
> tool; a file written with version='2.4' does not (the tool apparently does 
> not support it).
> df.to_parquet(r'C:\temp\test_10.parquet', version='1.0')
> df.to_parquet(r'C:\temp\test_24.parquet', version='2.4')
> The question is whether this default change is a bug or a feature.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
