[ 
https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Rosenthaler updated ARROW-11629:
-----------------------------------------
    Description: 
When I read the attached CSV file with pyarrow, cast the float64 columns to 
float32, and export the table to Parquet, the resulting Parquet file is 
corrupted. Apache Drill and Parquet.Net can no longer read it.

 

Update: The bug is in the "*Dictionary Encoding*" feature. If I switch it off 
for float columns, everything works as expected.

  was: When I read the attached CSV file with pyarrow, cast the float64 
columns to float32, and export the table to Parquet, the resulting Parquet 
file is corrupted. Apache Drill and Parquet.Net can no longer read it.


> [C++] Writing float32 values makes parquet files not readable for some tools
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-11629
>                 URL: https://issues.apache.org/jira/browse/ARROW-11629
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 3.0.0
>            Reporter: Matthias Rosenthaler
>            Priority: Major
>         Attachments: foo.parquet, image-2021-02-15-15-49-41-908.png, 
> output.csv, output.parquet
>
>
> When I read the attached CSV file with pyarrow, cast the float64 columns to 
> float32, and export the table to Parquet, the resulting Parquet file is 
> corrupted. Apache Drill and Parquet.Net can no longer read it.
>  
> Update: The bug is in the "*Dictionary Encoding*" feature. If I switch it 
> off for float columns, everything works as expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
