[ 
https://issues.apache.org/jira/browse/ARROW-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263678#comment-17263678
 ] 

Michael Peleshenko commented on ARROW-7939:
-------------------------------------------

[~apitrou] I'm testing on Windows 10, and the latest nightly wheel I can find 
is pyarrow-2.1.0.dev48, while the Linux wheels go all the way up to dev581. So 
far it has worked fine on my Xeon Silver. I'll let you know the results of my 
colleague's test on his Xeon E5 tomorrow.

> [Python] crashes when reading parquet file compressed with snappy
> -----------------------------------------------------------------
>
>                 Key: ARROW-7939
>                 URL: https://issues.apache.org/jira/browse/ARROW-7939
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.16.0
>         Environment: Windows 7
> python 3.6.9
> pyarrow 0.16 from conda-forge
>            Reporter: Marc Bernot
>            Assignee: Uwe Korn
>            Priority: Major
>             Fix For: 1.0.0
>
>
> After I installed pyarrow 0.16, some parquet files created with pyarrow 
> 0.15.1 made Python crash on read. I drilled down to the simplest example I 
> could find.
> It turns out that some parquet files created with pyarrow 0.16 cannot be 
> read back either. The example below works fine with arrays_ok, but Python 
> crashes with arrays_nok (apparently as soon as there are at least three 
> different values).
> Besides, it works fine with 'none', 'gzip' and 'brotli' compression. The 
> problem seems to happen only with snappy.
> {code:python}
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> # Both of these inputs round-trip without problems.
> arrays_ok = [[0, 1]]
> arrays_ok = [[0, 1, 1]]
> # Three distinct values trigger the crash.
> arrays_nok = [[0, 1, 2]]
> table = pa.Table.from_arrays(arrays_nok, names=['a'])
> pq.write_table(table, 'foo.parquet', compression='snappy')
> pq.read_table('foo.parquet')  # Python crashes here
> {code}
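
Since the failure is a hard crash of the interpreter rather than a Python exception, a try/except around the read will not catch it. One way to compare codecs without losing the test session is to run each round trip in its own child process and look at the exit code. The following is only a rough sketch under that assumption: the helper name `check_codecs`, the `CHILD` script and the temporary file names are made up for illustration and are not part of pyarrow or of the original report.

```python
import os
import subprocess
import sys
import tempfile

# Child script (illustrative): write a one-column table with the given codec
# and read it back. It runs in a separate interpreter so that a hard crash
# only kills the child process.
CHILD = """
import sys
import pyarrow as pa
import pyarrow.parquet as pq

path, codec = sys.argv[1], sys.argv[2]
table = pa.Table.from_arrays([pa.array([0, 1, 2])], names=["a"])
pq.write_table(table, path, compression=codec)
assert pq.read_table(path).equals(table)
"""


def check_codecs(codecs=("none", "gzip", "brotli", "snappy")):
    """Return {codec: exit code}; 0 means the round trip succeeded.

    A non-zero code covers every kind of failure: a crash, a codec that is
    not available in the build, or pyarrow not being installed at all.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for codec in codecs:
            path = os.path.join(tmp, "foo_" + codec + ".parquet")
            proc = subprocess.run(
                [sys.executable, "-c", CHILD, path, codec],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            results[codec] = proc.returncode
    return results
```

Calling `check_codecs()` on an affected machine would be expected to report a non-zero exit code for snappy only, going by the description above; the exit code alone does not say why a child failed, so any failure still needs to be rerun by hand to see the actual error.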



