[ https://issues.apache.org/jira/browse/ARROW-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Machanic updated ARROW-3514:
---------------------------------
    Description: 
The Python code below raises an exception in 0.11.0, but not in 0.10.0.

I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu 16.04, 
but not on Windows 7.

The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu 
machine is using 1.2.8.

Tested with CPython 3.6 in all cases.
{code:python}
import io
import pyarrow
from pyarrow import parquet

tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])

f = io.BytesIO()
parquet.write_table(tbl, f, compression='gzip')
{code}

The exception is as follows:

{code}
Traceback (most recent call last):
  File "test_pyarrow.py", line 8, in <module>
    parquet.write_table(tbl, f, compression='gzip')
  File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
    writer.write_table(table, row_group_size=row_group_size)
  File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
    self.writer.write_table(table, row_group_size=row_group_size)
  File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
{code}

  was:
The below Python code throws an exception in 0.11.0, but not in 0.10.0.

I was able to reproduce the issue in Amazon Linux, CentOS 7, and Ubuntu 16.04, 
but not in Windows 7.

The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu 
machine is using 1.2.8.

Tested with CPython 3.6 in all cases.
{code:java}
import io
import pyarrow
from pyarrow import parquet

tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])

f = io.BytesIO()
parquet.write_table(tbl, f, compression='gzip')
{code}


> [Python] zlib deflate exception when writing Parquet file
> ---------------------------------------------------------
>
>                 Key: ARROW-3514
>                 URL: https://issues.apache.org/jira/browse/ARROW-3514
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 0.11.0
>         Environment: Amazon Linux, CentOS 7, Ubuntu 16.04, zlib 1.2.7/1.2.8, 
> CPython 3.6.
>            Reporter: Adam Machanic
>            Priority: Major
>             Fix For: 0.10.0
>
>
> The Python code below raises an exception in 0.11.0, but not in 0.10.0.
> I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu 
> 16.04, but not on Windows 7.
> The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu 
> machine is using 1.2.8.
> Tested with CPython 3.6 in all cases.
> {code:python}
> import io
> import pyarrow
> from pyarrow import parquet
> tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
> f = io.BytesIO()
> parquet.write_table(tbl, f, compression='gzip')
> {code}
> The exception is as follows:
> {code}
> Traceback (most recent call last):
>   File "test_pyarrow.py", line 8, in <module>
>     parquet.write_table(tbl, f, compression='gzip')
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
>     writer.write_table(table, row_group_size=row_group_size)
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
>     self.writer.write_table(table, row_group_size=row_group_size)
>   File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
>   File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
> {code}
> {code}
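
Since the report depends on the platform, the zlib version, and the Python version, it may help anyone trying to reproduce it to collect those details in one place. The following is a minimal sketch, not part of the original report; the function name environment_report is hypothetical. It assumes that the zlib version exposed by Python's own zlib module is a useful hint, although pyarrow packages may bundle or link a different zlib, so the values can differ from what Arrow actually uses.

```python
import platform
import sys
import zlib


def environment_report():
    """Collect OS, Python, zlib, and pyarrow version details as a dict.

    environment_report is a hypothetical helper for illustration only.
    """
    report = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        # zlib version the interpreter was compiled against.
        "zlib_compiled": zlib.ZLIB_VERSION,
        # zlib version the interpreter actually loaded at runtime.
        # Assumption: pyarrow may use a different zlib than this one.
        "zlib_runtime": zlib.ZLIB_RUNTIME_VERSION,
    }
    try:
        import pyarrow
        report["pyarrow"] = pyarrow.__version__
    except ImportError:
        # pyarrow is not installed in this environment.
        report["pyarrow"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running the script next to the reproduction above and attaching its output would make it easier to compare the failing Linux environments with the working Windows one.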



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
