[ https://issues.apache.org/jira/browse/ARROW-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Machanic updated ARROW-3514:
---------------------------------
Description:
The Python code below raises an exception in 0.11.0, but not in 0.10.0.
I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu 16.04,
but not on Windows 7.
The Amazon Linux and CentOS machines are both running zlib 1.2.7, and the
Ubuntu machine is running 1.2.8.
Tested with CPython 3.6 in all cases.
{code:python}
import io
import pyarrow
from pyarrow import parquet
tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
f = io.BytesIO()
parquet.write_table(tbl, f, compression='gzip')
{code}
Following is the exception:
{code}
Traceback (most recent call last):
  File "test_pyarrow.py", line 8, in <module>
    parquet.write_table(tbl, f, compression='gzip')
  File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
    writer.write_table(table, row_group_size=row_group_size)
  File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
    self.writer.write_table(table, row_group_size=row_group_size)
  File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
{code}
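The "output buffer too small" message suggests the compressed size was underestimated for this very small input. This report does not confirm the cause, so the following is only a sketch of one possible factor: for tiny payloads, the gzip wrapper adds a fixed header and trailer on top of the raw deflate stream, so the output can be larger than the input. It uses only Python's standard zlib module, not pyarrow, and the helper name compressed_size is hypothetical.

```python
import zlib


def compressed_size(data: bytes, wbits: int) -> int:
    """Return the compressed length of data for a given zlib wbits setting.

    wbits = -15 -> raw deflate stream (no header or trailer)
    wbits =  15 -> zlib wrapper (2-byte header, 4-byte Adler-32 trailer)
    wbits =  31 -> gzip wrapper (10-byte header, 8-byte CRC-32/size trailer)
    """
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return len(comp.compress(data) + comp.flush())


# Roughly the same amount of data as the two short strings in the report.
payload = b"abcdef"

raw_size = compressed_size(payload, -15)
zlib_size = compressed_size(payload, 15)
gzip_size = compressed_size(payload, 31)

print("input bytes:      ", len(payload))
print("raw deflate bytes:", raw_size)
print("zlib bytes:       ", zlib_size)
print("gzip bytes:       ", gzip_size)
```

If an output buffer were sized from an estimate that ignores the gzip wrapper, an input this small could overflow it. Whether that is what happens inside Arrow 0.11.0 with zlib 1.2.7 and 1.2.8 is an assumption, not something established here.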
was:
The below Python code throws an exception in 0.11.0, but not in 0.10.0.
I was able to reproduce the issue in Amazon Linux, CentOS 7, and Ubuntu 16.04,
but not in Windows 7.
The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu
machine is using 1.2.8.
Tested with CPython 3.6 in all cases.
{code:java}
import io
import pyarrow
from pyarrow import parquet
tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
f = io.BytesIO()
parquet.write_table(tbl, f, compression='gzip')
{code}
> [Python] zlib deflate exception when writing Parquet file
> ---------------------------------------------------------
>
> Key: ARROW-3514
> URL: https://issues.apache.org/jira/browse/ARROW-3514
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 0.11.0
> Environment: Amazon Linux, CentOS 7, Ubuntu 16.04, zlib 1.2.7/1.2.8,
> CPython 3.6.
> Reporter: Adam Machanic
> Priority: Major
> Fix For: 0.10.0
>
>
> The Python code below raises an exception in 0.11.0, but not in 0.10.0.
> I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu
> 16.04, but not on Windows 7.
> The Amazon Linux and CentOS machines are both running zlib 1.2.7, and the
> Ubuntu machine is running 1.2.8.
> Tested with CPython 3.6 in all cases.
> {code:python}
> import io
> import pyarrow
> from pyarrow import parquet
> tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
> f = io.BytesIO()
> parquet.write_table(tbl, f, compression='gzip')
> {code}
> Following is the exception:
> {code}
> Traceback (most recent call last):
>   File "test_pyarrow.py", line 8, in <module>
>     parquet.write_table(tbl, f, compression='gzip')
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
>     writer.write_table(table, row_group_size=row_group_size)
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
>     self.writer.write_table(table, row_group_size=row_group_size)
>   File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
>   File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
> {code}