Joris Van den Bossche created ARROW-10482:
---------------------------------------------

             Summary: [Python] Specifying compression type on a column basis 
when writing Parquet not working
                 Key: ARROW-10482
                 URL: https://issues.apache.org/jira/browse/ARROW-10482
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Joris Van den Bossche


From 
https://stackoverflow.com/questions/64666270/using-per-column-compression-codec-in-parquet-write-table

According to the docs, you can specify the compression type on a 
column-by-column basis, but that doesn't seem to be working:

{code}
In [5]: table = pa.table([[1, 2], [3, 4], [5, 6]], names=["foo", "bar", "baz"])

In [6]: pq.write_table(table, 'test1.parquet',
   ...:                compression=dict(foo='zstd', bar='snappy', baz='brotli'))
...
~/scipy/repos/arrow/python/pyarrow/_parquet.cpython-37m-x86_64-linux-gnu.so in 
string.from_py.__pyx_convert_string_from_py_std__in_string()

TypeError: expected bytes, str found
{code}
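The traceback points at a Cython std::string conversion that accepts bytes but not str. Until this is fixed, one possible (untested) workaround suggested by the error is to pre-encode the per-column mapping to bytes before passing it to {{pq.write_table}}. The {{normalize_compression}} helper below is a hypothetical sketch, not part of pyarrow:

{code}
def normalize_compression(compression):
    """Return the compression argument with str keys/values encoded to bytes.

    Hypothetical helper (not a pyarrow API): assumes the failing Cython
    conversion would accept bytes where it currently rejects str.
    """
    if not isinstance(compression, dict):
        # A single codec name (e.g. 'snappy') is unaffected; pass through.
        return compression
    return {
        (k.encode() if isinstance(k, str) else k):
        (v.encode() if isinstance(v, str) else v)
        for k, v in compression.items()
    }
{code}

For example, {{normalize_compression(dict(foo='zstd', bar='snappy'))}} yields {{{b'foo': b'zstd', b'bar': b'snappy'}}}; whether the writer then accepts that mapping is an assumption that would need to be verified against the reproduction above.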




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
