[ 
https://issues.apache.org/jira/browse/ARROW-18370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Zhu updated ARROW-18370:
---------------------------
    Description: 
`ds.write_dataset` allows specifying Parquet compression, for example:
{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.dataset as ds

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

df = pa.Table.from_pandas(df)

ds.write_dataset(
    df,
    base_dir='test',
    format='parquet',
    file_options=ds.ParquetFileFormat().make_write_options(compression='snappy'))
{code}
However, the same approach (the `file_options` argument) does not work for 
Feather; the following code raises an error:
{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.dataset as ds

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

df = pa.Table.from_pandas(df)

ds.write_dataset(
    df,
    base_dir='test',
    format='feather',
    file_options=ds.FeatherFileFormat().make_write_options(compression='uncompressed'))
{code}
The error: `TypeError: FeatherFileFormat.make_write_options() takes no keyword arguments`

  was:
`ds.write_dataset` allows specifying Parquet compression, for example:
{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.dataset as ds
df = pd.DataFrame(
{'a': [1, 2, 3], 'b': [4, 5, 6]}
)
df = pa.Table.from_pandas(df)
ds.write_dataset(
    df,
    base_dir='test',
    format='parquet',
    file_options=ds.ParquetFileFormat().make_write_options(compression='snappy'))
{code}
However, the same approach (the `file_options` argument) does not work for 
Feather; the following code raises an error:

{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.dataset as ds

df = pd.DataFrame(

{'a': [1, 2, 3], 'b': [4, 5, 6]}

)
df = pa.Table.from_pandas(df)

ds.write_dataset(
    df,
    base_dir='test',
    format='feather',
    file_options=ds.FeatherFileFormat().make_write_options(compression='uncompressed'))
{code}

The error: `TypeError: FeatherFileFormat.make_write_options() takes no keyword arguments`


> [Python] `ds.write_dataset` doesn't allow feather compression
> -------------------------------------------------------------
>
>                 Key: ARROW-18370
>                 URL: https://issues.apache.org/jira/browse/ARROW-18370
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 10.0.0
>         Environment: Ubuntu 22.04
>            Reporter: Yu Zhu
>            Priority: Major
>
> `ds.write_dataset` allows specifying Parquet compression, for example:
> {code:python}
> import pandas as pd
> import pyarrow as pa
> import pyarrow.dataset as ds
> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
> df = pa.Table.from_pandas(df)
> ds.write_dataset(
>     df,
>     base_dir='test',
>     format='parquet',
>     file_options=ds.ParquetFileFormat().make_write_options(compression='snappy'))
> {code}
> However, the same approach (the `file_options` argument) does not work for 
> Feather; the following code raises an error:
> {code:python}
> import pandas as pd
> import pyarrow as pa
> import pyarrow.dataset as ds
> df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
> df = pa.Table.from_pandas(df)
> ds.write_dataset(
>     df,
>     base_dir='test',
>     format='feather',
>     file_options=ds.FeatherFileFormat().make_write_options(compression='uncompressed'))
> {code}
> The error: `TypeError: FeatherFileFormat.make_write_options() takes no keyword arguments`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
