[
https://issues.apache.org/jira/browse/ARROW-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
François Chareyron updated ARROW-18225:
---------------------------------------
Description:
When calling {{write_metadata}}, {{kwargs}} can be used to pass a FileSystem to
the ParquetWriter. However, those {{kwargs}} are not forwarded to the later
{{read_metadata}} call, even though that function accepts a {{filesystem}}
argument.
This causes an error when trying to write metadata to an S3FileSystem, for
example.
{code:python}
def write_metadata(schema, where, metadata_collector=None, **kwargs):
    writer = ParquetWriter(where, schema, **kwargs)
    writer.close()
    if metadata_collector is not None:
        metadata = read_metadata(where)  # kwargs should be passed here
        for m in metadata_collector:
            metadata.append_row_groups(m)
        metadata.write_metadata_file(where)  # kwargs should be passed here
{code}
{code:python}
def read_metadata(where, memory_map=False, decryption_properties=None,
                  filesystem=None):
    ...
{code}
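The failure mode can be reproduced without pyarrow. The sketch below is only an illustration of the forwarding problem, not the pyarrow implementation: {{FakeFileSystem}}, {{write_file}}, {{write_metadata_buggy}} and {{write_metadata_fixed}} are hypothetical stand-ins, and the stand-in {{read_metadata}} only mimics the {{filesystem}} parameter shown above.

```python
# Stand-ins for illustration only; none of this is pyarrow code.


class FakeFileSystem:
    """Hypothetical in-memory filesystem: maps a path to a list of row groups."""

    def __init__(self):
        self.files = {}


LOCAL = FakeFileSystem()  # plays the role of the default (local) filesystem


def write_file(where, row_groups, filesystem=None):
    """Stand-in for writing a metadata file to the given filesystem."""
    fs = filesystem if filesystem is not None else LOCAL
    fs.files[where] = list(row_groups)


def read_metadata(where, filesystem=None):
    """Stand-in reader; falls back to LOCAL when no filesystem is given."""
    fs = filesystem if filesystem is not None else LOCAL
    if where not in fs.files:
        raise FileNotFoundError(where)
    return list(fs.files[where])


def write_metadata_buggy(where, metadata_collector=None, **kwargs):
    write_file(where, [], **kwargs)  # honours filesystem=...
    if metadata_collector is not None:
        # filesystem is dropped here, so the lookup happens on LOCAL
        metadata = read_metadata(where)
        for m in metadata_collector:
            metadata.extend(m)
        write_file(where, metadata)


def write_metadata_fixed(where, metadata_collector=None, **kwargs):
    filesystem = kwargs.get("filesystem")
    write_file(where, [], **kwargs)
    if metadata_collector is not None:
        # forward the same filesystem to both the read and the final write
        metadata = read_metadata(where, filesystem=filesystem)
        for m in metadata_collector:
            metadata.extend(m)
        write_file(where, metadata, filesystem=filesystem)
```

With a separate {{FakeFileSystem}} instance standing in for S3, the buggy variant raises {{FileNotFoundError}} because it looks for the file on the default filesystem, while the fixed variant reads back and rewrites the file where it was created.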
was:
When using {{{}write_metadata{}}}, {{kwargs }}can be used to pass a FileSystem
to a ParquetWriter. However, those {{kwargs }}are not passed to
{{read_metadata}} later on despite the function accepting a filesystem argument.
This creates an error when trying to write metadata on a S3FileSystem for
example.
{code:python}
def write_metadata(schema, where, metadata_collector=None, **kwargs):
    writer = ParquetWriter(where, schema, **kwargs)
    writer.close()
    if metadata_collector is not None:
        metadata = read_metadata(where)  # kwargs should be passed here
        for m in metadata_collector:
            metadata.append_row_groups(m)
        metadata.write_metadata_file(where)  # kwargs should be passed here
{code}
{code:python}
def read_metadata(where, memory_map=False, decryption_properties=None,
                  filesystem=None):
    ...
{code}
> [Python] write_metadata does not fully use **kwargs
> ---------------------------------------------------
>
> Key: ARROW-18225
> URL: https://issues.apache.org/jira/browse/ARROW-18225
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: François Chareyron
> Priority: Blocker