amol- commented on a change in pull request #11008:
URL: https://github.com/apache/arrow/pull/11008#discussion_r709899978
##########
File path: python/pyarrow/dataset.py
##########
@@ -714,9 +729,12 @@ def write_dataset(data, base_dir, basename_template=None,
format=None,
and `format` is not specified, it defaults to the same format as the
specified FileSystemDataset. When writing a Table or RecordBatch, this
keyword is required.
- partitioning : Partitioning, optional
+ partitioning : Partitioning or list[str], optional
The partitioning scheme specified with the ``partitioning()``
- function.
+ function or as a list of field names.
+ partitioning_flavor : str, optional
Review comment:
The default behaviour is equal to providing `partitioning(pa.schema([]))`
(I haven't changed this).
I would gladly document it, but I'm unsure about what a partitioning with an
empty schema means.
From what I can see, it works by writing files without any directory structure.
What I'm not sure about is whether the data will ever be split into multiple
files, or whether we will always write only `part-0.parquet`, since without a
partitioning column there are no chunks to split on.
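
For reference, a minimal sketch of the behaviour I'm describing (this is my
reading of it, assuming a small in-memory table; the output file names such as
`part-0.parquet` come from the default `basename_template`, and the `out_*`
directory names are just placeholders):

```python
import pyarrow as pa
import pyarrow.dataset as ds

table = pa.table({"year": [2020, 2021, 2021], "value": [1.0, 2.0, 3.0]})

# Omitting `partitioning`: files land directly in the base directory,
# e.g. out_default/part-0.parquet with the default basename_template.
ds.write_dataset(table, "out_default", format="parquet")

# Passing a partitioning built from an empty schema: as far as I can tell,
# this behaves the same and no partition directories are created.
ds.write_dataset(
    table, "out_empty_schema", format="parquet",
    partitioning=ds.partitioning(pa.schema([])),
)

# With the keywords added in this PR, a list of field names plus a flavor
# can be passed instead of a Partitioning object (usage based on the
# docstring above):
ds.write_dataset(
    table, "out_hive", format="parquet",
    partitioning=["year"], partitioning_flavor="hive",
)
```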