westonpace commented on a change in pull request #10628:
URL: https://github.com/apache/arrow/pull/10628#discussion_r666601945
##########
File path: python/pyarrow/dataset.py
##########
@@ -731,6 +731,12 @@ def write_dataset(data, base_dir, basename_template=None,
format=None,
(e.g. S3)
max_partitions : int, default 1024
Maximum number of partitions any batch may be written into.
+ file_visitor : Function
+ If set, this function will be called with a WrittenFile instance
+ for each file created during the call. This object will contain
+ the path and (if the dataset is a parquet dataset) the parquet
Review comment:
Ah, I thought WrittenFile was exposed. I've improved the docstring
here. For my education, what is the concern with users relying on this class?
It seems less brittle than users relying on a snippet of documentation.
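
To make the question concrete, here is a rough sketch of the callback pattern the docstring describes. It is not the actual pyarrow implementation. `StandInWrittenFile` and `fake_write` are hypothetical stand-ins so the snippet runs without pyarrow, and the attribute names `path` and `metadata` are assumptions based on the docstring wording. The visitor only reads an attribute and never names the class.

```python
from collections import namedtuple

# Hypothetical stand-in for the WrittenFile object discussed above.
# The attribute names `path` and `metadata` are assumptions for illustration.
StandInWrittenFile = namedtuple("StandInWrittenFile", ["path", "metadata"])


def make_path_collector(visited):
    """Return a file_visitor-style callback that records each file's path."""
    def file_visitor(written_file):
        # Duck-typed access: reads an attribute, no isinstance check on the class.
        visited.append(written_file.path)
    return file_visitor


def fake_write(paths, file_visitor=None):
    """Hypothetical writer that calls the visitor once per 'created' file."""
    for p in paths:
        if file_visitor is not None:
            file_visitor(StandInWrittenFile(path=p, metadata=None))


visited_paths = []
fake_write(
    ["part-0.parquet", "part-1.parquet"],
    file_visitor=make_path_collector(visited_paths),
)
print(visited_paths)
```

With the real API, the same callback would presumably be passed as `file_visitor=` to `write_dataset`. Since the visitor touches only attributes, it depends on the documented attribute names and not on the class being importable.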