[GitHub] [arrow] westonpace commented on a change in pull request #10628: ARROW-12364: [Python] [Dataset] Add metadata_collector option to ds.write_dataset()

GitBox Wed, 07 Jul 2021 03:31:53 -0700


westonpace commented on a change in pull request #10628:
URL: https://github.com/apache/arrow/pull/10628#discussion_r665248325




##########
File path: python/pyarrow/dataset.py
##########
@@ -731,6 +731,9 @@ def write_dataset(data, base_dir, basename_template=None, format=None,
         (e.g. S3)
     max_partitions : int, default 1024
         Maximum number of partitions any batch may be written into.
+    file_visitor : Function
+        If set, this function will be called with a WrittenFile instance
+        for each file created during the call.

Review comment:
       I added a bit more detail as suggested.  I added the bit about the 
parquet metadata and the written file path to the WrittenFile.metadata 
docstring.
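
       For illustration only, a visitor along the lines the docstring 
describes might look like the sketch below.  `WrittenFileCollector` and 
`write_and_collect` are hypothetical names, not part of this PR.  The sketch 
assumes each WrittenFile exposes `path` and `metadata` attributes, that 
`metadata` is None for non-parquet formats, and that `file_visitor` accepts 
any callable.

```python
from types import SimpleNamespace


class WrittenFileCollector:
    """Hypothetical helper: records every file reported to the visitor."""

    def __init__(self):
        self.paths = []
        self.metadata = []

    def __call__(self, written_file):
        # Assumption: the WrittenFile instance has `path` and `metadata`.
        self.paths.append(written_file.path)
        # Assumption: `metadata` is None when the format is not parquet.
        if written_file.metadata is not None:
            self.metadata.append(written_file.metadata)


def write_and_collect(table, base_dir):
    """Write `table` as a parquet dataset and return the filled collector."""
    # Imported lazily so the collector can be exercised without pyarrow.
    import pyarrow.dataset as ds

    collector = WrittenFileCollector()
    ds.write_dataset(table, base_dir, format="parquet",
                     file_visitor=collector)
    return collector


# Exercise the collector with a stand-in object instead of a real write.
demo = WrittenFileCollector()
demo(SimpleNamespace(path="out/part-0.parquet", metadata=None))
```

       Keeping the collector as a plain callable means it can be checked 
without writing any files; the collected paths could later feed a 
`_metadata` file if that is the goal.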




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
