[GitHub] [arrow] westonpace commented on a change in pull request #11632: ARROW-14620: [Python] Missing bindings for existing_data_behavior makes it impossible to maintain old behavior

GitBox Mon, 08 Nov 2021 10:40:25 -0800


westonpace commented on a change in pull request #11632:
URL: https://github.com/apache/arrow/pull/11632#discussion_r744995633




##########
File path: python/pyarrow/dataset.py
##########
@@ -798,6 +799,18 @@ def write_dataset(data, base_dir, basename_template=None, format=None,
 
             def file_visitor(written_file):
                 visited_paths.append(written_file.path)
+    existing_data_behavior : 'error' | 'overwrite' | 'delete_matching'

Review comment:
       Let's stick with `overwrite_or_ignore`. If we decide later that we need to
change it, that would be a fairly minor change, even if we wanted to keep
backwards compatibility with the old style. The R and Python dataset APIs are
already pretty different.

##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -3381,6 +3382,19 @@ def _filesystemdataset_write(
     c_options.partitioning = partitioning.unwrap()
     c_options.max_partitions = max_partitions
     c_options.basename_template = tobytes(basename_template)
+    if existing_data_behavior == 'error':
+        c_options.existing_data_behavior = ExistingDataBehavior_ERROR
+    elif existing_data_behavior == 'overwrite_or_ignore':
+        c_options.existing_data_behavior =\
+            ExistingDataBehavior_OVERWRITE_OR_IGNORE
+    elif existing_data_behavior == 'delete_matching':
+        c_options.existing_data_behavior = ExistingDataBehavior_DELETE_MATCHING
+    else:
+        raise ValueError(
+            ('existing_data_behavior must be one of error, ',
+             'overwrite_or_ignore or delete_matching')

Review comment:
       Good idea, added.
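
       For reference, a minimal pure-Python sketch of the same string-to-enum
validation as the hunk above. The `ExistingDataBehavior` enum class and the
`parse_existing_data_behavior` helper are hypothetical stand-ins for
illustration, not the names used in `_dataset.pyx`. Note that the quoted hunk
puts a comma between the two message strings inside parentheses, which builds
a tuple rather than a single string; the sketch relies on adjacent string
literals being concatenated instead.

```python
from enum import Enum


class ExistingDataBehavior(Enum):
    # Hypothetical stand-in for the C++ enum values referenced in the diff.
    ERROR = "error"
    OVERWRITE_OR_IGNORE = "overwrite_or_ignore"
    DELETE_MATCHING = "delete_matching"


def parse_existing_data_behavior(value):
    """Translate the user-facing string into an enum member (hypothetical helper)."""
    try:
        return ExistingDataBehavior(value)
    except ValueError:
        # Adjacent string literals concatenate into one message; separating
        # them with a comma inside parentheses would build a tuple instead.
        raise ValueError(
            "existing_data_behavior must be one of 'error', "
            "'overwrite_or_ignore' or 'delete_matching', "
            f"got {value!r}"
        ) from None
```

       With this shape, passing the spelling `'overwrite'` from the docstring
under review raises a `ValueError` whose message is a single string that lists
the accepted values.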




