[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #12811: ARROW-16122: [Python] Deprecate no-longer supported keywords in parquet.write_to_dataset

GitBox Fri, 15 Apr 2022 01:00:07 -0700


jorisvandenbossche commented on code in PR #12811:
URL: https://github.com/apache/arrow/pull/12811#discussion_r851129892



##########
python/pyarrow/tests/parquet/test_dataset.py:
##########
@@ -1290,7 +1290,7 @@ def _test_write_to_dataset_no_partitions(base_path,
     # Without partitions, append files to root_path
     n = 5
     for i in range(n):
-        pq.write_to_dataset(output_table, base_path,
+        pq.write_to_dataset(output_table, base_path, use_legacy_dataset=True,

Review Comment:
   Thinking more about this: if we switch the default, as we are now doing, I 
think we _should_ try to preserve the current behaviour of overwriting/adding 
data (otherwise it would be quite a breaking change for people using 
`pq.write_to_dataset` this way). We can still deprecate this behaviour and 
later move to the same default as the `dataset.write_dataset` implementation. 
   But that can be done at a later stage with a proper deprecation warning 
(e.g. detect whether the directory already exists and is not empty, and in 
that case indicate that this will start raising an error in the future).
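
   As a rough illustration of the check described above, something along these 
lines could work. This is only a sketch: the helper name 
`warn_if_directory_not_empty`, the choice of `FutureWarning`, and the message 
text are assumptions for illustration, not existing pyarrow code.

```python
import os
import warnings


def warn_if_directory_not_empty(root_path):
    """Hypothetical helper: warn if root_path exists and already has entries.

    Returns True if a warning was emitted, False otherwise.
    """
    if not os.path.isdir(root_path):
        # Nothing exists yet, so there is no existing data to overwrite/append.
        return False

    # Only look at the first entry; we do not need to list the whole directory.
    with os.scandir(root_path) as entries:
        has_entries = next(entries, None) is not None

    if has_entries:
        warnings.warn(
            "Writing to an existing, non-empty directory currently "
            "overwrites or adds data; this will start raising an error "
            "in the future.",
            FutureWarning,
            stacklevel=2,
        )
    return has_entries
```

   Such a check would presumably run at the start of the write, before any 
files are created, so that the current overwrite/append behaviour is kept 
while users are informed about the planned change.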



