AlenkaF commented on code in PR #12811:
URL: https://github.com/apache/arrow/pull/12811#discussion_r845909851


##########
python/pyarrow/tests/parquet/test_dataset.py:
##########
@@ -1290,7 +1290,7 @@ def _test_write_to_dataset_no_partitions(base_path,
     # Without partitions, append files to root_path
     n = 5
     for i in range(n):
-        pq.write_to_dataset(output_table, base_path,
+        pq.write_to_dataset(output_table, base_path, use_legacy_dataset=True,

Review Comment:
   Hm, thinking aloud: `existing_data_behavior` controls how the dataset writer 
handles data that already exists. If I implement a unique way of writing parquet 
files when using the new API in `write_to_dataset`, I will also have to set 
`existing_data_behavior` to `overwrite_or_ignore`. That would then cause 
trouble when exposing the same parameter in 
https://issues.apache.org/jira/browse/ARROW-15757.
   
   I could add a check for whether the parameter was explicitly specified, but 
I am not sure if there will be additional complications.
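   To sketch the idea of "check if the parameter is specified or not": a common Python pattern is a private sentinel default, which distinguishes "the caller never passed `existing_data_behavior`" from any value they might legitimately pass. This is only an illustrative sketch, not pyarrow's actual implementation; the function body here is a stand-in for the real write logic.

```python
# Hypothetical sketch of the "was the parameter specified?" check.
# _UNSET is a sentinel: unlike a string default, it can never collide
# with a value the caller deliberately passes.
_UNSET = object()

def write_to_dataset(table, root_path, existing_data_behavior=_UNSET):
    if existing_data_behavior is _UNSET:
        # Caller did not specify it: fall back to the behavior needed
        # for appending uniquely named files to an existing dataset.
        existing_data_behavior = "overwrite_or_ignore"
    # Stand-in for the real write call; return the resolved value
    # so the resolution logic is observable.
    return existing_data_behavior

# Unspecified -> internal default is applied:
assert write_to_dataset(None, "/tmp/ds") == "overwrite_or_ignore"
# Explicitly specified -> the caller's value wins:
assert write_to_dataset(None, "/tmp/ds", existing_data_behavior="error") == "error"
```

   The benefit over a plain `existing_data_behavior="overwrite_or_ignore"` default is that, once the parameter is later exposed publicly (ARROW-15757), the function can still tell whether the user set it and avoid silently overriding their choice.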



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]