jorisvandenbossche commented on a change in pull request #10224:
URL: https://github.com/apache/arrow/pull/10224#discussion_r625099784



##########
File path: python/pyarrow/dataset.py
##########
@@ -733,19 +733,17 @@ def write_dataset(data, base_dir, basename_template=None, format=None,
     """
     from pyarrow.fs import _resolve_filesystem_and_path
 
-    if isinstance(data, Dataset):
-        schema = schema or data.schema

Review comment:
       > Hmm, would the user ever want to pass a schema to force a cast? Though I suppose that's redundant (either configure the scanner or pass a scanner instead of a dataset).
   
   In that case, I think the user could have created the dataset with a specific (forced) schema: `ds.dataset(..., schema=schema)`.
   But once you have a dataset, it's not easy to change its schema. The `scanner` method doesn't take a `schema` keyword (although that may be something to add, to project to the specified schema).
   
   Yes, if we want to raise, we can indeed start with a warning.
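
A rough sketch of what warning first could look like. The helper name `check_schema_argument`, its parameters, the warning category, and the message text are all hypothetical and not part of pyarrow; this only illustrates the pattern.

```python
import warnings


def check_schema_argument(data_is_dataset, schema):
    """Hypothetical helper (not pyarrow API): warn when a schema is passed
    together with a Dataset, as a first step before raising an error."""
    if data_is_dataset and schema is not None:
        warnings.warn(
            "Passing 'schema' together with a Dataset is ignored and may "
            "raise an error in a future version.",
            FutureWarning,
            stacklevel=2,
        )


# Usage: triggers the warning only when both conditions hold.
check_schema_argument(data_is_dataset=False, schema="some schema")  # silent
```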





