jorisvandenbossche opened a new pull request #8912:
URL: https://github.com/apache/arrow/pull/8912


   The C++ `FileSystemDatasetFactory::Finish` method handles schema 
inference and validation through two options: `InspectOptions::fragments`, 
the number of fragments to use when inferring *or* validating the schema 
(default of 1), and `FinishOptions::validate_fragments`, which indicates 
whether to validate the specified schema (when it is not inferred). 
   
   For now, I decided to combine these into a single keyword on the Python 
side (`validate_schema`). This avoids adding two interdependent keywords, and 
makes some typical use cases easier to express (e.g. validating the specified 
schema against all fragments is now `validate_schema=True` instead of 
`validate_schema=True, fragments=-1`). On the other hand, it results in a 
single keyword that accepts either a boolean or an int, which is not very 
clean. So this is certainly up for discussion.
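
   To make the trade-off concrete, here is a minimal sketch of how a single 
`validate_schema` value could be mapped onto the two C++ options. The helper 
`resolve_validate_schema` and the `ResolvedOptions` tuple are hypothetical 
names used only for illustration, not the code in this PR, and the handling 
of `False` and of non-positive ints is an assumption.

```python
from typing import NamedTuple, Union


class ResolvedOptions(NamedTuple):
    """Hypothetical stand-in for the two C++ options described above."""
    fragments: int            # mirrors InspectOptions::fragments
    validate_fragments: bool  # mirrors FinishOptions::validate_fragments


ALL_FRAGMENTS = -1      # fragments=-1 means "use all fragments"
DEFAULT_FRAGMENTS = 1   # default number of fragments to inspect


def resolve_validate_schema(
    validate_schema: Union[bool, int] = False,
) -> ResolvedOptions:
    """Map one bool-or-int keyword onto (fragments, validate_fragments)."""
    # bool is a subclass of int in Python, so it must be checked first;
    # otherwise True would be treated as the integer 1.
    if isinstance(validate_schema, bool):
        if validate_schema:
            return ResolvedOptions(ALL_FRAGMENTS, True)
        # Assumption: False means "do not validate, keep the default".
        return ResolvedOptions(DEFAULT_FRAGMENTS, False)
    if isinstance(validate_schema, int):
        # Assumption: only positive counts are accepted as ints.
        if validate_schema < 1:
            raise ValueError("validate_schema must be a bool or a positive int")
        return ResolvedOptions(validate_schema, True)
    raise TypeError("validate_schema must be a bool or an int")


# Usage: validate against all fragments, or against only the first three.
all_frags = resolve_validate_schema(True)
first_three = resolve_validate_schema(3)
```

   The `bool`-before-`int` check is the part that makes a combined keyword 
slightly awkward: `True` and `1` compare equal in Python but would mean 
different things here ("all fragments" versus "one fragment").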
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

