[GitHub] [arrow-datafusion] rdettai commented on pull request #972: Set target_partitions on table scan in physical planner

GitBox Mon, 20 Sep 2021 01:21:22 -0700


rdettai commented on pull request #972:
URL: https://github.com/apache/arrow-datafusion/pull/972#issuecomment-922721020



   Thanks for your feedback. 
   - @alamb My personal feeling is that this is not the right direction to 
take. My vote would be a -0.9 ([Apache 
style](https://www.apache.org/foundation/voting.html#expressing-votes-1-0-1-and-fractions)).
 
   - @houqp I would prefer to have all the configuration at the 
`TableProvider` constructor level. Different datasources differ in the kinds 
of configuration they support and in how they behave with it. 
   
   Having specific configurations for each kind of datasource would also have 
another great benefit: we could provide much better API documentation. 
Currently, someone who sets `ExecutionConfig.target_partition_size` needs to 
check separate documentation (or even read the code) to know how each 
datasource uses this setting. With something like 
`ExecutionConfig.parquet.target_partition_size`, we can document directly how 
it works in the Parquet `TableProvider`. For instance, for 
`target_partition_size`, we could document which size we are referring to 
(before or after decompression, etc.).
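
   To make the idea concrete, here is a minimal sketch of what namespaced 
configuration could look like. Everything in it is an assumption for 
illustration only: `ParquetScanOptions`, `CsvScanOptions` and 
`parquet_partition_count` are hypothetical names that do not exist in 
DataFusion, the `ExecutionConfig` below is a stand-in for the real struct, and 
the default sizes are arbitrary.

```rust
/// Hypothetical Parquet-specific scan options (not a DataFusion API).
#[derive(Debug, Clone, PartialEq)]
pub struct ParquetScanOptions {
    /// Target size of each partition, in bytes of data as stored on disk,
    /// i.e. *before* decompression.
    pub target_partition_size: u64,
}

impl Default for ParquetScanOptions {
    fn default() -> Self {
        // Arbitrary illustrative default: 128 MiB.
        Self { target_partition_size: 128 * 1024 * 1024 }
    }
}

/// Hypothetical CSV-specific scan options (not a DataFusion API).
#[derive(Debug, Clone, PartialEq)]
pub struct CsvScanOptions {
    /// Target size of each partition, in bytes of raw file text.
    pub target_partition_size: u64,
}

impl Default for CsvScanOptions {
    fn default() -> Self {
        // Arbitrary illustrative default: 64 MiB.
        Self { target_partition_size: 64 * 1024 * 1024 }
    }
}

/// Stand-in for an execution config that namespaces options per datasource,
/// so each field can be documented next to the provider that reads it.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ExecutionConfig {
    pub parquet: ParquetScanOptions,
    pub csv: CsvScanOptions,
}

/// How a Parquet provider might derive a partition count from its own
/// options: ceiling division of the on-disk size, with at least 1 partition.
pub fn parquet_partition_count(opts: &ParquetScanOptions, compressed_bytes: u64) -> u64 {
    // Guard against a zero target size to avoid dividing by zero.
    let size = opts.target_partition_size.max(1);
    let full = compressed_bytes / size;
    let partial = u64::from(compressed_bytes % size != 0);
    (full + partial).max(1)
}
```

   With this layout, the doc comment on 
`ParquetScanOptions::target_partition_size` is the single place that states 
which size is meant, and the CSV option can document different semantics 
without ambiguity.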


