[GitHub] [arrow-datafusion] houqp commented on pull request #972: Set target_partitions on table scan in physical planner

GitBox Sat, 18 Sep 2021 18:06:45 -0700


houqp commented on pull request #972:
URL: https://github.com/apache/arrow-datafusion/pull/972#issuecomment-922396736



   Sorry for the delay, still trying to catch up with all the interesting 
design discussions across different PRs. It looks like we all agree that it's 
better to drive the partitioning logic in table scan using 
`target_partition_size` (or `max_partition_size`) config.
   
   For this particular PR, I agree with @alamb that it is an improvement and 
doesn't make anything worse. I think the newly added ScanConfig also lays out a 
good foundation for us to add more scan config in the future including 
`target_partition_size` and potentially remove `target_partitions`.
   
   I also agree with @rdettai that we should avoid unnecessary API churn if 
possible. However, I do think the table `scan` method is the right place for us 
to pass in the `target_partition_size` config when we add it in the future. 
@rdettai do you envision us having to pass in the `target_partition_size` 
config at a different layer? If so, I can see the value of avoiding the API 
change in the TableProvider trait.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow-datafusion] houqp commented on pull request #972: Set target_partitions on table scan in physical planner

Reply via email to