yjshen edited a comment on pull request #972:
URL: https://github.com/apache/arrow-datafusion/pull/972#issuecomment-919153521


   Sorry for not participating in the discussion on the relationship among 
parallelism, number of partitions, target concurrencies, target partition size, 
etc., at an early stage. *I was intended to keeping the original term 
`max_concurrency` in DataFusion from my first review.* 
   
   Since we are getting more on a generalized design discussion here, I want to 
share my understandings:
   
   My core idea is to partition the dataset referred to in each query based on 
**required size in bytes**, i.e., our users should set the desired partition 
size in bytes (or tune the partition size parameter step by step). Or in other 
words, users choose [execution 
granularity](https://en.wikipedia.org/wiki/Granularity_(parallel_computing)) 
based on the amount of work performed by each task.  This has several 
implications:
   -  We engine could adaptively change table processing speed by schedule more 
or fewer tasks to run concurrently, based on the current payload of the system 
or a more complex operator DAG scheduling policy. 
   - It should be CBO's responsibility to decide the partition number instead 
of at the level of `TableProvider`. I suppose the partition number for scanning 
each table should depend highly on the real data size involved in a query. It's 
meaningless to divide a table with the same partition number for both `select 
count(distinct a) from table1` and `select a, b, c .... f from table1`.
   - Users could provide more cores when anticipating an execution speed up 
during a period of time, known as elasticity, I think. Ballista is more likely 
to get this benefit.
   
   Therefore, I like to have`SourceConfig` contain `target_partition_size` and 
`target_partition_num`, which CBO can mutate or replace later in a *ScanExec 
node. And always prefer `target_partition_size` when statistics are available 
when stored in a catalog or lightweight enough to get. `target_partition_num` 
could be left out for num_cpus, for example. When the input table is small 
enough, there is no bother to care about the partition strategies.
   
   Besides, I'm aware that the scope is too much for this PR. Stop where it's 
appropriate, and we could move sophisticated later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to