Rachelint commented on issue #11451: URL: https://github.com/apache/datafusion/issues/11451#issuecomment-2227415311
This seems to be a good supplement to the `coop_budget` in `tokio`!

We are actually hitting a tail-latency problem in production: heavy queries block the scheduler and make the light ones time out... This feature may help a lot.

But I still don't quite understand `I think this is probably not a big issue if you are setting the partition parallelism to the number` mentioned above.

Assume a machine with 8 cores, and we set the parallelism to 8. The query is:

```sql
SELECT `c1`, COUNT(*)
FROM `test`
WHERE `time` >= '2024-07-12 16:51:24' AND `time` < '2024-07-12 17:51:24'
GROUP BY `c1`
```

It will be translated to a physical plan like:

```
AggregateExec: mode=FinalPartitioned, gby=[c1], aggr=[COUNT(UInt8(1))]
  CoalesceBatchesExec: target_batch_size=8192
    RepartitionExec: partitioning=Hash(c1, 8), input_partitions=8
      AggregateExec: mode=Partial, gby=[c1], aggr=[COUNT(UInt8(1))]
        ProjectionExec: expr=[c1]
          TableScan
```

It looks like this scenario could still occur?

I split the physical plan above into two stages, and assume that the first stage is IO bound and the second stage is CPU bound, so the second stage will block the tokio scheduler (just a simplifying assumption, it may not entirely reflect reality).

- The first stage

```
AggregateExec: mode=Partial, gby=[c1], aggr=[COUNT(UInt8(1))]
  ProjectionExec: expr=[c1]
    TableScan
```

- The second stage

```
AggregateExec: mode=FinalPartitioned, gby=[c1], aggr=[COUNT(UInt8(1))]
  CoalesceBatchesExec: target_batch_size=8192
    RepartitionExec: partitioning=Hash(c1, 8), input_partitions=8
```
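To make the concern about heavy queries starving light ones concrete, here is a toy model of a single worker thread running tasks round-robin. It is only a sketch: it does not use tokio or DataFusion APIs, and `Task`, `run`, `workload`, and the work-unit numbers are all made up for illustration. With `budget = None` a task never yields once it starts (the CPU-bound case I assumed above); with `Some(n)` it is forced to yield after `n` units of work, roughly in the spirit of a cooperative budget.

```rust
use std::collections::VecDeque;

/// A toy task that needs `remaining` units of CPU work to finish.
/// (Hypothetical type for illustration, not a tokio or DataFusion type.)
struct Task {
    name: &'static str,
    remaining: u32,
}

/// Run tasks round-robin on one simulated worker thread.
/// `budget` is the maximum work a task may do before it must yield;
/// `None` means a task runs to completion once it is polled.
/// Returns `(name, finish_time)` pairs in completion order.
fn run(tasks: Vec<Task>, budget: Option<u32>) -> Vec<(&'static str, u32)> {
    let mut queue: VecDeque<Task> = tasks.into();
    let mut clock = 0;
    let mut finished = Vec::new();
    while let Some(mut task) = queue.pop_front() {
        // Work done in this turn: the whole task, or at most one budget slice.
        // A zero budget is treated as 1 so the loop always makes progress.
        let slice = match budget {
            Some(b) => b.max(1).min(task.remaining),
            None => task.remaining,
        };
        clock += slice;
        task.remaining -= slice;
        if task.remaining == 0 {
            finished.push((task.name, clock));
        } else {
            // Budget exhausted: go to the back of the queue.
            queue.push_back(task);
        }
    }
    finished
}

/// One heavy query scheduled ahead of two light ones (made-up sizes).
fn workload() -> Vec<Task> {
    vec![
        Task { name: "heavy", remaining: 1000 },
        Task { name: "light-1", remaining: 10 },
        Task { name: "light-2", remaining: 10 },
    ]
}

fn main() {
    println!("no yielding: {:?}", run(workload(), None));
    println!("budget = 50: {:?}", run(workload(), Some(50)));
}
```

In this model the light tasks finish at 1010 and 1020 when the heavy task never yields, but at 60 and 70 with a budget of 50, while the heavy task only slips from 1000 to 1020. Real schedulers have multiple workers and work stealing, so the numbers only show the shape of the effect.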
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org