scsmithr commented on issue #7001:
URL: 
https://github.com/apache/arrow-datafusion/issues/7001#issuecomment-1675625184

   With the current architecture, I think the easiest thing to do here is 
probably to not try to have a one-to-one mapping of task to core, and lean into 
tokio for scheduling to try to get good resource utilization.
   
   I did a quick experiment for this by increasing the number of partitions to 
use in the repartition optimization rule. Currently it gets set to 
`target_partitions`, and I just multiplied it by 3.
   
   The diff:
   
   ```diff
   diff --git a/datafusion/core/src/physical_optimizer/repartition.rs 
b/datafusion/core/src/physical_optimizer/repartition.rs
   index aa48fd77a..b4ef8c3a2 100644
   --- a/datafusion/core/src/physical_optimizer/repartition.rs
   +++ b/datafusion/core/src/physical_optimizer/repartition.rs
   @@ -288,7 +288,7 @@ impl PhysicalOptimizerRule for Repartition {
            plan: Arc<dyn ExecutionPlan>,
            config: &ConfigOptions,
        ) -> Result<Arc<dyn ExecutionPlan>> {
   -        let target_partitions = config.execution.target_partitions;
   +        let target_partitions = config.execution.target_partitions * 3;
            let enabled = config.optimizer.enable_round_robin_repartition;
            let repartition_file_scans = 
config.optimizer.repartition_file_scans;
            let repartition_file_min_size = 
config.optimizer.repartition_file_min_size;
   ```
   
   And some benchmarks using TPCH SF=10
   
   Macbook air m2:
   ```
   Comparing main and three
   --------------------
   Benchmark tpch.json
   --------------------
   ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   ┃ Query        ┃      main ┃     three ┃        Change ┃
   ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   │ QQuery 1     │ 3615.39ms │ 1745.94ms │ +2.07x faster │
   │ QQuery 2     │  447.68ms │  450.67ms │     no change │
   │ QQuery 3     │ 1320.39ms │  730.22ms │ +1.81x faster │
   │ QQuery 4     │  726.73ms │  394.54ms │ +1.84x faster │
   │ QQuery 5     │ 1619.80ms │ 1064.71ms │ +1.52x faster │
   │ QQuery 6     │  741.31ms │  371.64ms │ +1.99x faster │
   │ QQuery 7     │ 3269.57ms │ 2528.63ms │ +1.29x faster │
   │ QQuery 8     │ 1972.34ms │ 1200.81ms │ +1.64x faster │
   │ QQuery 9     │ 3032.73ms │ 2252.82ms │ +1.35x faster │
   │ QQuery 10    │ 2045.88ms │ 1381.35ms │ +1.48x faster │
   │ QQuery 11    │  466.09ms │  484.14ms │     no change │
   │ QQuery 12    │ 1400.27ms │  652.58ms │ +2.15x faster │
   │ QQuery 13    │ 2088.00ms │ 1181.92ms │ +1.77x faster │
   │ QQuery 14    │ 1047.22ms │  551.92ms │ +1.90x faster │
   │ QQuery 15    │  802.09ms │  650.86ms │ +1.23x faster │
   │ QQuery 16    │  429.19ms │  418.61ms │     no change │
   │ QQuery 17    │ 4392.38ms │ 4516.79ms │     no change │
   │ QQuery 18    │ 7849.27ms │ 7245.33ms │ +1.08x faster │
   │ QQuery 19    │ 2180.33ms │ 1012.43ms │ +2.15x faster │
   │ QQuery 20    │ 1657.01ms │ 1110.12ms │ +1.49x faster │
   │ QQuery 21    │ 3867.34ms │ 3671.96ms │ +1.05x faster │
   │ QQuery 22    │  490.25ms │  443.54ms │ +1.11x faster │
   └──────────────┴───────────┴───────────┴───────────────┘
   ```
   
   GCP n1 vm (8 cores/16 threads, 60G memory):
   ```
   Comparing main and three
   --------------------
   Benchmark tpch.json
   --------------------
   ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   ┃ Query        ┃       main ┃      three ┃        Change ┃
   ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   │ QQuery 1     │  8947.17ms │  3274.31ms │ +2.73x faster │
   │ QQuery 2     │  1562.68ms │  1519.65ms │     no change │
   │ QQuery 3     │  3839.36ms │  1758.32ms │ +2.18x faster │
   │ QQuery 4     │  1995.52ms │   967.04ms │ +2.06x faster │
   │ QQuery 5     │  4906.24ms │  2933.39ms │ +1.67x faster │
   │ QQuery 6     │  1941.19ms │   710.94ms │ +2.73x faster │
   │ QQuery 7     │  9391.94ms │  6423.27ms │ +1.46x faster │
   │ QQuery 8     │  5503.67ms │  2735.65ms │ +2.01x faster │
   │ QQuery 9     │  8840.15ms │  6129.22ms │ +1.44x faster │
   │ QQuery 10    │  5807.23ms │  3804.27ms │ +1.53x faster │
   │ QQuery 11    │  1699.21ms │  1744.50ms │     no change │
   │ QQuery 12    │  2993.60ms │  1277.92ms │ +2.34x faster │
   │ QQuery 13    │  4907.55ms │  2585.08ms │ +1.90x faster │
   │ QQuery 14    │  2633.90ms │  1114.95ms │ +2.36x faster │
   │ QQuery 15    │  2085.41ms │  1176.97ms │ +1.77x faster │
   │ QQuery 16    │  1683.76ms │  1572.89ms │ +1.07x faster │
   │ QQuery 17    │  7322.44ms │  6412.73ms │ +1.14x faster │
   │ QQuery 18    │ 15950.64ms │ 16131.98ms │     no change │
   │ QQuery 19    │  4936.98ms │  1834.62ms │ +2.69x faster │
   │ QQuery 20    │  5126.77ms │  3039.32ms │ +1.69x faster │
   │ QQuery 21    │ 14114.41ms │  9196.05ms │ +1.53x faster │
   │ QQuery 22    │  1449.16ms │  1206.74ms │ +1.20x faster │
   └──────────────┴────────────┴────────────┴───────────────┘
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to