Dandandan commented on PR #14411:
URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2661425398

   > > I ran some tests yesterday and I can confirm the runtime improvements. I do see higher memory usage, however, especially with some queries (TPC-H Query 18, I believe), than when using round-robin repartitioning. Are there ways to bring it down (e.g. bounded channels or otherwise)?
   > 
   > I tried to avoid using `yield_now` when waiting for the child operator data; this should lower memory usage. Here is the benchmark after adopting [this approach](https://github.com/apache/datafusion/pull/13707):
   > 
   > The performance decreased in many cases @Dandandan
   > 
   > ```
   > --------------------
   > Benchmark clickbench_partitioned.json
   > --------------------
   > ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   > ┃ Query        ┃ on-demand-repartition-with-config ┃ on-demand-not-always-add-roundrobin ┃        Change ┃
   > ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   > │ QQuery 0     │                            1.36ms │                              1.44ms │  1.05x slower │
   > │ QQuery 1     │                           30.28ms │                             21.71ms │ +1.39x faster │
   > │ QQuery 2     │                           73.36ms │                             64.39ms │ +1.14x faster │
   > │ QQuery 3     │                           64.40ms │                             52.36ms │ +1.23x faster │
   > │ QQuery 4     │                          512.46ms │                            486.49ms │ +1.05x faster │
   > │ QQuery 5     │                          571.75ms │                            539.88ms │ +1.06x faster │
   > │ QQuery 6     │                           31.22ms │                             21.57ms │ +1.45x faster │
   > │ QQuery 7     │                           33.53ms │                             25.52ms │ +1.31x faster │
   > │ QQuery 8     │                          568.44ms │                            532.79ms │ +1.07x faster │
   > │ QQuery 9     │                          758.40ms │                            766.04ms │     no change │
   > │ QQuery 10    │                          180.18ms │                            170.90ms │ +1.05x faster │
   > │ QQuery 11    │                          202.26ms │                            184.72ms │ +1.09x faster │
   > │ QQuery 12    │                          601.67ms │                            578.79ms │     no change │
   > │ QQuery 13    │                          832.37ms │                            802.30ms │     no change │
   > │ QQuery 14    │                          581.32ms │                            528.55ms │ +1.10x faster │
   > │ QQuery 15    │                          625.30ms │                            590.98ms │ +1.06x faster │
   > │ QQuery 16    │                         1362.52ms │                           1162.10ms │ +1.17x faster │
   > │ QQuery 17    │                         1258.20ms │                           1107.68ms │ +1.14x faster │
   > │ QQuery 18    │                         3628.57ms │                           3589.80ms │     no change │
   > │ QQuery 19    │                           59.32ms │                             48.46ms │ +1.22x faster │
   > │ QQuery 20    │                          884.00ms │                            817.40ms │ +1.08x faster │
   > │ QQuery 21    │                         1066.61ms │                           1044.21ms │     no change │
   > │ QQuery 22    │                         1883.62ms │                           2261.84ms │  1.20x slower │
   > │ QQuery 23    │                         6562.12ms │                           6412.42ms │     no change │
   > │ QQuery 24    │                          337.47ms │                            329.45ms │     no change │
   > │ QQuery 25    │                          263.68ms │                            284.84ms │  1.08x slower │
   > │ QQuery 26    │                          366.64ms │                            359.23ms │     no change │
   > │ QQuery 27    │                         1185.09ms │                           1180.38ms │     no change │
   > │ QQuery 28    │                         8860.10ms │                           9427.73ms │  1.06x slower │
   > │ QQuery 29    │                          441.28ms │                            466.33ms │  1.06x slower │
   > │ QQuery 30    │                          625.09ms │                            585.79ms │ +1.07x faster │
   > │ QQuery 31    │                          552.68ms │                            587.15ms │  1.06x slower │
   > │ QQuery 32    │                         4358.41ms │                           4224.30ms │     no change │
   > │ QQuery 33    │                         6845.44ms │                           4678.45ms │ +1.46x faster │
   > │ QQuery 34    │                         7866.05ms │                           4627.91ms │ +1.70x faster │
   > │ QQuery 35    │                          841.92ms │                            802.52ms │     no change │
   > │ QQuery 36    │                           80.70ms │                            103.95ms │  1.29x slower │
   > │ QQuery 37    │                           33.40ms │                             48.88ms │  1.46x slower │
   > │ QQuery 38    │                           68.14ms │                             70.24ms │     no change │
   > │ QQuery 39    │                          144.39ms │                            188.87ms │  1.31x slower │
   > │ QQuery 40    │                           20.52ms │                             22.32ms │  1.09x slower │
   > │ QQuery 41    │                           20.79ms │                             20.35ms │     no change │
   > │ QQuery 42    │                           17.83ms │                             27.16ms │  1.52x slower │
   > └──────────────┴───────────────────────────────────┴─────────────────────────────────────┴───────────────┘
   > ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
   > ┃ Benchmark Summary                                  ┃            ┃
   > ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
   > │ Total Time (on-demand-repartition-with-config)     │ 55302.89ms │
   > │ Total Time (on-demand-not-always-add-roundrobin)   │ 49848.22ms │
   > │ Average Time (on-demand-repartition-with-config)   │  1286.11ms │
   > │ Average Time (on-demand-not-always-add-roundrobin) │  1159.26ms │
   > │ Queries Faster                                     │         19 │
   > │ Queries Slower                                     │         11 │
   > │ Queries with No Change                             │         13 │
   > └────────────────────────────────────────────────────┴────────────┘
   > --------------------
   > Benchmark tpch_mem_sf1.json
   > --------------------
   > ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   > ┃ Query        ┃ on-demand-repartition-with-config ┃ on-demand-not-always-add-roundrobin ┃        Change ┃
   > ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   > │ QQuery 1     │                           85.39ms │                             71.16ms │ +1.20x faster │
   > │ QQuery 2     │                           14.22ms │                             13.07ms │ +1.09x faster │
   > │ QQuery 3     │                           24.57ms │                             22.43ms │ +1.10x faster │
   > │ QQuery 4     │                           15.57ms │                             11.08ms │ +1.40x faster │
   > │ QQuery 5     │                           41.50ms │                             34.96ms │ +1.19x faster │
   > │ QQuery 6     │                            4.82ms │                              4.25ms │ +1.14x faster │
   > │ QQuery 7     │                           74.88ms │                             66.79ms │ +1.12x faster │
   > │ QQuery 8     │                           17.29ms │                             15.53ms │ +1.11x faster │
   > │ QQuery 9     │                           41.62ms │                             37.89ms │ +1.10x faster │
   > │ QQuery 10    │                           35.49ms │                             32.30ms │ +1.10x faster │
   > │ QQuery 11    │                            6.85ms │                              5.75ms │ +1.19x faster │
   > │ QQuery 12    │                           24.91ms │                             20.74ms │ +1.20x faster │
   > │ QQuery 13    │                           17.21ms │                             16.67ms │     no change │
   > │ QQuery 14    │                            5.16ms │                              5.06ms │     no change │
   > │ QQuery 15    │                           12.00ms │                             11.47ms │     no change │
   > │ QQuery 16    │                           12.75ms │                             12.75ms │     no change │
   > │ QQuery 17    │                           57.71ms │                             56.38ms │     no change │
   > │ QQuery 18    │                          124.63ms │                            121.32ms │     no change │
   > │ QQuery 19    │                           24.50ms │                             24.22ms │     no change │
   > │ QQuery 20    │                           20.86ms │                             20.35ms │     no change │
   > │ QQuery 21    │                           86.02ms │                             84.99ms │     no change │
   > │ QQuery 22    │                           18.71ms │                             18.49ms │     no change │
   > └──────────────┴───────────────────────────────────┴─────────────────────────────────────┴───────────────┘
   > ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
   > ┃ Benchmark Summary                                  ┃          ┃
   > ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
   > │ Total Time (on-demand-repartition-with-config)     │ 766.66ms │
   > │ Total Time (on-demand-not-always-add-roundrobin)   │ 707.64ms │
   > │ Average Time (on-demand-repartition-with-config)   │  34.85ms │
   > │ Average Time (on-demand-not-always-add-roundrobin) │  32.17ms │
   > │ Queries Faster                                     │       12 │
   > │ Queries Slower                                     │        0 │
   > │ Queries with No Change                             │       10 │
   > └────────────────────────────────────────────────────┴──────────┘
   > --------------------
   > Benchmark tpch_mem_sf10.json
   > --------------------
   > ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
   > ┃ Query        ┃ on-demand-repartition-with-config ┃ on-demand-not-always-add-roundrobin ┃         Change ┃
   > ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
   > │ QQuery 1     │                         1335.99ms │                           1342.40ms │      no change │
   > │ QQuery 2     │                          121.22ms │                            131.85ms │   1.09x slower │
   > │ QQuery 3     │                          250.98ms │                            286.30ms │   1.14x slower │
   > │ QQuery 4     │                          125.97ms │                            121.37ms │      no change │
   > │ QQuery 5     │                          766.43ms │                            661.07ms │  +1.16x faster │
   > │ QQuery 6     │                          474.20ms │                            317.05ms │  +1.50x faster │
   > │ QQuery 7     │                         1190.87ms │                           1556.33ms │   1.31x slower │
   > │ QQuery 8     │                          584.23ms │                            743.68ms │   1.27x slower │
   > │ QQuery 9     │                         1150.48ms │                           1581.38ms │   1.37x slower │
   > │ QQuery 10    │                          898.01ms │                            875.53ms │      no change │
   > │ QQuery 11    │                          113.16ms │                            115.31ms │      no change │
   > │ QQuery 12    │                          321.89ms │                            665.48ms │   2.07x slower │
   > │ QQuery 13    │                          340.09ms │                            350.92ms │      no change │
   > │ QQuery 14    │                          566.53ms │                             49.00ms │ +11.56x faster │
   > │ QQuery 15    │                          123.92ms │                            159.39ms │   1.29x slower │
   > │ QQuery 16    │                           93.89ms │                            104.72ms │   1.12x slower │
   > │ QQuery 17    │                          912.10ms │                            842.42ms │  +1.08x faster │
   > │ QQuery 18    │                         4680.46ms │                           4226.43ms │  +1.11x faster │
   > │ QQuery 19    │                          857.78ms │                            825.55ms │      no change │
   > │ QQuery 20    │                          241.99ms │                            353.86ms │   1.46x slower │
   > │ QQuery 21    │                         1915.38ms │                           1949.02ms │      no change │
   > │ QQuery 22    │                           90.42ms │                             98.60ms │   1.09x slower │
   > └──────────────┴───────────────────────────────────┴─────────────────────────────────────┴────────────────┘
   > ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
   > ┃ Benchmark Summary                                  ┃            ┃
   > ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
   > │ Total Time (on-demand-repartition-with-config)     │ 17155.98ms │
   > │ Total Time (on-demand-not-always-add-roundrobin)   │ 17357.64ms │
   > │ Average Time (on-demand-repartition-with-config)   │   779.82ms │
   > │ Average Time (on-demand-not-always-add-roundrobin) │   788.98ms │
   > │ Queries Faster                                     │          5 │
   > │ Queries Slower                                     │         10 │
   > │ Queries with No Change                             │          7 │
   > └────────────────────────────────────────────────────┴────────────┘
   > --------------------
   > Benchmark tpch_sf1.json
   > --------------------
   > ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   > ┃ Query        ┃ on-demand-repartition-with-config ┃ on-demand-not-always-add-roundrobin ┃        Change ┃
   > ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   > │ QQuery 1     │                          106.51ms │                             95.95ms │ +1.11x faster │
   > │ QQuery 2     │                           20.61ms │                             19.91ms │     no change │
   > │ QQuery 3     │                           43.59ms │                             36.34ms │ +1.20x faster │
   > │ QQuery 4     │                           26.63ms │                             22.75ms │ +1.17x faster │
   > │ QQuery 5     │                           66.74ms │                             54.61ms │ +1.22x faster │
   > │ QQuery 6     │                           20.47ms │                             17.84ms │ +1.15x faster │
   > │ QQuery 7     │                           84.02ms │                             74.00ms │ +1.14x faster │
   > │ QQuery 8     │                           54.67ms │                             49.70ms │ +1.10x faster │
   > │ QQuery 9     │                           78.47ms │                             66.34ms │ +1.18x faster │
   > │ QQuery 10    │                           67.30ms │                             59.63ms │ +1.13x faster │
   > │ QQuery 11    │                           16.11ms │                             14.86ms │ +1.08x faster │
   > │ QQuery 12    │                           42.73ms │                             33.38ms │ +1.28x faster │
   > │ QQuery 13    │                           38.63ms │                             32.24ms │ +1.20x faster │
   > │ QQuery 14    │                           31.97ms │                             29.82ms │ +1.07x faster │
   > │ QQuery 15    │                           47.89ms │                             42.21ms │ +1.13x faster │
   > │ QQuery 16    │                           16.23ms │                             14.64ms │ +1.11x faster │
   > │ QQuery 17    │                          103.78ms │                             95.82ms │ +1.08x faster │
   > │ QQuery 18    │                          134.12ms │                            117.78ms │ +1.14x faster │
   > │ QQuery 19    │                           52.97ms │                             48.17ms │ +1.10x faster │
   > │ QQuery 20    │                           46.33ms │                             39.97ms │ +1.16x faster │
   > │ QQuery 21    │                          110.85ms │                             93.35ms │ +1.19x faster │
   > │ QQuery 22    │                           18.11ms │                             16.01ms │ +1.13x faster │
   > └──────────────┴───────────────────────────────────┴─────────────────────────────────────┴───────────────┘
   > ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
   > ┃ Benchmark Summary                                  ┃           ┃
   > ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
   > │ Total Time (on-demand-repartition-with-config)     │ 1228.72ms │
   > │ Total Time (on-demand-not-always-add-roundrobin)   │ 1075.33ms │
   > │ Average Time (on-demand-repartition-with-config)   │   55.85ms │
   > │ Average Time (on-demand-not-always-add-roundrobin) │   48.88ms │
   > │ Queries Faster                                     │        21 │
   > │ Queries Slower                                     │         0 │
   > │ Queries with No Change                             │         1 │
   > └────────────────────────────────────────────────────┴───────────┘
   > --------------------
   > Benchmark tpch_sf10.json
   > --------------------
   > ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   > ┃ Query        ┃ on-demand-repartition-with-config ┃ on-demand-not-always-add-roundrobin ┃        Change ┃
   > ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   > │ QQuery 1     │                          843.22ms │                            943.27ms │  1.12x slower │
   > │ QQuery 2     │                          123.24ms │                            136.28ms │  1.11x slower │
   > │ QQuery 3     │                          407.30ms │                            446.22ms │  1.10x slower │
   > │ QQuery 4     │                          198.17ms │                            231.51ms │  1.17x slower │
   > │ QQuery 5     │                          604.48ms │                            671.58ms │  1.11x slower │
   > │ QQuery 6     │                          136.38ms │                            158.40ms │  1.16x slower │
   > │ QQuery 7     │                          887.74ms │                            967.93ms │  1.09x slower │
   > │ QQuery 8     │                          628.50ms │                            698.72ms │  1.11x slower │
   > │ QQuery 9     │                         1009.86ms │                           1097.19ms │  1.09x slower │
   > │ QQuery 10    │                          570.20ms │                            559.98ms │     no change │
   > │ QQuery 11    │                           90.48ms │                             88.46ms │     no change │
   > │ QQuery 12    │                          299.50ms │                            282.05ms │ +1.06x faster │
   > │ QQuery 13    │                          421.75ms │                            421.93ms │     no change │
   > │ QQuery 14    │                          231.66ms │                            231.48ms │     no change │
   > │ QQuery 15    │                          384.69ms │                            412.02ms │  1.07x slower │
   > │ QQuery 16    │                           96.85ms │                             96.86ms │     no change │
   > │ QQuery 17    │                         1088.84ms │                           1088.24ms │     no change │
   > │ QQuery 18    │                         1874.24ms │                           1587.75ms │ +1.18x faster │
   > │ QQuery 19    │                          462.80ms │                            395.33ms │ +1.17x faster │
   > │ QQuery 20    │                          429.06ms │                            378.36ms │ +1.13x faster │
   > │ QQuery 21    │                         1564.34ms │                           1344.72ms │ +1.16x faster │
   > │ QQuery 22    │                          144.52ms │                            130.80ms │ +1.10x faster │
   > └──────────────┴───────────────────────────────────┴─────────────────────────────────────┴───────────────┘
   > ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
   > ┃ Benchmark Summary                                  ┃            ┃
   > ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
   > │ Total Time (on-demand-repartition-with-config)     │ 12497.84ms │
   > │ Total Time (on-demand-not-always-add-roundrobin)   │ 12369.07ms │
   > │ Average Time (on-demand-repartition-with-config)   │   568.08ms │
   > │ Average Time (on-demand-not-always-add-roundrobin) │   562.23ms │
   > │ Queries Faster                                     │          6 │
   > │ Queries Slower                                     │         10 │
   > │ Queries with No Change                             │          6 │
   > └────────────────────────────────────────────────────┴────────────┘
   > ```
   
   OK, let's revert it then, and we'll see later how it could be improved in other ways.
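
For readers following the thread, the bounded-channel idea raised in the quoted question can be illustrated with a minimal sketch. This is not DataFusion's repartition code and not what the PR does: it uses only the Rust standard library (`std::sync::mpsc::sync_channel`), and the function name `run_bounded` and the use of `usize` values as stand-ins for record batches are assumptions made for illustration. The point it shows is that a bounded channel blocks the producer once `capacity` items are queued, which caps how much data sits in memory between operators.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical helper: pushes `n` "batches" (plain integers here) through a
/// channel that buffers at most `capacity` items, and returns what the
/// consumer received, in order.
fn run_bounded(n: usize, capacity: usize) -> Vec<usize> {
    // `sync_channel` is bounded: `send` blocks while `capacity` items are
    // already queued, so the producer cannot run arbitrarily far ahead.
    let (tx, rx) = sync_channel::<usize>(capacity);

    let producer = thread::spawn(move || {
        for batch in 0..n {
            tx.send(batch).expect("receiver hung up");
        }
        // `tx` is dropped here, which closes the channel.
    });

    // `iter` yields items until the channel is closed and drained.
    let received: Vec<usize> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

fn main() {
    let out = run_bounded(8, 2);
    println!("received {} batches", out.len());
}
```

The trade-off is the one the benchmarks above hint at: a smaller buffer lowers peak memory but makes the producer wait more often, which can cost throughput.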


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

