goldmedal commented on issue #15383: URL: https://github.com/apache/datafusion/issues/15383#issuecomment-2766551662
Based on https://github.com/goldmedal/datafusion/pull/3, I ran some benchmarks (`clickbench_1`, `h2o_medium`). `feat_zero-copy-hash-agg-false` is the branch with the configuration disabled, and `feat_zero-copy-hash-agg` is the branch with it enabled. In conclusion, HashAggregate is slower in selection vector mode: total time is about 2.7% higher on `clickbench_1` and about 10% higher on `h2o`.

```
Comparing feat_zero-copy-hash-agg-false and feat_zero-copy-hash-agg
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ feat_zero-copy-hash-agg-false ┃ feat_zero-copy-hash-agg ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │                        0.24ms │                  0.32ms │  1.33x slower │
│ QQuery 1     │                       26.98ms │                 24.56ms │ +1.10x faster │
│ QQuery 2     │                       55.89ms │                 52.55ms │ +1.06x faster │
│ QQuery 3     │                       48.20ms │                 45.62ms │ +1.06x faster │
│ QQuery 4     │                      313.79ms │                347.08ms │  1.11x slower │
│ QQuery 5     │                      490.80ms │                471.41ms │     no change │
│ QQuery 6     │                       25.06ms │                 25.46ms │     no change │
│ QQuery 7     │                       28.17ms │                 27.29ms │     no change │
│ QQuery 8     │                      353.53ms │                406.58ms │  1.15x slower │
│ QQuery 9     │                      514.71ms │                478.99ms │ +1.07x faster │
│ QQuery 10    │                      132.73ms │                130.81ms │     no change │
│ QQuery 11    │                      142.59ms │                143.29ms │     no change │
│ QQuery 12    │                      475.75ms │                493.83ms │     no change │
│ QQuery 13    │                      569.90ms │                630.60ms │  1.11x slower │
│ QQuery 14    │                      435.30ms │                444.02ms │     no change │
│ QQuery 15    │                      361.60ms │                406.62ms │  1.12x slower │
│ QQuery 16    │                      825.41ms │                856.13ms │     no change │
│ QQuery 17    │                      752.13ms │                766.95ms │     no change │
│ QQuery 18    │                     1813.04ms │               1934.07ms │  1.07x slower │
│ QQuery 19    │                       40.67ms │                 41.49ms │     no change │
│ QQuery 20    │                      621.14ms │                625.89ms │     no change │
│ QQuery 21    │                      769.98ms │                749.81ms │     no change │
│ QQuery 22    │                     1544.70ms │               1560.61ms │     no change │
│ QQuery 23    │                     4471.51ms │               4356.12ms │     no change │
│ QQuery 24    │                      257.77ms │                265.81ms │     no change │
│ QQuery 25    │                      268.53ms │                273.24ms │     no change │
│ QQuery 26    │                      294.19ms │                307.36ms │     no change │
│ QQuery 27    │                      983.41ms │                987.90ms │     no change │
│ QQuery 28    │                     7514.46ms │               7533.94ms │     no change │
│ QQuery 29    │                      346.70ms │                344.54ms │     no change │
│ QQuery 30    │                      387.65ms │                405.92ms │     no change │
│ QQuery 31    │                      390.81ms │                427.40ms │  1.09x slower │
│ QQuery 32    │                     1597.45ms │               1987.50ms │  1.24x slower │
│ QQuery 33    │                     1753.56ms │               1863.63ms │  1.06x slower │
│ QQuery 34    │                     1950.84ms │               1945.21ms │     no change │
│ QQuery 35    │                      510.78ms │                560.47ms │  1.10x slower │
│ QQuery 36    │                      105.22ms │                110.02ms │     no change │
│ QQuery 37    │                       56.69ms │                 53.63ms │ +1.06x faster │
│ QQuery 38    │                       74.69ms │                 77.84ms │     no change │
│ QQuery 39    │                      189.59ms │                193.83ms │     no change │
│ QQuery 40    │                       24.37ms │                 24.50ms │     no change │
│ QQuery 41    │                       23.01ms │                 23.45ms │     no change │
│ QQuery 42    │                       27.48ms │                 27.82ms │     no change │
└──────────────┴───────────────────────────────┴─────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (feat_zero-copy-hash-agg-false)   │ 31570.98ms │
│ Total Time (feat_zero-copy-hash-agg)         │ 32434.14ms │
│ Average Time (feat_zero-copy-hash-agg-false) │   734.21ms │
│ Average Time (feat_zero-copy-hash-agg)       │   754.28ms │
│ Queries Faster                               │          5 │
│ Queries Slower                               │         10 │
│ Queries with No Change                       │         28 │
└──────────────────────────────────────────────┴────────────┘
--------------------
Benchmark h2o.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃ feat_zero-copy-hash-agg-false ┃ feat_zero-copy-hash-agg ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │                     1053.42ms │               1044.36ms │    no change │
│ QQuery 2     │                     2155.01ms │               2317.51ms │ 1.08x slower │
│ QQuery 3     │                     2275.93ms │               2611.06ms │ 1.15x slower │
│ QQuery 4     │                     1236.64ms │               1256.03ms │    no change │
│ QQuery 5     │                     1608.77ms │               1892.13ms │ 1.18x slower │
│ QQuery 6     │                     1369.68ms │               1382.62ms │    no change │
│ QQuery 7     │                     2258.63ms │               2548.21ms │ 1.13x slower │
│ QQuery 8     │                     3876.11ms │               3985.41ms │    no change │
│ QQuery 9     │                     5989.38ms │               6721.88ms │ 1.12x slower │
│ QQuery 10    │                     3064.89ms │               3677.05ms │ 1.20x slower │
└──────────────┴───────────────────────────────┴─────────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (feat_zero-copy-hash-agg-false)   │ 24888.45ms │
│ Total Time (feat_zero-copy-hash-agg)         │ 27436.27ms │
│ Average Time (feat_zero-copy-hash-agg-false) │  2488.85ms │
│ Average Time (feat_zero-copy-hash-agg)       │  2743.63ms │
│ Queries Faster                               │          0 │
│ Queries Slower                               │          6 │
│ Queries with No Change                       │          4 │
└──────────────────────────────────────────────┴────────────┘
```

I tried to profile Clickbench QQuery 4.

With the selection vector enabled: [profile image not preserved]

With the selection vector disabled: [profile image not preserved]

In the current implementation, the CPU time of `filter_record_batch` is significantly greater than that of `take_arrays`. Does arrow have a more efficient way to filter a record batch by a boolean array?
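For readers unfamiliar with the two kernels named above, here is a minimal sketch of the two access patterns, assuming plain `Vec<i64>` columns as stand-ins for Arrow arrays. The functions `filter_by_mask`, `mask_to_indices`, and `take_by_indices` are hypothetical names for illustration only; they are not the arrow-rs or DataFusion implementations and say nothing about where the measured time actually goes.

```rust
/// Boolean-mask filtering: copy the rows whose mask slot is `true`.
/// Every slot of `mask` is inspected, whether or not the row is kept.
fn filter_by_mask(values: &[i64], mask: &[bool]) -> Vec<i64> {
    assert_eq!(values.len(), mask.len(), "mask must cover every row");
    values
        .iter()
        .zip(mask)
        .filter_map(|(v, keep)| if *keep { Some(*v) } else { None })
        .collect()
}

/// Convert a boolean mask into a selection vector of row indices.
fn mask_to_indices(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter_map(|(i, keep)| if *keep { Some(i) } else { None })
        .collect()
}

/// Index-based gather ("take" style): only the selected rows are read.
/// Panics if an index is out of bounds.
fn take_by_indices(values: &[i64], indices: &[usize]) -> Vec<i64> {
    indices.iter().map(|&i| values[i]).collect()
}
```

Both paths yield the same rows for the same selection; they differ in whether the selection is carried as one boolean per input row or as a list of selected row indices, which is the distinction the profile comparison above is probing.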