XiangpengHao commented on PR #6921: URL: https://github.com/apache/arrow-rs/pull/6921#issuecomment-2566839355
Implemented some more optimizations and tuning; here are the ClickBench numbers on my machine. TL;DR: about 15% total time reduction.

We first compare no-pushdown vs our new pushdown implementation. Only Q27 has a meaningful slowdown; the other queries are either similar or much faster. The fix for Q27 requires us to actually switch to a boolean mask-based selector implementation, like the one in https://github.com/apache/arrow-rs/pull/6624

```
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ no-pushdown ┃ new-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │      0.47ms │       0.43ms │ +1.10x faster │
│ QQuery 1     │     51.10ms │      50.10ms │     no change │
│ QQuery 2     │     68.23ms │      64.49ms │ +1.06x faster │
│ QQuery 3     │     90.68ms │      86.73ms │     no change │
│ QQuery 4     │    458.93ms │     458.59ms │     no change │
│ QQuery 5     │    522.06ms │     478.50ms │ +1.09x faster │
│ QQuery 6     │     49.84ms │      49.94ms │     no change │
│ QQuery 7     │     55.09ms │      55.77ms │     no change │
│ QQuery 8     │    565.26ms │     556.95ms │     no change │
│ QQuery 9     │    575.83ms │     575.05ms │     no change │
│ QQuery 10    │    164.56ms │     178.23ms │  1.08x slower │
│ QQuery 11    │    177.20ms │     191.32ms │  1.08x slower │
│ QQuery 12    │    591.05ms │     569.92ms │     no change │
│ QQuery 13    │    861.06ms │     848.59ms │     no change │
│ QQuery 14    │    596.20ms │     580.73ms │     no change │
│ QQuery 15    │    554.96ms │     548.77ms │     no change │
│ QQuery 16    │   1175.08ms │    1146.07ms │     no change │
│ QQuery 17    │   1150.45ms │    1121.49ms │     no change │
│ QQuery 18    │   2634.75ms │    2494.07ms │ +1.06x faster │
│ QQuery 19    │     90.15ms │      89.24ms │     no change │
│ QQuery 20    │    620.15ms │     591.67ms │     no change │
│ QQuery 21    │    782.38ms │     703.15ms │ +1.11x faster │
│ QQuery 22    │   1927.94ms │    1404.35ms │ +1.37x faster │
│ QQuery 23    │   8104.11ms │    3610.76ms │ +2.24x faster │
│ QQuery 24    │    360.79ms │     330.55ms │ +1.09x faster │
│ QQuery 25    │    290.61ms │     252.54ms │ +1.15x faster │
│ QQuery 26    │    395.18ms │     362.72ms │ +1.09x faster │
│ QQuery 27    │    891.76ms │     959.39ms │  1.08x slower │
│ QQuery 28    │   4059.54ms │    4137.37ms │     no change │
│ QQuery 29    │    235.88ms │     228.99ms │     no change │
│ QQuery 30    │    564.22ms │     584.65ms │     no change │
│ QQuery 31    │    741.20ms │     757.87ms │     no change │
│ QQuery 32    │   2652.48ms │    2574.19ms │     no change │
│ QQuery 33    │   2373.71ms │    2327.10ms │     no change │
│ QQuery 34    │   2391.00ms │    2342.15ms │     no change │
│ QQuery 35    │    700.79ms │     694.51ms │     no change │
│ QQuery 36    │    151.51ms │     152.93ms │     no change │
│ QQuery 37    │    108.18ms │      86.03ms │ +1.26x faster │
│ QQuery 38    │    114.64ms │     106.22ms │ +1.08x faster │
│ QQuery 39    │    260.80ms │     239.13ms │ +1.09x faster │
│ QQuery 40    │     60.74ms │      73.29ms │  1.21x slower │
│ QQuery 41    │     58.75ms │      67.85ms │  1.15x slower │
│ QQuery 42    │     65.49ms │      68.11ms │     no change │
└──────────────┴─────────────┴──────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary           ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (no-pushdown)    │ 38344.79ms │
│ Total Time (new-pushdown)   │ 32800.50ms │
│ Average Time (no-pushdown)  │   891.74ms │
│ Average Time (new-pushdown) │   762.80ms │
│ Queries Faster              │         13 │
│ Queries Slower              │          5 │
│ Queries with No Change      │         25 │
└─────────────────────────────┴────────────┘
```

Now we compare our new implementation with the old pushdown implementation. Only Q23 is a bit slower; the others are either faster or similar. We do need some extra work to get optimal performance on Q23. Nonetheless, we are faster than no-pushdown. I believe a fix for Q23 does not require fundamental changes to the existing decoding pipeline.
```
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  pushdown ┃ new-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    0.48ms │       0.43ms │ +1.12x faster │
│ QQuery 1     │   51.49ms │      50.10ms │     no change │
│ QQuery 2     │   67.83ms │      64.49ms │     no change │
│ QQuery 3     │   89.68ms │      86.73ms │     no change │
│ QQuery 4     │  469.88ms │     458.59ms │     no change │
│ QQuery 5     │  523.97ms │     478.50ms │ +1.10x faster │
│ QQuery 6     │   50.37ms │      49.94ms │     no change │
│ QQuery 7     │   56.89ms │      55.77ms │     no change │
│ QQuery 8     │  560.69ms │     556.95ms │     no change │
│ QQuery 9     │  583.14ms │     575.05ms │     no change │
│ QQuery 10    │  155.75ms │     178.23ms │  1.14x slower │
│ QQuery 11    │  170.31ms │     191.32ms │  1.12x slower │
│ QQuery 12    │  723.13ms │     569.92ms │ +1.27x faster │
│ QQuery 13    │ 1181.34ms │     848.59ms │ +1.39x faster │
│ QQuery 14    │  736.95ms │     580.73ms │ +1.27x faster │
│ QQuery 15    │  551.74ms │     548.77ms │     no change │
│ QQuery 16    │ 1171.99ms │    1146.07ms │     no change │
│ QQuery 17    │ 1152.34ms │    1121.49ms │     no change │
│ QQuery 18    │ 2555.82ms │    2494.07ms │     no change │
│ QQuery 19    │   84.20ms │      89.24ms │  1.06x slower │
│ QQuery 20    │  606.77ms │     591.67ms │     no change │
│ QQuery 21    │  704.86ms │     703.15ms │     no change │
│ QQuery 22    │ 1633.53ms │    1404.35ms │ +1.16x faster │
│ QQuery 23    │ 2691.84ms │    3610.76ms │  1.34x slower │
│ QQuery 24    │  528.09ms │     330.55ms │ +1.60x faster │
│ QQuery 25    │  465.38ms │     252.54ms │ +1.84x faster │
│ QQuery 26    │  562.40ms │     362.72ms │ +1.55x faster │
│ QQuery 27    │ 1121.76ms │     959.39ms │ +1.17x faster │
│ QQuery 28    │ 4455.16ms │    4137.37ms │ +1.08x faster │
│ QQuery 29    │  234.18ms │     228.99ms │     no change │
│ QQuery 30    │  596.22ms │     584.65ms │     no change │
│ QQuery 31    │  754.21ms │     757.87ms │     no change │
│ QQuery 32    │ 2570.52ms │    2574.19ms │     no change │
│ QQuery 33    │ 2357.37ms │    2327.10ms │     no change │
│ QQuery 34    │ 2377.89ms │    2342.15ms │     no change │
│ QQuery 35    │  703.78ms │     694.51ms │     no change │
│ QQuery 36    │  162.29ms │     152.93ms │ +1.06x faster │
│ QQuery 37    │  129.96ms │      86.03ms │ +1.51x faster │
│ QQuery 38    │   90.79ms │     106.22ms │  1.17x slower │
│ QQuery 39    │  220.71ms │     239.13ms │  1.08x slower │
│ QQuery 40    │   72.87ms │      73.29ms │     no change │
│ QQuery 41    │   70.04ms │      67.85ms │     no change │
│ QQuery 42    │   68.17ms │      68.11ms │     no change │
└──────────────┴───────────┴──────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary           ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (pushdown)       │ 34116.80ms │
│ Total Time (new-pushdown)   │ 32800.50ms │
│ Average Time (pushdown)     │   793.41ms │
│ Average Time (new-pushdown) │   762.80ms │
│ Queries Faster              │         13 │
│ Queries Slower              │          6 │
│ Queries with No Change      │         24 │
└─────────────────────────────┴────────────┘
```
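
For readers unfamiliar with the two selector styles mentioned for Q27, here is a minimal sketch contrasting a run-length selector with a boolean mask. It is only an illustration of the two representations, not the code in this PR or in #6624. `RunSelector`, `to_mask`, and `to_runs` are hypothetical names, loosely modeled on the skip/select runs of the parquet crate's `RowSelector`.

```rust
/// Hypothetical run-length selector: a run of `row_count` consecutive rows
/// that are either all skipped (`skip == true`) or all selected.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RunSelector {
    row_count: usize,
    skip: bool,
}

/// Expand run-length selectors into one boolean per row (`true` = keep).
fn to_mask(runs: &[RunSelector]) -> Vec<bool> {
    let total: usize = runs.iter().map(|r| r.row_count).sum();
    let mut mask = Vec::with_capacity(total);
    for r in runs {
        mask.extend(std::iter::repeat(!r.skip).take(r.row_count));
    }
    mask
}

/// Collapse a boolean mask back into run-length selectors by merging
/// adjacent rows that share the same keep/skip decision.
fn to_runs(mask: &[bool]) -> Vec<RunSelector> {
    let mut runs: Vec<RunSelector> = Vec::new();
    for &keep in mask {
        match runs.last_mut() {
            // Same decision as the previous row: extend the current run.
            Some(last) if last.skip == !keep => last.row_count += 1,
            // Decision flipped (or first row): start a new run.
            _ => runs.push(RunSelector { row_count: 1, skip: !keep }),
        }
    }
    runs
}
```

With long contiguous runs the run-length form stays compact, but a highly fragmented selection such as keep, skip, keep, skip degenerates to one run per row, whereas a mask always costs one flag per row regardless of the pattern. Whether that is the dominant effect behind the Q27 numbers is an assumption on the sketch's part, not something measured here.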
