Dandandan opened a new pull request #8822: URL: https://github.com/apache/arrow/pull/8822
This PR fixes the specialization around data types. While profiling, I found that the compiler does not optimize away the `if T::DATA_TYPE == DataType::Boolean` check (and the `PartialEq` implementation behind it), which accounts for around 9% (!) of the instruction fetches. Most of these are in `append`, which makes sense. Using pattern matching instead of `==` seems to fix this issue and brings the query from ~1700 ms down to ~1500 ms.

Benchmark results for this query with this PR:

```
Query 12 iteration 0 took 1500 ms
Query 12 iteration 1 took 1499 ms
Query 12 iteration 2 took 1502 ms
Query 12 iteration 3 took 1506 ms
Query 12 iteration 4 took 1500 ms
Query 12 iteration 5 took 1497 ms
Query 12 iteration 6 took 1501 ms
Query 12 iteration 7 took 1500 ms
Query 12 iteration 8 took 1501 ms
Query 12 iteration 9 took 1498 ms
Query 12 iteration 10 took 1500 ms
Query 12 iteration 11 took 1498 ms
Query 12 iteration 12 took 1502 ms
Query 12 iteration 13 took 1499 ms
Query 12 iteration 14 took 1497 ms
Query 12 iteration 15 took 1497 ms
Query 12 iteration 16 took 1500 ms
Query 12 iteration 17 took 1496 ms
Query 12 iteration 18 took 1499 ms
Query 12 iteration 19 took 1493 ms
```

Master:

```
Query 12 iteration 0 took 1762 ms
Query 12 iteration 1 took 1734 ms
Query 12 iteration 2 took 1734 ms
Query 12 iteration 3 took 1730 ms
Query 12 iteration 4 took 1731 ms
Query 12 iteration 5 took 1758 ms
Query 12 iteration 6 took 1727 ms
Query 12 iteration 7 took 1727 ms
Query 12 iteration 8 took 1727 ms
Query 12 iteration 9 took 1730 ms
Query 12 iteration 10 took 1719 ms
Query 12 iteration 11 took 1731 ms
Query 12 iteration 12 took 1735 ms
Query 12 iteration 13 took 1724 ms
Query 12 iteration 14 took 1713 ms
Query 12 iteration 15 took 1712 ms
Query 12 iteration 16 took 1729 ms
Query 12 iteration 17 took 1721 ms
Query 12 iteration 18 took 1713 ms
Query 12 iteration 19 took 1710 ms
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
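
For readers unfamiliar with the change described in the PR, here is a minimal sketch of the two styles of check it contrasts. The `DataType` enum, the `PrimitiveType` trait, and the `BooleanType`/`Int32Type` markers are simplified, hypothetical stand-ins and not Arrow's actual definitions. The sketch only illustrates the shape of the code; it does not reproduce the profiling result.

```rust
// Hypothetical, simplified stand-in for a data type enum.
// The nested `List` variant makes the derived `PartialEq` recursive,
// so `==` is a real function call rather than a plain tag comparison.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Boolean,
    Int32,
    List(Box<DataType>),
}

// Hypothetical trait that ties a marker type to its data type
// through an associated constant.
trait PrimitiveType {
    const DATA_TYPE: DataType;
}

struct BooleanType;
struct Int32Type;

impl PrimitiveType for BooleanType {
    const DATA_TYPE: DataType = DataType::Boolean;
}

impl PrimitiveType for Int32Type {
    const DATA_TYPE: DataType = DataType::Int32;
}

// Style 1: comparison through `PartialEq::eq`.
#[allow(dead_code)]
fn is_boolean_eq<T: PrimitiveType>() -> bool {
    T::DATA_TYPE == DataType::Boolean
}

// Style 2: pattern matching, which only inspects the enum variant
// and never calls the `PartialEq` implementation.
#[allow(dead_code)]
fn is_boolean_match<T: PrimitiveType>() -> bool {
    matches!(T::DATA_TYPE, DataType::Boolean)
}
```

Both functions return the same result for every `T`; for example, `is_boolean_match::<BooleanType>()` is `true` and `is_boolean_match::<Int32Type>()` is `false`. Whether the `==` form is folded away at compile time depends on the compiler and on the shape of the enum, so the pattern-matching form is the more predictable choice in a hot path.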
