HuaHuaY commented on PR #50281: URL: https://github.com/apache/arrow/pull/50281#issuecomment-4849908388
With the help of AI, I investigated why `VisitBitRuns` is slower than `VisitSetBitRuns` by modifying the benchmarks and examining the generated assembly. My conclusion is that the main cause is that `BitRunReader::NextRun` is not marked `NOINLINE`: it gets inlined into `VisitBitRuns`, which makes `VisitBitRuns` larger than `VisitSetBitRuns` and therefore less likely to be inlined itself. However, simply adding `NOINLINE` to `BitRunReader::NextRun` degrades performance further, because every run then costs a non-inlined call to `NextRun`, whether the bitmap value is true or false. Separately, `BitRunReader` contains some minor code duplication that can be cleaned up. I will upload the changes shortly.
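For readers who want to see the trade-off concretely, here is a minimal sketch. It is not Arrow's code: `ToyBitRunReader`, `ToyVisitBitRuns`, `ToyCountSetBits`, and the `TOY_NOINLINE` macro are hypothetical names made up for illustration, and the bit-by-bit scan is deliberately naive. The idea is only that the visitor calls `NextRun` once per run, for set and unset runs alike, so forcing `NextRun` out of line adds a call to every iteration, while letting it inline grows the body of the visitor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a NOINLINE macro; define it as empty to let the
// compiler decide whether to inline NextRun.
#if defined(__GNUC__) || defined(__clang__)
#define TOY_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define TOY_NOINLINE __declspec(noinline)
#else
#define TOY_NOINLINE
#endif

struct ToyRun {
  int64_t length;  // 0 signals the end of the bitmap
  bool set;
};

// Toy reader (assumption: LSB-first bit order, naive bit-by-bit scan).
class ToyBitRunReader {
 public:
  ToyBitRunReader(const uint8_t* bits, int64_t length)
      : bits_(bits), length_(length) {
    assert(length >= 0);
  }

  // With TOY_NOINLINE this is a real call on every run, set or unset.
  // Without it, the body may be inlined into the visitor and enlarge it.
  TOY_NOINLINE ToyRun NextRun() {
    if (pos_ >= length_) return {0, false};
    const bool value = GetBit(pos_);
    const int64_t start = pos_;
    while (pos_ < length_ && GetBit(pos_) == value) ++pos_;
    return {pos_ - start, value};
  }

 private:
  bool GetBit(int64_t i) const { return (bits_[i >> 3] >> (i & 7)) & 1; }

  const uint8_t* bits_;
  int64_t length_;
  int64_t pos_ = 0;
};

// Visits every run; each callback receives (position, run_length).
template <typename OnSet, typename OnUnset>
void ToyVisitBitRuns(const uint8_t* bits, int64_t length, OnSet&& on_set,
                     OnUnset&& on_unset) {
  ToyBitRunReader reader(bits, length);
  int64_t position = 0;
  for (;;) {
    const ToyRun run = reader.NextRun();  // one call per run in either case
    if (run.length == 0) break;
    if (run.set) {
      on_set(position, run.length);
    } else {
      on_unset(position, run.length);
    }
    position += run.length;
  }
}

// Small helper so the visitor can be exercised without writing callbacks.
inline int64_t ToyCountSetBits(const uint8_t* bits, int64_t length) {
  int64_t total = 0;
  ToyVisitBitRuns(
      bits, length,
      [&](int64_t, int64_t run_length) { total += run_length; },
      [](int64_t, int64_t) {});
  return total;
}
```

Compiling this with and without the attribute (for example by defining `TOY_NOINLINE` as empty) and comparing the assembly of a caller of `ToyCountSetBits` is one way to observe both effects on a small scale. Real measurements should of course come from the Arrow benchmarks.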
