HuaHuaY commented on PR #50281: URL: https://github.com/apache/arrow/pull/50281#issuecomment-4852100741
Yeah, my goal was to make `BitRunReader`'s performance match `SetBitRunReader`, and removing the `NOINLINE` from `SetBitRunReader::NextRun` wasn't what I needed. I submitted a new commit that made some optimizations to `BitRunReader`: 1. The assignment `word_ = 0` only needs to occur when reading a portion of the word. 2. Removed redundant calculations of `bit_util::IsMultipleOf64`. 3. Prioritized checking `position_ + 128 <= length_` to avoid performing two boolean checks in the `AdvanceUntilChange` and `LoadWord` functions when null values are rare. > what matters is actual users of this function (for example in the compute library), not the bitmap traversal micro-benchmarks :-) As you said, user performance is more important than benchmark performance. These changes not only improved my local benchmark but also, I believe, semantically optimized the execution process. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
