emkornfield commented on pull request #8320: URL: https://github.com/apache/arrow/pull/8320#issuecomment-702615585
> I also notice that we call internal::GreaterThanBitmap for each 64 levels, which always goes through the dynamic dispatch indirection (meaning two function calls, I think). We could call GreaterThanBitmapImpl but that requires compiling a specialized version of level_conversion_inc.h for AVX2, otherwise we lose performance. yeah it isn't ideal, it is possible there is a better factoring in there but it seemed hard to do and isolate BMI2 special instructions, I guess if this isn't too much slower then BMI2 on intel we could potentially collapse everything, but I would not expect that to be the case. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
