mapleFU commented on issue #40845: URL: https://github.com/apache/arrow/issues/40845#issuecomment-2037851101
For SIMD decoding, I've read some materials:

1. Some existing implementations (including Arrow's `unpack32`) use methods similar to those in the paper "SIMD-Scan: Ultra Fast in-Memory Table Scan using on Chip Vector Processing Units". I might investigate this approach for int8 and int16.
2. https://github.com/lemire/LittleIntPacker/blob/8777f574a5ab3c653881371819383c986292843c/src/bmipacking32.c#L2169 LittleIntPacker says that for small integer widths, BMI2 plus a SIMD convert might be a good approach. Velox also uses this method, and I think some of our decoding methods would also be better using it.
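To make the BMI2 idea above concrete, here is a minimal sketch of how it could work for bit widths of 1 to 8. It is not code from Arrow, LittleIntPacker, or Velox; the names `Pdep64` and `UnpackSmallWidth` are hypothetical. It assumes a little-endian host, LSB-first bit packing, and a value count that is a multiple of 8. The final scalar widening loop stands in for the SIMD convert step (for example `_mm256_cvtepu8_epi32` on AVX2).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>
#endif

// Deposit the low bits of `src` into the positions of the set bits of `mask`,
// lowest first. Uses the PDEP instruction when the build enables BMI2,
// otherwise a portable bit-by-bit fallback (hypothetical helper).
inline uint64_t Pdep64(uint64_t src, uint64_t mask) {
#if defined(__BMI2__) && defined(__x86_64__)
  return _pdep_u64(src, mask);
#else
  uint64_t out = 0;
  for (uint64_t bit = 1; mask != 0; bit <<= 1) {
    const uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
    if (src & bit) out |= lowest;
    mask &= mask - 1;  // clear that bit
  }
  return out;
#endif
}

// Unpack `num_values` values of `bit_width` bits each into 32-bit outputs.
// Assumptions for this sketch: 1 <= bit_width <= 8, num_values is a multiple
// of 8, the host is little-endian, and `in` holds bit_width bytes per group
// of 8 values.
inline void UnpackSmallWidth(const uint8_t* in, int bit_width,
                             size_t num_values, uint32_t* out) {
  assert(bit_width >= 1 && bit_width <= 8);
  assert(num_values % 8 == 0);
  // Low `bit_width` bits set in every byte, e.g. 0x0707...07 for width 3.
  const uint64_t mask =
      0x0101010101010101ULL * ((uint64_t{1} << bit_width) - 1);
  for (size_t i = 0; i < num_values; i += 8) {
    uint64_t packed = 0;
    // 8 values * bit_width bits occupy exactly bit_width bytes.
    std::memcpy(&packed, in, static_cast<size_t>(bit_width));
    in += bit_width;
    // One PDEP spreads the 8 packed values into 8 separate bytes.
    const uint64_t bytes = Pdep64(packed, mask);
    // Widen bytes to 32 bits; a SIMD convert would replace this loop.
    for (int j = 0; j < 8; ++j) {
      out[i + j] = static_cast<uint32_t>((bytes >> (8 * j)) & 0xFF);
    }
  }
}
```

With this layout, the three bytes `0x88 0xC6 0xFA` at bit width 3 decode to the values 0 through 7. A single `PDEP` handles 8 values at once, which is why this approach is attractive for small widths.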
