yordan-pavlov commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-845464200
UPDATE: I haven't yet been able to figure out why the current implementation in `PrimitiveArrayReader` is still faster in the "read Int32Array, dictionary encoded, mandatory, no NULLs" benchmark. But I have made the `VariableLenDictionaryDecoder` even faster: up to 4.8 times faster for reading string arrays compared to the current implementation in `ComplexObjectArrayReader`.

Here are the relevant benchmark results:

read StringArray, dictionary encoded, mandatory, no NULLs - old: time: [1.3798 ms 1.3884 ms 1.3987 ms]
read StringArray, dictionary encoded, mandatory, no NULLs - new: time: [280.68 us 288.89 us 298.13 us]

read StringArray, dictionary encoded, optional, no NULLs - old: time: [1.5283 ms 1.5432 ms 1.5601 ms]
read StringArray, dictionary encoded, optional, no NULLs - new: time: [334.56 us 346.34 us 362.41 us]

read StringArray, dictionary encoded, optional, half NULLs - old: time: [1.3208 ms 1.3432 ms 1.3676 ms]
read StringArray, dictionary encoded, optional, half NULLs - new: time: [516.13 us 523.28 us 531.78 us]

And here are the latest changes: https://github.com/yordan-pavlov/arrow/commit/9cac678a47f747882a9bc00a179135cccef6a5f5
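
As a quick check on the "up to 4.8 times faster" figure, the speedup for each case can be worked out from the middle value of each reported time range above. This is only a minimal sketch: the `RESULTS` table and the `speedup` helper are illustrative names, not part of the parquet crate, and the times are copied by hand from the results, with the old values converted from ms to us.

```rust
/// (case, old time in us, new time in us), using the middle value of each
/// reported range. Hand-copied from the benchmark results; illustrative only.
const RESULTS: [(&str, f64, f64); 3] = [
    ("mandatory, no NULLs", 1388.4, 288.89),
    ("optional, no NULLs", 1543.2, 346.34),
    ("optional, half NULLs", 1343.2, 523.28),
];

/// Speedup factor: how many times smaller the new time is than the old one.
fn speedup(old_us: f64, new_us: f64) -> f64 {
    old_us / new_us
}

fn main() {
    for (case, old_us, new_us) in RESULTS.iter() {
        // Prints one line per case, with the factor rounded to one decimal.
        println!("{}: {:.1}x faster", case, speedup(*old_us, *new_us));
    }
}
```

By this arithmetic the factors come out at roughly 4.8x, 4.5x and 2.6x respectively, so the largest gain is in the mandatory case and the smallest in the case where half of the values are NULL.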
