mapleFU commented on PR #8258: URL: https://github.com/apache/arrow-rs/pull/8258#issuecomment-3399662768
DELTA_BYTE_ARRAY has two problems:

1. In most cases, the delta (prefix) encoding does not compress better. It helps when values share prefixes (like URLs). For mostly equal strings / binaries, zstd already works well.
2. DELTA_BYTE_ARRAY also consumes more memory in the decoder. This is easy to understand: for `DELTA_LENGTH_BYTE_ARRAY`, no extra buffer copies are required, and zero-copying the data is enough. DELTA_BYTE_ARRAY, however, may need to construct a new buffer (and is hard to vectorize).

So it only works well in a few cases, such as sorted URLs or other data with shared prefixes.
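To illustrate the decoder-side difference, here is a minimal sketch in plain Rust. It is a simplified model, not the arrow-rs or parquet implementation: the function names (`prefix_encode`, `prefix_decode`, `length_decode`) are hypothetical, and the real encodings additionally delta-bit-pack the prefix lengths and suffix lengths, which is omitted here.

```rust
/// Simplified model of DELTA_BYTE_ARRAY (assumption: prefix lengths are kept
/// as plain integers instead of being delta-bit-packed). Each value becomes
/// (bytes shared with the previous value, remaining suffix).
fn prefix_encode(values: &[&[u8]]) -> Vec<(usize, Vec<u8>)> {
    let mut out = Vec::with_capacity(values.len());
    let mut prev: &[u8] = &[];
    for &v in values {
        // Length of the common prefix with the previous value.
        let shared = prev.iter().zip(v.iter()).take_while(|(a, b)| a == b).count();
        out.push((shared, v[shared..].to_vec()));
        prev = v;
    }
    out
}

/// Decoding has to rebuild every value: copy the prefix out of the previous
/// value, then append the suffix. Each value depends on the one before it,
/// so a fresh buffer is constructed and the loop is inherently sequential.
fn prefix_decode(encoded: &[(usize, Vec<u8>)]) -> Vec<Vec<u8>> {
    let mut out: Vec<Vec<u8>> = Vec::with_capacity(encoded.len());
    for (shared, suffix) in encoded {
        let mut value = Vec::with_capacity(shared + suffix.len());
        if let Some(prev) = out.last() {
            value.extend_from_slice(&prev[..*shared]);
        }
        value.extend_from_slice(suffix);
        out.push(value);
    }
    out
}

/// Simplified model of DELTA_LENGTH_BYTE_ARRAY: a list of lengths plus one
/// concatenated data buffer. Decoding only computes offsets; the returned
/// slices borrow from `data`, so no bytes are copied.
fn length_decode<'a>(lengths: &[usize], data: &'a [u8]) -> Vec<&'a [u8]> {
    let mut out = Vec::with_capacity(lengths.len());
    let mut offset = 0;
    for &len in lengths {
        out.push(&data[offset..offset + len]);
        offset += len;
    }
    out
}
```

With sorted URLs such as `https://example.com/a`, `https://example.com/ab`, `https://example.com/b`, the prefix model stores only a one-byte suffix for the second and third values, which is the case where the encoding pays off. For values without shared prefixes, the suffixes are the full values and the decoder still pays for the copies.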
