bkietz commented on issue #38479: URL: https://github.com/apache/arrow/issues/38479#issuecomment-1804782704
It looks like an empty buffer is being read for decompression, so a null pointer is passed because there is no data to read. Even though the buffer's length is zero, this is still flagged as undefined behavior. That's odd to me, since null pointer arithmetic with a zero offset is [not UB in C++17](https://timsong-cpp.github.io/cppwp/n4659/expr.add#7) and the lz4 external project [should inherit that](https://github.com/bkietz/arrow/blob/0b7384272b82088b110edb9c8e95adf9372af997/cpp/cmake_modules/ThirdpartyToolchain.cmake#L961). I'm not familiar enough with C to know whether it differs there.

In any case, the fix seems to be to check explicitly for null in `Lz4Decompressor` and skip passing it through to `LZ4F_decompress`. We should possibly also find the `Readable` implementation that produces a null/empty buffer in the first place; patching it to return a pointer to `kZeroSizeArea` instead would prevent this kind of error in the future.

Actually, @pitrou, what would you think of changing `Buffer`'s constructor so that `data_` is never `nullptr`?
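
To make the two ideas in this comment concrete, here is a minimal sketch of a null/empty-input guard and of substituting a non-null placeholder for a null pointer. Everything named here (`DecompressGuarded`, `NonNullData`, `kZeroSizeAreaSketch`, `DecompressResult`) is hypothetical and is not Arrow's or LZ4's actual API; the `codec` callback only stands in for the underlying call such as `LZ4F_decompress`. Whether skipping the call is acceptable for a streaming decoder that may hold buffered output is not settled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a "zero size area": a valid, non-null address
// that is never dereferenced because the accompanying length is zero.
alignas(64) static const std::uint8_t kZeroSizeAreaSketch[1] = {0};

// Hypothetical helper: replace a null pointer (only legal with len == 0)
// with a non-null placeholder so callees never see nullptr.
inline const std::uint8_t* NonNullData(const std::uint8_t* data, std::size_t len) {
  assert(data != nullptr || len == 0);  // null with a nonzero length is a caller bug
  return data != nullptr ? data : kZeroSizeAreaSketch;
}

// Hypothetical result type: how much input was consumed and output produced.
struct DecompressResult {
  std::size_t bytes_read;
  std::size_t bytes_written;
};

// Hypothetical guard around the real decompression call. `codec` is any
// callable taking (input, input_len, output, output_len) and returning a
// DecompressResult; it is invoked only when there is input to consume.
template <typename Codec>
DecompressResult DecompressGuarded(Codec&& codec, const std::uint8_t* input,
                                   std::size_t input_len, std::uint8_t* output,
                                   std::size_t output_len) {
  if (input == nullptr || input_len == 0) {
    // Nothing to read: report no progress instead of forwarding nullptr.
    return {0, 0};
  }
  return codec(input, input_len, output, output_len);
}
```

The guard fixes the one call site, while `NonNullData` illustrates the broader alternative of never handing out a null data pointer, so that every consumer is protected rather than just the decompressor.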
