sahvx655-wq commented on PR #64091: URL: https://github.com/apache/doris/pull/64091#issuecomment-4659305565
Filled in the description against the template. To recap how I got here: reading the lz4 block path, I noticed that remaining_output_len is computed once per large block at the top of the outer loop and then passed straight to LZ4_decompress_safe as dstCapacity for every small block, while output_ptr keeps advancing per small block. So from the second small block on, the decompressor is handed the full large-block capacity at an already-advanced pointer. That stale capacity is the root cause.

A crafted lz4block stream (e.g. via a CSV load) can drive the inner loop to write past the line reader's output buffer, a heap out-of-bounds write; left unfixed, it is a remotely triggerable memory corruption on the load path. The one-line fix measures the real remaining space from output_ptr (output_max_len - (output_ptr - output)), so a block that would not fit now returns InvalidArgument, and well-formed streams that already fit are unaffected.
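To make the capacity arithmetic concrete, here is a minimal sketch of the per-small-block recomputation. It is not the Doris code: fake_decompress_safe is a hypothetical stand-in that only mimics the "return negative if the output does not fit in dstCapacity" contract of LZ4_decompress_safe, and decode_small_blocks and remaining_capacity are illustrative names. Only output, output_ptr and output_max_len are taken from the description above.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for LZ4_decompress_safe: it "decompresses" by copying
// src_len bytes and returns a negative value when they exceed dst_capacity.
int fake_decompress_safe(const char* src, char* dst, int src_len, int dst_capacity) {
    if (src_len > dst_capacity) return -1;
    if (src_len > 0) std::memcpy(dst, src, static_cast<size_t>(src_len));
    return src_len;
}

// Space left in [output, output + output_max_len), measured from the current
// write pointer rather than from the start of the buffer.
size_t remaining_capacity(const char* output, const char* output_ptr, size_t output_max_len) {
    return output_max_len - static_cast<size_t>(output_ptr - output);
}

// Decodes each small block at the advancing output_ptr. Returns false as soon
// as a block would not fit in the space that is actually left.
bool decode_small_blocks(const std::vector<std::vector<char>>& blocks,
                         char* output, size_t output_max_len, size_t* written) {
    char* output_ptr = output;
    for (const auto& block : blocks) {
        // Recomputed for every small block. Computing this once before the
        // loop would hand later blocks a capacity that is too large.
        size_t remaining = remaining_capacity(output, output_ptr, output_max_len);
        int n = fake_decompress_safe(block.data(), output_ptr,
                                     static_cast<int>(block.size()),
                                     static_cast<int>(remaining));
        if (n < 0) return false;  // the real code would return InvalidArgument here
        output_ptr += n;
    }
    *written = static_cast<size_t>(output_ptr - output);
    return true;
}
```

With an 8-byte buffer, two 3-byte blocks decode normally, while two 5-byte blocks are rejected at the second block because only 3 bytes remain. Under the stale capacity, that second block would have been offered 8 bytes starting at offset 5.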
