pchintar opened a new issue, #9801:
URL: https://github.com/apache/arrow-rs/issues/9801

   ## Description
   
   The IPC format specifies that compressed buffers are encoded as:
   
   > `[8 bytes uncompressed length] + compressed data` 
   
   The current implementation assumes this invariant when reading the prefix 
during decompression.
   
   However, in the reader path, buffers are constructed from metadata 
(`offset`, `length`) and passed to the decompression logic without validating 
that they contain at least the required 8-byte prefix.
   
   This means that malformed or truncated IPC input can reach the decompression 
path with buffers shorter than 8 bytes.
   
   ---
   
   ## Problem
   
   The decompression logic reads the prefix unconditionally:
   
   ```rust
   let decompressed_length = read_uncompressed_size(input);
   ```
   
   Without a length check, reading the prefix from a buffer shorter than 
8 bytes can panic.
   
   This creates a mismatch between:
   
   * Format expectation: buffers include an 8-byte prefix
   * Runtime behavior: no validation that this invariant holds
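
   The failure mode can be sketched in isolation. The snippet below is a 
minimal reproduction, not the arrow-rs source: `read_uncompressed_size_unchecked` 
and `prefix_read_panics` are hypothetical names, and the body assumes the real 
helper slices the first 8 bytes and decodes them as a little-endian `i64`.

   ```rust
   use std::panic;

   /// Hypothetical stand-in for an unchecked prefix read (assumed shape,
   /// not copied from arrow-rs): take the first 8 bytes, decode as LE i64.
   fn read_uncompressed_size_unchecked(buffer: &[u8]) -> i64 {
       let mut prefix = [0u8; 8];
       // Panics when `buffer.len() < 8`: the range `..8` is out of bounds.
       prefix.copy_from_slice(&buffer[..8]);
       i64::from_le_bytes(prefix)
   }

   /// Returns true if reading the prefix from `buffer` panics.
   fn prefix_read_panics(buffer: &[u8]) -> bool {
       panic::catch_unwind(|| read_uncompressed_size_unchecked(buffer)).is_err()
   }
   ```

   Under these assumptions, a 3-byte buffer built from malformed metadata 
triggers the out-of-bounds slice, while any buffer of 8 bytes or more does not.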
   
   ---
   
   ## Comparison
   
   Other components in this repository (e.g., Parquet) validate input more 
carefully before parsing compressed data, including:
   
   * checking input length before reading fixed-size prefix fields
   * validating compressed and uncompressed sizes before decompression
   
   This ensures malformed input is handled with errors rather than panics.
   
   ---
   
   ## Expected Behavior
   
   The IPC decompression path should validate that the input buffer contains at 
least 8 bytes before reading the prefix, and return an `ArrowError` otherwise.
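
   One possible shape for the check is sketched below. It is an illustration 
under stated assumptions, not a proposed patch: `IpcError` is a hypothetical 
stand-in for `ArrowError` so the snippet compiles without the arrow crates, 
and `read_uncompressed_size_checked` is a hypothetical name.

   ```rust
   /// Hypothetical stand-in for `ArrowError`, used only to keep the sketch
   /// self-contained.
   #[derive(Debug, PartialEq)]
   enum IpcError {
       InvalidBuffer(String),
   }

   /// Size of the uncompressed-length prefix described in the issue.
   const LENGTH_PREFIX_BYTES: usize = 8;

   /// Reads the 8-byte little-endian length prefix, returning an error
   /// instead of panicking when the buffer is too short.
   fn read_uncompressed_size_checked(input: &[u8]) -> Result<i64, IpcError> {
       if input.len() < LENGTH_PREFIX_BYTES {
           return Err(IpcError::InvalidBuffer(format!(
               "compressed buffer has {} bytes, expected at least {}",
               input.len(),
               LENGTH_PREFIX_BYTES
           )));
       }
       let mut prefix = [0u8; LENGTH_PREFIX_BYTES];
       // Safe: the length was checked above.
       prefix.copy_from_slice(&input[..LENGTH_PREFIX_BYTES]);
       Ok(i64::from_le_bytes(prefix))
   }
   ```

   With this shape, a truncated buffer surfaces as an `Err` that the reader 
can propagate, and well-formed buffers decode exactly as before.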

