JingsongLi commented on code in PR #364:
URL: https://github.com/apache/paimon-rust/pull/364#discussion_r3384926724


##########
crates/paimon/src/spec/binary_row.rs:
##########
@@ -92,7 +92,29 @@ impl BinaryRow {
             });
         }
         let arity = i32::from_be_bytes([data[0], data[1], data[2], data[3]]);
-        Ok(Self::from_bytes(arity, data[4..].to_vec()))
+        if arity < 0 {
+            return Err(crate::Error::UnexpectedError {
+                message: format!("BinaryRow: serialized data has negative arity: {arity}"),
+                source: None,
+            });
+        }
+        let body = &data[4..];
+        // The body must hold at least the null bitmap and the fixed part
+        // (8 bytes per field); reject truncated input rather than panicking
+        // later when reading the null bitmap or a field. The size is computed
+        // in i64 so an absurd arity in malformed input cannot overflow.
+        let bit_set_width = ((arity as i64 + 63 + Self::HEADER_SIZE_IN_BYTES as i64) / 64) * 8;
+        let fix_part_size = bit_set_width + 8 * arity as i64;
+        if (body.len() as i64) < fix_part_size {
+            return Err(crate::Error::UnexpectedError {
+                message: format!(
+                    "BinaryRow: serialized body too short for arity {arity}: {} bytes, need at least {fix_part_size}",
+                    body.len()
+                ),
+                source: None,
+            });
+        }
+        Ok(Self::from_bytes(arity, body.to_vec()))

Review Comment:
   One more hardening edge case: this validates the required body size in
`i64`, but once the check passes, `from_bytes` recomputes
`null_bits_size_in_bytes` via `cal_bit_set_width_in_bytes(arity)`, which still
does its arithmetic in `i32`. For a corrupt row with an extremely large
positive arity and a sufficiently large body, that unchecked helper can still
overflow or panic. Could we either cap `arity` before this point, or avoid
recomputing the width with the `i32` helper?
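
   A minimal sketch of the second option, for illustration only. The name
`checked_bit_set_width`, the `String` error type, and the header constant value
of 8 are assumptions made for this example and are not taken from the PR. The
idea is to compute the width once in `i64`, validate it against the body
length, and hand the already-validated width onward instead of calling the
`i32` helper again:

```rust
/// Assumed stand-in for `BinaryRow::HEADER_SIZE_IN_BYTES`; the value 8 is an
/// assumption made for this sketch.
const HEADER_SIZE: i64 = 8;

/// Hypothetical helper: validates `arity` against the body length and returns
/// the null-bitmap width in bytes. All arithmetic is done in i64, so even
/// `i32::MAX` cannot overflow here.
fn checked_bit_set_width(arity: i32, body_len: usize) -> Result<usize, String> {
    if arity < 0 {
        return Err(format!("negative arity: {arity}"));
    }
    let arity = i64::from(arity);

    // Same formula as in the diff, widened to i64.
    let bit_set_width = ((arity + 63 + HEADER_SIZE) / 64) * 8;
    // Null bitmap plus 8 bytes of fixed-length storage per field.
    let fix_part_size = bit_set_width + 8 * arity;

    let body_len = i64::try_from(body_len)
        .map_err(|_| "body length does not fit in i64".to_string())?;
    if body_len < fix_part_size {
        return Err(format!(
            "body too short for arity {arity}: {body_len} bytes, need at least {fix_part_size}"
        ));
    }

    // Return the validated width so the caller does not recompute it in i32.
    usize::try_from(bit_set_width)
        .map_err(|_| "bit set width does not fit in usize".to_string())
}
```

   With something like this, the deserialization path could pass the returned
width into the row constructor directly, so the `i32` helper would only ever
see arities that have already been bounded by a real buffer length.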


