tustvold opened a new issue, #1813:
URL: https://github.com/apache/arrow-rs/issues/1813

   **Describe the bug**
   
   The validation logic added in #1767 is incorrect. In particular:
   
   ```
   let values_buffer = &self.buffers[0];
   
   for pos in 0..values_buffer.len() {
       let raw_val = unsafe {
           std::slice::from_raw_parts(
               values_buffer.as_ptr().add(pos),
               16_usize,
           )
       };
       let value = i128::from_le_bytes(raw_val.try_into().unwrap());
       validate_decimal_precision(value, *p)?;
   }
   ```
   
   This reads 16-byte slices at 1-byte intervals from the underlying buffer of 
decimal data. As a result, it will:
   
   * Interpret data at positions that are not value boundaries (avoiding UB 
largely by accident)
   * Read beyond the bounds of values_buffer
   * Ignore any offset on the array
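
For comparison, here is a minimal sketch of a loop that avoids the three problems listed above. It is not the actual arrow-rs fix: `validate_decimal_values` and `check_precision` are hypothetical names, `check_precision` only stands in for `validate_decimal_precision`, and the sketch assumes the values buffer holds little-endian 16-byte values, with `offset` and `len` counted in values rather than bytes.

```rust
/// Hypothetical stand-in for `validate_decimal_precision` (assumption: a
/// decimal of precision `p` may hold values in -(10^p - 1)..=(10^p - 1)).
fn check_precision(value: i128, precision: u32) -> Result<(), String> {
    let max = 10_i128.pow(precision) - 1;
    if value > max || value < -max {
        return Err(format!(
            "{} does not fit in a Decimal of precision {}",
            value, precision
        ));
    }
    Ok(())
}

/// Validates `len` decimal values starting at logical index `offset`.
/// Both `offset` and `len` are counted in values, not bytes.
fn validate_decimal_values(
    buffer: &[u8],
    offset: usize,
    len: usize,
    precision: u32,
) -> Result<(), String> {
    const WIDTH: usize = 16; // size of one i128 value in bytes

    // Convert the logical range into a byte range, guarding against overflow.
    let start = offset
        .checked_mul(WIDTH)
        .ok_or_else(|| "offset overflows usize".to_string())?;
    let end = offset
        .checked_add(len)
        .and_then(|n| n.checked_mul(WIDTH))
        .ok_or_else(|| "offset + len overflows usize".to_string())?;

    // Bounds-checked slicing: a too-short buffer is an error, not an
    // out-of-bounds read.
    let bytes = buffer
        .get(start..end)
        .ok_or_else(|| "values buffer is too small".to_string())?;

    // Step one whole value (16 bytes) at a time.
    for chunk in bytes.chunks_exact(WIDTH) {
        let mut raw = [0_u8; WIDTH];
        raw.copy_from_slice(chunk);
        check_precision(i128::from_le_bytes(raw), precision)?;
    }
    Ok(())
}
```

With a buffer holding 10000 and 20000, this accepts `offset = 0, len = 2` at precision 10 and returns an error for `offset = 1, len = 2`, because that range extends past the end of the buffer.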
   
   **To Reproduce**
   
   ```
   #[test]
   fn test_decimal_validation() {
       let mut builder = DecimalBuilder::new(4, 10, 4);
       builder.append_value(10000).unwrap();
       builder.append_value(20000).unwrap();
       let array = builder.finish();
   
       array.data().validate_full().unwrap();
   }
   ```
   
   Fails with
   
   ```
   Invalid argument error: 42535295865117307932921825928971026471 is too large 
to store in a Decimal of precision 10. Max is 9999999999"
   ```
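
The value in the error message can be reproduced with the standard library alone. The sketch below assumes the values buffer is simply the two appended values encoded as little-endian `i128`; `values_buffer` and `read_at_byte_position` are hypothetical helpers, not arrow-rs APIs. Unlike the faulty loop, the read goes through a bounds-checked slice.

```rust
/// Builds the assumed little-endian values buffer for 10000 and 20000.
fn values_buffer() -> Vec<u8> {
    let mut buffer = Vec::with_capacity(32);
    for value in [10000_i128, 20000_i128] {
        buffer.extend_from_slice(&value.to_le_bytes());
    }
    buffer
}

/// Reads 16 bytes starting at byte position `pos`, as the faulty loop does,
/// but returns `None` instead of reading past the end of the buffer.
fn read_at_byte_position(buffer: &[u8], pos: usize) -> Option<i128> {
    let bytes = buffer.get(pos..pos.checked_add(16)?)?;
    let mut raw = [0_u8; 16];
    raw.copy_from_slice(bytes);
    Some(i128::from_le_bytes(raw))
}
```

At `pos = 0` this yields 10000. At `pos = 1` the slice starts with the high byte of 10000 (0x27 = 39) and ends with the low byte of 20000 (0x20 = 32), giving 32 * 2^120 + 39 = 42535295865117307932921825928971026471, the value reported above. From `pos = 17` onward the 16-byte window no longer fits in the 32-byte buffer, which is where the original loop reads out of bounds.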
   
   **Expected behavior**
   
   Validation should not fail for valid data.
   
   **Additional context**
   
   This was discovered by DataFusion's test suite for Decimals
   

