jorgecarleitao opened a new pull request #9492:
URL: https://github.com/apache/arrow/pull/9492


   This PR is derived from the observation that creating arrays from 
`ArrayData` can result in undefined behavior. This is because we:
   
   1. transmute buffers to whatever type the concrete arrays are derived from;
   2. rely on `ArrayData::len` to create slices (in primitive arrays) and check 
bounds (in arrays with offset buffers).
   
   Example:
   
   ```rust
       #[test]
       fn test_bla() {
           let data = ArrayData::new(DataType::Int32, 1000, None, None, 0, 
vec![Buffer::from(&[1i32])], vec![]);
           let array = unsafe { make_array(Arc::new(data)) };
   
           let array = array.as_any().downcast_ref::<Int32Array>().unwrap();
   
           let slice = array.values();
           println!("{}", slice[150]); // we never allocated this...
       }
   ```
   
   Therefore, the function `make_array` must be marked as `unsafe`.
   
   Note that all `from(ArrayDataRef)` in concrete implementations are also 
`unsafe` and not marked as such (e.g. Int32Array::from(ArrayDataRef)`. Since 
they are implemented via the trait `From`, which does not support the `unsafe` 
keyword on it, we will need to re-name all those functions accordingly (as they 
collide with another function `from`). That is a larger change.
   
   The correct way of addressing this atm is to perform all the validations 
necessary on `From<ArrayData>`.
   
   
   Anyone relying on `make_array` and `From<>` can unknowingly cause out of 
bound reads, which is a typical source of an escalation of privileges via reads 
to the applications' memory, as the example above shows.
   
   @alamb , my suggestion here is that we start by marking any method currently 
`unsafe` as such, so that we fix problem 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to