Evian-Zhang opened a new issue, #10281: URL: https://github.com/apache/arrow-rs/issues/10281
### Describe the bug Hello, we are security researchers targeting Rust security. By running our tool in this repository, we found several soundness issues. **This report is written by 100% human**. We promise that all you read will never be generated by LLM. Moreover, we are responsible, so if you confirm these issues do exist, we are happy to file PRs to fix it. # Issue 1: `arrow-buffer`'s `MutableBuffer::into_buffer()` violates exception safety, leading to double-free Sources are located here: https://github.com/apache/arrow-rs/blob/d9919194eef5ecca7f3b42726359a820825b30e6/arrow-buffer/src/buffer/mutable.rs#L536-L546 If the mutex `self.reservation` was poisoned before, then the `unwrap` will leading to panic. However, The `mem::forget` has not yet been invoked, so both `bytes` and `self`'s drop implementation will be invoked when unwinding, and these two variable owns the same memory region, leading to a double free (both `Bytes` and `MutableBuffer`'s `drop` implementation involves `std::alloc::dealloc`) To fix it, I think we can just use `if let Some` instead of `unwrap` to the mutex. # Issue 2: `arrow-buffer`'s `Buffer::shrink_to_fit()` violates exception safety, leading to use-after-free Sources are located here: https://github.com/apache/arrow-rs/blob/d9919194eef5ecca7f3b42726359a820825b30e6/arrow-buffer/src/buffer/immutable.rs#L194-L227 https://github.com/apache/arrow-rs/blob/d9919194eef5ecca7f3b42726359a820825b30e6/arrow-buffer/src/bytes.rs#L131-L178 In `Buffer::shrink_to_fit`, it will invokes `Bytes::try_realloc` to reallocate memories, and update the `ptr` if the pointer is updated. In `Bytes::try_realloc`, **after** the memory hold by `Bytes` is reallocated, `self.resize_reservation()` will be invoked. `Bytes::resize_reservation` will further invokes user-defined custom reservation function. If such reservation function panics, in view of buffer, the memory is in fact reallocated, but `self.ptr` has not updated yet. If user uses `catch_unwind` to recover from unwinding, methods such as `Buffer::as_slice` will let user access freed-memories, leading to use-after-free. I can't come up with a graceful simple solution to fix it. For me, there are three approaches: * Declare `shrink_to_fit` as unsafe, and requires the memory pool should not panic in the `SAFETY` requirement * Modify `Bytes::try_realloc` to accept a callback function, which will be invoked after the reallocating and before calling `self.resize_reservation()` * Before calling `Bytes::try_realloc`, let `self.ptr` points to a static valid memory buffer. If `try_realloc` succeeds, `self.ptr` will still be updated correctly; If it panics, at least there will be no memory-related issues. # Issue 3 (may be): The SAFETY documentation of `arrow-buffer`'s `BooleanBufferBuilder::extend_trusted_len` is not accurate, potentially violating exception safety Sources are located here: https://github.com/apache/arrow-rs/blob/d9919194eef5ecca7f3b42726359a820825b30e6/arrow-buffer/src/builder/boolean.rs#L330-L343 https://github.com/apache/arrow-rs/blob/d9919194eef5ecca7f3b42726359a820825b30e6/arrow-buffer/src/buffer/mutable.rs#L705-L838 `BooleanBufferBuilder::extend_trusted_len` will call `MutableBuffer::extend_bool_trusted_len`. In this function, there will be firstly a `set_len` (line 746), and then invokes `iter.next` to fill those slots. If `I::next` panics, then there will be a typical exception safety violation. The SAFETY requirement of these two functions are "Callers must ensure that `iter` reports an exact size via `size_hint`." I think maybe this is a little bit inaccurate, although this can implicitly indicate that `I::next` must never panic. Maybe we should explicitly write this requirement in the SAFETY documentation? ### To Reproduce # Issue 1 1. User create a `MaliciousMemoryPool` that implements `arrow_buffer::pool::MemoryPool`, which panics at `resize` function. 2. User register this pool with `MutableBuffer::claim()`, in which the `MutableBuffer` will hold this memory pool 3. User invokes `MutableBuffer::resize()` under `std::panics::catch_unwind()`. Although the `resize` will immediately panic, while the `catch_unwind` helps recover from panic, and user can access the `MutableBuffer` instance after panicking. However, the mutex `self.reservation` is poisoned here. 4. User invokes `MutableBuffer::into_buffer()`. Since the mutex is poisoned, the code will panic by `unwrap`. And during unwinding, the destructors of `MutableBuffer` and `Bytes` are automatically invoked, leading to UAF. # Issue 2 1. User create a `MaliciousMemoryPool` that implements `arrow_buffer::pool::MemoryPool`. The `reserve` function will create a malicious `MemoryReservation` implementor, which will panic at `resize`. 2. User register this pool with `Buffer::claim()`, in which the `self.data` will hold this memory reservation 3. User invokes `Buffer::shrink_to_fit()` under `std::panics::catch_unwind()` 4. User invokes `Buffer::as_slice()` after the panic is recovered ### Expected behavior Be safe and sound, even in context of exception safety. ### Additional context _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
