viirya opened a new issue, #5896:
URL: https://github.com/apache/arrow-rs/issues/5896

   **Describe the bug**
   
   When slicing a string array, we advance the pointer into the offset buffer 
and leave the data buffer pointer unchanged. When exporting the slice through 
FFI, we simply export the advanced offset buffer pointer.
   
   When importing the array, we calculate the length of the data buffer as the 
difference between the last and first offsets in the (sliced) offset buffer. 
This calculated length is incorrect, because the offsets still index into the 
unchanged data buffer.
   
   For example, suppose the original string array's data buffer is 346536 bytes 
long, so its last offset is 346536. We take a slice of 8192 strings from it, and 
the offsets of the slice are `[147456, ..., 294912]`. The calculated length is 
`294912 - 147456 = 147456`, but the actual length of the data buffer is 
`346536`. So the data buffer of the imported array has an incorrect length.
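The arithmetic above can be reproduced with a small, self-contained model in plain Rust. This is only a sketch, not the arrow-rs import code: `length_from_offset_difference`, `length_needed_for_absolute_offsets`, and `example_sliced_offsets` are hypothetical helpers, and the 18-byte string width is an assumption chosen only because 8192 × 18 = 147456 matches the offsets in the example.

```rust
/// Length derived as "last offset minus first offset" of the sliced
/// offset buffer, as described above. Hypothetical helper, not an
/// arrow-rs function.
fn length_from_offset_difference(offsets: &[i32]) -> usize {
    match (offsets.first(), offsets.last()) {
        (Some(first), Some(last)) => (last - first) as usize,
        _ => 0,
    }
}

/// Minimum data buffer length the sliced offsets actually require.
/// The data pointer was not moved, so the offsets are still absolute
/// positions and the buffer must reach at least the last offset.
fn length_needed_for_absolute_offsets(offsets: &[i32]) -> usize {
    offsets.last().map_or(0, |last| *last as usize)
}

/// Offsets of an 8192-string slice starting at byte 147456.
/// Assumption: every string is 18 bytes long.
fn example_sliced_offsets() -> Vec<i32> {
    (0..=8192).map(|i| 147_456 + 18 * i).collect()
}

fn main() {
    let offsets = example_sliced_offsets();
    let computed = length_from_offset_difference(&offsets);
    let needed = length_needed_for_absolute_offsets(&offsets);
    println!("computed = {computed}, needed at least = {needed}");
}
```

In this model the computed length is only half of what the last offset requires, and both are smaller than the real 346536-byte buffer.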
   
   This hasn't caused issues so far because most of the time we access the 
imported data buffer through pointers (and we don't actually check the range). 
But in cases where we access the data as a slice (i.e., with `[]` indexing), it 
causes a runtime panic like:
   
   ```
   ---- ffi::tests_from_ffi::test_extend_imported_string_slice stdout ----
   thread 'ffi::tests_from_ffi::test_extend_imported_string_slice' panicked at 
arrow-data/src/transform/variable_size.rs:38:29:
   range end index 10890 out of range for slice of length 5500
   ``` 
   
   Note that `test_extend_imported_string_slice` is a new test I added in #5895.
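The difference between unchecked pointer access and slice indexing can be illustrated with a minimal model. This is a sketch in plain Rust, not the code in `variable_size.rs`: `value_bytes` is a hypothetical helper, and the offsets and buffer sizes are made up to mirror the numbers in the panic message. It uses `get` instead of `[]`, so the out-of-range case shows up as `None` rather than a panic.

```rust
/// Returns the bytes of value `i`, treating `offsets` as absolute
/// positions into `data`. Access is bounds-checked, so a data slice
/// that is too short yields `None` where `data[start..end]` would panic.
/// Hypothetical helper, not an arrow-rs function.
fn value_bytes<'a>(data: &'a [u8], offsets: &[i32], i: usize) -> Option<&'a [u8]> {
    let start = *offsets.get(i)? as usize;
    let end = *offsets.get(i + 1)? as usize;
    data.get(start..end)
}

fn main() {
    // Stand-in for the real data buffer, which is longer than the last offset.
    let full = vec![b'a'; 12_000];
    // Made-up sliced offsets: last - first = 10890 - 5390 = 5500.
    let offsets = [5_390, 8_140, 10_890];
    // A slice whose length was derived as last - first.
    let truncated = &full[..5_500];

    // Range end 10890 is out of range for a slice of length 5500.
    assert!(value_bytes(truncated, &offsets, 1).is_none());
    // With the full buffer the same value is readable.
    assert_eq!(value_bytes(&full, &offsets, 1).map(|v| v.len()), Some(2_750));
}
```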
   
   **To Reproduce**
   
   **Expected behavior**
   
   **Additional context**


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
