fenfeng9 opened a new pull request, #50166:
URL: https://github.com/apache/arrow/pull/50166

   ### Rationale for this change
   
   Casting to `binary_view` or `string_view` could leave a null variadic 
buffer slot when all values were inline. This could happen for casts from 
`binary`, `large_binary`, `string`, `large_string`, and `fixed_size_binary`.
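
For context on what "inline" means here: in the Arrow view layout, each value is described by a 16-byte view struct, and values of at most 12 bytes are stored inside that struct rather than in a variadic data buffer. The sketch below packs such structs in plain Python. The names `encode_view` and `is_all_inline` are made up for illustration and are not Arrow APIs.

```python
import struct

INLINE_MAX = 12  # values up to 12 bytes fit inside the 16-byte view struct


def encode_view(value: bytes, buffer_index: int = 0, offset: int = 0) -> bytes:
    """Pack one 16-byte view struct (little-endian). Hypothetical helper."""
    if len(value) <= INLINE_MAX:
        # Inline form: int32 length, then the data zero-padded to 12 bytes.
        return struct.pack("<i12s", len(value), value)
    # Out-of-line form: int32 length, 4-byte prefix, int32 buffer index,
    # int32 offset into that variadic buffer.
    return struct.pack("<i4sii", len(value), value[:4], buffer_index, offset)


def is_all_inline(values) -> bool:
    """True when no value needs a variadic data buffer."""
    return all(len(v) <= INLINE_MAX for v in values)
```

When `is_all_inline` holds for an array, no view refers to a variadic buffer, which is the case in which the casts could leave a null slot behind.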
   
   The C Data Interface exporter reads every variadic buffer to get its size, 
including slots that no view refers to. Because of that, exporting such an 
array could crash, for example through PyArrow `_export_to_c`.
   
   Validation did not catch these arrays either. For all-inline view arrays, 
validation never needed to read an out-of-line data buffer, so the null slot 
went unnoticed.
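
The mismatch between the two code paths can be modelled in a few lines. This is a simplified sketch, not the Arrow implementation: views are represented as hypothetical `(length, buffer_index, offset)` tuples, a null slot is modelled as `None`, and both function names are invented for the example.

```python
INLINE_MAX = 12


def validate_views(views, variadic_buffers):
    """Model of the pre-fix check: only out-of-line views touch a data buffer."""
    for length, buffer_index, offset in views:
        if length <= INLINE_MAX:
            continue  # inline value: no variadic buffer is read
        buf = variadic_buffers[buffer_index]
        if buf is None or offset + length > len(buf):
            return False
    return True


def variadic_buffer_sizes(variadic_buffers):
    """Model of the export step: every slot is read for its size."""
    return [len(buf) for buf in variadic_buffers]


views = [(3, 0, 0), (5, 0, 0)]  # all values are inline
buffers = [None]                # a null variadic buffer slot

passed = validate_views(views, buffers)
try:
    variadic_buffer_sizes(buffers)
    export_failed = False
except TypeError:
    # len(None) raises in Python; the C++ analogue is a null dereference.
    export_failed = True
```

In this model `passed` is true while the size computation fails, which mirrors how an array could validate cleanly and still break on export.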
   
   ### What changes are included in this PR?
   
   This PR fixes the cast kernels so all-inline view arrays do not keep a null 
variadic buffer slot.
   
   It also makes validation reject null variadic buffer slots, and makes C Data 
export return an error instead of crashing.
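
A minimal sketch of the two defensive checks described above, again as a plain-Python model rather than the actual C++ change. The function names and the error message are assumptions made for illustration.

```python
def check_variadic_buffers(variadic_buffers):
    """Return an error string for the first null slot, or None if none is null."""
    for i, buf in enumerate(variadic_buffers):
        if buf is None:
            return f"variadic buffer #{i} is null"
    return None


def export_sizes(variadic_buffers):
    """Return (sizes, error) and never read the size of a null slot."""
    error = check_variadic_buffers(variadic_buffers)
    if error is not None:
        return None, error
    return [len(buf) for buf in variadic_buffers], None
```

With this shape, a caller receives an error value for a malformed array instead of the process crashing, and the same check can back a validation routine.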
   
   C++ and Python regression tests cover the cast, validation, and export paths.
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   No.
   
   **This PR contains a "Critical Fix".** Exporting an all-inline view array 
through the C Data Interface could crash the process while using only public 
APIs.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
