tustvold commented on PR #6616: URL: https://github.com/apache/arrow-rs/pull/6616#issuecomment-2432794522
> For example, you try to modify the indices array after calling take. In some cases you can, because the array is not cloned, but in other cases you cannot, because the array is cloned.
>
> I don't think this is good design for a kernel behavior.

The contract of a kernel should be on the semantic value of the output; this leaves kernels free to implement physical layout optimisations such as this. I don't believe we ever document or articulate any contract on how various kernels should behave w.r.t. their inputs. Doing so would be not only very restrictive but also extremely fragile.

Take is far from the only kernel that will simply clone input buffers, especially if one considers types like DictionaryArray, where the underlying dictionary may or may not be recomputed depending on heuristics.

From the take kernel's perspective, it has no way to know that you want to reuse the null buffer it was given, so the correct thing is for it not to create a fresh allocation unnecessarily. If code then wants to reuse the buffer, it can try to, falling back to performing the allocation if the buffer turns out to be shared. This is safe, sound and optimal.
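
To make the "try to reuse, fall back to allocating" pattern concrete, here is a minimal sketch using only the standard library. It is not arrow-rs code: `Arc<Vec<u8>>` stands in for a reference-counted buffer, and `kernel_output`, `into_owned` and the `copy` flag are hypothetical names made up for illustration. In arrow-rs itself the analogous step would presumably be a fallible conversion such as `Buffer::into_mutable`, but that is an assumption about what the caller would use, not something this PR prescribes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a kernel that is free to return its input
/// buffer unchanged instead of copying it. The `copy` flag models an
/// internal heuristic that callers cannot observe or rely on.
fn kernel_output(input: &Arc<Vec<u8>>, copy: bool) -> Arc<Vec<u8>> {
    if copy {
        // Fresh allocation: the output does not alias the input.
        Arc::new(input.to_vec())
    } else {
        // Zero-copy: only the reference count is bumped.
        Arc::clone(input)
    }
}

/// Caller-side reuse: take ownership of the buffer without allocating
/// when this is the only reference, otherwise fall back to a copy.
fn into_owned(buf: Arc<Vec<u8>>) -> Vec<u8> {
    Arc::try_unwrap(buf).unwrap_or_else(|shared| shared.to_vec())
}
```

The caller never needs to know which branch the kernel took: `into_owned` yields a mutable, uniquely owned buffer either way, and it only pays for an allocation when the data is still shared.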
