alamb opened a new issue, #6961:
URL: https://github.com/apache/arrow-datafusion/issues/6961

   Basically the `Unnest` exec plan could be made faster if we reduced some 
copies. Here is the basic idea in case anyone wants to do that:
   
   Thanks @vincev -- what I was confused about is that if I look at this 
description:
   
   ```
       // Create an array with the unnested values of the list array, given the
       // list array:
       //
       //   [1], null, [2, 3, 4], null, [5, 6]
       //
       // the result array is:
       //
       //   1, null, 2, 3, 4, null, 5, 6
       //
       let unnested_array = unnest_array(list_array)?;
   ```
   
   This looks very much the same to me as calling `list_array.values()` to get 
access to the underlying values: 
https://docs.rs/arrow/latest/arrow/array/struct.GenericListArray.html#method.values
   
   In this case the values array would be more like
   
   ```
   [1, 2, 3, 4, 5, 6]
   ```
   
   And the offsets of the list array would be like (I think):
   
   ```
   [0, 1, 1, 4, 4, 6]
   ```
   
   With a null mask showing that the second and fourth elements are null.
   
   
   
   So I was thinking you could calculate the take indices directly from the 
offsets / nulls without having to copy all the values out of the underlying 
array
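
A minimal sketch of that idea, written against plain slices rather than the real arrow types so it stays self-contained. The function name `unnest_take_indices` and the `&[bool]` validity representation are hypothetical, chosen only for illustration. It also assumes that a null list produces exactly one null output row and that an empty but valid list produces no rows, which may not match what the exec plan actually wants.

```rust
/// Build `take` indices into the child values array from list offsets and
/// validity (hypothetical helper, not part of arrow or DataFusion).
///
/// `offsets` has one more entry than there are lists; list `i` covers
/// `values[offsets[i]..offsets[i + 1]]`. `validity[i]` is `false` when
/// list `i` is null.
///
/// Panics if `offsets.len() != validity.len() + 1`.
fn unnest_take_indices(offsets: &[i32], validity: &[bool]) -> Vec<Option<usize>> {
    assert_eq!(offsets.len(), validity.len() + 1);

    let mut indices = Vec::new();
    for (row, window) in offsets.windows(2).enumerate() {
        if validity[row] {
            // Valid list: emit one index per child value, no values copied.
            let (start, end) = (window[0] as usize, window[1] as usize);
            indices.extend((start..end).map(Some));
        } else {
            // Null list: assumed to unnest to a single null row.
            indices.push(None);
        }
    }
    indices
}
```

For the example above, offsets `[0, 1, 1, 4, 4, 6]` with the second and fourth lists null give the indices `0, null, 1, 2, 3, null, 4, 5`. Applying those to the values `[1, 2, 3, 4, 5, 6]` reproduces `1, null, 2, 3, 4, null, 5, 6`. In real code the indices would presumably be built as an arrow array and handed to the `take` kernel together with `list_array.values()`.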
   
   _Originally posted by @alamb in 
https://github.com/apache/arrow-datafusion/pull/6903#discussion_r1262610832_
               

