rluvaton opened a new issue, #7992:
URL: https://github.com/apache/arrow-rs/issues/7992

   Slicing a variable-length array such as list/map/binary does not slice the 
underlying values; for performance reasons, it only slices the offsets.
   
   However, when you want the underlying values of a list/map/binary, 
calling `values` (on a list, for example) gives you the entire child values 
array, not just the values the sliced list refers to.
   
   I've lost count of the number of times I saw or had bugs caused by getting 
all the child values instead of the actual values that the list points to.
   
   This is an example where it's counterintuitive:
   
   ```rust
   let list = ListArray::from_iter_primitive::<Int32Type, _, _>(vec![
       Some(vec![Some(1), Some(2)]),
       None,
       Some(vec![Some(3), None, Some(5)]),
   ]);
   
   let list = list.slice(1, 2);
   
   // [null, [3, null, 5]]
   println!("{:?}", list);
   
   // [1, 2, 3, null, 5]
   println!("{:?}", list.values());
   ```  
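
   For comparison, here is a minimal sketch of how the values a sliced list 
actually points to can be recovered today, by narrowing `values()` to the 
window covered by the list's offsets. `referenced_values` is a hypothetical 
helper name used only for illustration; it is not part of arrow-rs.

   ```rust
   use arrow::array::{Array, ArrayRef, ListArray};
   use arrow::datatypes::Int32Type;

   /// Hypothetical helper (not an arrow-rs API): returns only the child
   /// values that fall between the first and last offset of `list`.
   fn referenced_values(list: &ListArray) -> ArrayRef {
       // `value_offsets()` reflects the slice and always has `len + 1` entries.
       let offsets = list.value_offsets();
       let start = offsets[0] as usize;
       let end = offsets[offsets.len() - 1] as usize;
       // Zero-copy slice of the child array.
       list.values().slice(start, end - start)
   }

   fn main() {
       let list = ListArray::from_iter_primitive::<Int32Type, _, _>(vec![
           Some(vec![Some(1), Some(2)]),
           None,
           Some(vec![Some(3), None, Some(5)]),
       ]);
       let sliced = list.slice(1, 2);

       // `values()` still exposes all five child values.
       assert_eq!(sliced.values().len(), 5);

       // Only three child values belong to the sliced list.
       let values = referenced_values(&sliced);
       assert_eq!(values.len(), 3);
       println!("{:?}", values);
   }
   ```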
   
   We added comments on list `values` that mention that `The list array may not 
refer to all values in the `values` array ...`, but this is still not enough IMO.
   
   Also, creating 2 functions that explicitly state what is returned will force 
the developer to think about the cost (for example, when casting a list to 
string, you should cast only the sliced underlying values rather than cast the 
entire underlying values and then slice).
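
   To make the cost argument concrete, here is a rough sketch using a toy 
stand-in for a list array built only on the standard library. `ToyList`, 
`all_values` and `referenced_values` are made-up names for illustration; they 
are not arrow-rs types or proposed final names.

   ```rust
   use std::sync::Arc;

   /// Toy stand-in for a list array: `offsets` delimit each list in `values`.
   struct ToyList {
       values: Arc<Vec<i32>>,
       offsets: Vec<usize>,
   }

   impl ToyList {
       /// Slicing narrows only the offsets; the child buffer is shared as-is.
       fn slice(&self, start: usize, len: usize) -> ToyList {
           ToyList {
               values: Arc::clone(&self.values),
               offsets: self.offsets[start..=start + len].to_vec(),
           }
       }

       /// Hypothetical accessor: the whole child buffer, ignoring the slice.
       fn all_values(&self) -> &[i32] {
           &self.values
       }

       /// Hypothetical accessor: only the window this list points to.
       fn referenced_values(&self) -> &[i32] {
           let start = self.offsets[0];
           let end = self.offsets[self.offsets.len() - 1];
           &self.values[start..end]
       }
   }

   /// Stand-in for an expensive per-value cast.
   fn cast_to_string(values: &[i32]) -> Vec<String> {
       values.iter().map(|v| v.to_string()).collect()
   }

   fn main() {
       let list = ToyList {
           values: Arc::new(vec![1, 2, 3, 4, 5]),
           offsets: vec![0, 2, 2, 5],
       };
       let sliced = list.slice(1, 2);

       // Casting everything does work for values the slice never uses.
       assert_eq!(cast_to_string(sliced.all_values()).len(), 5);

       // Casting only the referenced window touches three values.
       assert_eq!(cast_to_string(sliced.referenced_values()), ["3", "4", "5"]);
   }
   ```

   With explicit names, the caller has to choose between the two accessors, 
which makes the extra work in the first call visible at the call site.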
   
   -----
   
   References to related changes/bugs:
   - https://github.com/apache/arrow-rs/pull/7037 - fix for a bug I had in 
concat of sliced list
   - https://github.com/apache/arrow-rs/issues/4409 - IPC slice
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
