Rich-T-kid commented on PR #10094:
URL: https://github.com/apache/arrow-rs/pull/10094#issuecomment-4671511320

   Yeah, I agree; I just wanted to give it an attempt. The main issue is that if the 
run_ends aren't rewritten, the logical array they express is incorrect. 
For example:
   ```
   // pseudocode; represents: [a,a,b,b,b,a,a,a,c,c]
   run_array = {run_ends: [2,5,8,10], values: ["a","b","a","c"]}
   // slice(offset = 3, length = 5) covers logical indices 3..8
   let sliced = run_array.slice(3, 5)
   // represents: [b,b,a,a,a]
   ```
   But in both arrays the run_ends are `[2,5,8,10]` and the values are 
`["a","b","a","c"]`. The logical length and logical offset are the only 
differences between the two. We also can't just naively cut up the values array; 
that would cause the run_ends buffer to misrepresent the correct logical form.
   I think it's fine to leave it as is: the swapping-values pattern is somewhat 
uncommon, and the performance boost should be negligible. 
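
   To make the point concrete, here is a minimal, self-contained sketch of the slicing behaviour described above. `RunArraySketch` and its fields are hypothetical stand-ins written for illustration, not the arrow-rs types, and the buffers are cloned here only for simplicity (a real zero-copy slice would share them). It shows that a slice changes only the logical offset and length, and gives one assumed example of how cutting the buffers without rebasing run_ends yields the wrong logical array.

```rust
/// Hypothetical, simplified model of a run-end encoded array.
/// This is NOT the arrow-rs `RunArray`; names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct RunArraySketch {
    run_ends: Vec<usize>,
    values: Vec<&'static str>,
    logical_offset: usize,
    logical_len: usize,
}

impl RunArraySketch {
    fn new(run_ends: Vec<usize>, values: Vec<&'static str>) -> Self {
        // The last run end is the total logical length.
        let logical_len = run_ends.last().copied().unwrap_or(0);
        Self { run_ends, values, logical_offset: 0, logical_len }
    }

    /// Slice without rewriting run_ends or values:
    /// only the logical offset and length change.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.logical_len, "slice out of bounds");
        Self {
            run_ends: self.run_ends.clone(),
            values: self.values.clone(),
            logical_offset: self.logical_offset + offset,
            logical_len: len,
        }
    }

    /// Expand to the logical sequence the array represents.
    fn to_logical(&self) -> Vec<&'static str> {
        (self.logical_offset..self.logical_offset + self.logical_len)
            .map(|i| {
                // Index of the first run whose end is strictly greater than i.
                let run = self.run_ends.partition_point(|&end| end <= i);
                self.values[run]
            })
            .collect()
    }
}

fn main() {
    let array = RunArraySketch::new(vec![2, 5, 8, 10], vec!["a", "b", "a", "c"]);
    let sliced = array.slice(3, 5);

    // Both arrays carry identical physical buffers.
    assert_eq!(sliced.run_ends, array.run_ends);
    assert_eq!(sliced.values, array.values);
    println!("sliced: {:?}", sliced.to_logical());

    // Assumed naive cut: keep only the overlapping runs, but do not
    // rebase run_ends. The run ends no longer match the logical positions.
    let naive = RunArraySketch {
        run_ends: vec![5, 8],
        values: vec!["b", "a"],
        logical_offset: 0,
        logical_len: 5,
    };
    println!("naive:  {:?}", naive.to_logical());
}
```

   In this sketch, a correctly rewritten slice would need run_ends `[2,5]` alongside values `["b","a"]`, which is the extra rewriting work discussed above.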

