[GitHub] [arrow-datafusion] smiklos commented on pull request #7002: avoid copying listarray in unset exec

via GitHub Thu, 20 Jul 2023 12:50:37 -0700


smiklos commented on PR #7002:
URL: 
https://github.com/apache/arrow-datafusion/pull/7002#issuecomment-1644510004


   > Thank you very much @smiklos -- this looks neat but I don't really 
understand how it is faster as it still calls `take` twice
   > 
   > It would be great if you could share your benchmark and result with us. If 
it is faster then I think this PR is good to go.
   > 
   > The test case in dataframe.rs for FixedSizeList is 👨‍🍳 👌
   > 
   > cc @vincev
   
   It calls `take` for each column. For the column being unnested, it can skip 
`take` entirely when there are no null values. (That is a special case, but it 
can speed up certain queries even more.)
   
   There could be another special case: if the fixed-size arrays have only one 
value per array and no nulls, there is no need to transform the data at all. 
Otherwise, for most cases I don't see how we can avoid `take`.
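
To make the point above concrete, here is a rough, std-only sketch of the idea. It is not the code in this PR and does not use the arrow crate: `ListColumn`, `take`, `unnest_indices` and `unnest` are hypothetical stand-ins, and null lists are assumed to produce a single null output row.

```rust
/// Simplified stand-in for a list column (assumption, not the arrow type):
/// `offsets[i]..offsets[i + 1]` is the slice of `values` that belongs to
/// row `i`, and `valid[i] == false` marks a null list.
struct ListColumn {
    offsets: Vec<usize>,
    values: Vec<i64>,
    valid: Vec<bool>,
}

/// Gathers `column[i]` for every index, in the spirit of a `take` kernel.
/// A `None` index produces a `None` (null) output slot.
fn take<T: Clone>(column: &[T], indices: &[Option<usize>]) -> Vec<Option<T>> {
    indices
        .iter()
        .map(|&i| i.map(|i| column[i].clone()))
        .collect()
}

/// Builds two index lists of equal length:
/// - row indices, used to repeat every other column, and
/// - value indices, used to flatten the list column itself.
fn unnest_indices(list: &ListColumn) -> (Vec<Option<usize>>, Vec<Option<usize>>) {
    let mut rows = Vec::new();
    let mut vals = Vec::new();
    for row in 0..list.valid.len() {
        if !list.valid[row] {
            // Assumed behaviour: a null list yields one output row with a null value.
            rows.push(Some(row));
            vals.push(None);
            continue;
        }
        for v in list.offsets[row]..list.offsets[row + 1] {
            rows.push(Some(row));
            vals.push(Some(v));
        }
    }
    (rows, vals)
}

/// Unnests `list` and repeats `other` to match the new row count.
fn unnest<'a>(
    list: &ListColumn,
    other: &[&'a str],
) -> (Vec<Option<i64>>, Vec<Option<&'a str>>) {
    let (rows, vals) = unnest_indices(list);

    // Every other column always needs a `take` to repeat its rows.
    let other_out = take(other, &rows);

    let unnested: Vec<Option<i64>> = if list.valid.iter().all(|v| *v) {
        // Fast path: with no null lists, the child values are already the
        // flattened column (assuming offsets start at 0 and cover all values),
        // so the second `take` can be skipped.
        list.values.iter().copied().map(Some).collect()
    } else {
        take(&list.values, &vals)
    };
    (unnested, other_out)
}
```

In this sketch the fast path still copies the values into a new `Vec`; with reference-counted arrays the child array could presumably be reused as-is. The single-element fixed-size case mentioned above would go one step further, since the row indices would then be the identity and the other columns would not need `take` either.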
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
