Re: [PR] arrow-select: add support for merging primitive dictionary values [arrow-rs]

via GitHub Wed, 21 May 2025 03:10:20 -0700


asubiotto commented on PR #7519:
URL: https://github.com/apache/arrow-rs/pull/7519#issuecomment-2897393095


   Thanks @tustvold, I wasn't aware of that PR.
   
   In our case we don't do any computations on these columns, but we care about 
the memory savings since we run in memory-constrained environments. Our data 
for these columns specifically has very few unique values (a recent measurement 
was 0.03%). Additionally, our schema is deeply nested and these columns are 
usually found in the leaves (within lists of structs), so the memory savings 
per batch are magnified. Granted, our idea was to have these be REE (run-end 
encoded), but that wasn't working for some reason (I can't remember why, but I 
should experiment when I have some time).
   
   Let me know how you'd like to proceed. I guess one point in favor of this PR 
is that if someone is using primitive dictionary batches it is likely that they 
value memory over perf.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
