Dandandan commented on PR #6800:
URL: 
https://github.com/apache/arrow-datafusion/pull/6800#issuecomment-1622272404

   > > The easiest approach would be storing elements in a `Vec` (as it may need to 
grow) or similar. We can mutate the original strings (instead of creating new 
ones for replacements) to keep the allocations a bit lower.
   > 
   > What about using the row format ?
   > 
   > We could store the current minimum for all groups in the same `Rows` 🤔 and 
track an index into that `Rows` for the current minimum for each group.
   > 
   > This would require an extra copy of the input values, but it could 
probably be vectorized pretty well
   > 
   > @tustvold any other clever thoughts on dynamically building up 
StringArrays to store the currently seen min/max?
   
   I agree that would be fast, but this comes at the cost of storing every seen 
value? How would we restrict memory usage this way?
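
   To make the trade-off concrete, here is a rough sketch of the two ideas 
using only the standard library. It is not DataFusion code: `MinPerGroup` and 
`AppendOnlyMins` are hypothetical names, and the single `String` buffer is 
only a stand-in for the arrow `Rows` container, under the assumption that 
`Rows` would be used append-only with a per-group index into it.

```rust
/// Approach 1: one owned String per group, overwritten in place so the
/// existing allocation is reused when a smaller value arrives.
pub struct MinPerGroup {
    mins: Vec<Option<String>>,
}

impl MinPerGroup {
    pub fn new() -> Self {
        Self { mins: Vec::new() }
    }

    pub fn update(&mut self, group: usize, value: &str) {
        // Grow the Vec when a new group index shows up.
        if group >= self.mins.len() {
            self.mins.resize(group + 1, None);
        }
        match self.mins[group].as_mut() {
            Some(cur) => {
                if value < cur.as_str() {
                    // Mutate the stored string instead of allocating a new one.
                    cur.clear();
                    cur.push_str(value);
                }
            }
            None => self.mins[group] = Some(value.to_owned()),
        }
    }

    pub fn get(&self, group: usize) -> Option<&str> {
        self.mins.get(group).and_then(|m| m.as_deref())
    }
}

/// Approach 2 (assumed shape of the append-only idea): every accepted
/// minimum is appended to one shared buffer, and each group only keeps
/// the byte range of its current minimum.
pub struct AppendOnlyMins {
    data: String,
    spans: Vec<Option<(usize, usize)>>,
}

impl AppendOnlyMins {
    pub fn new() -> Self {
        Self { data: String::new(), spans: Vec::new() }
    }

    pub fn update(&mut self, group: usize, value: &str) {
        if group >= self.spans.len() {
            self.spans.resize(group + 1, None);
        }
        let is_new_min = match self.spans[group] {
            Some((start, end)) => value < &self.data[start..end],
            None => true,
        };
        if is_new_min {
            // The old bytes stay in `data`; nothing is ever reclaimed.
            let start = self.data.len();
            self.data.push_str(value);
            self.spans[group] = Some((start, self.data.len()));
        }
    }

    pub fn get(&self, group: usize) -> Option<&str> {
        self.spans
            .get(group)
            .copied()
            .flatten()
            .map(|(start, end)| &self.data[start..end])
    }

    /// Total bytes held, including superseded minimums.
    pub fn bytes_stored(&self) -> usize {
        self.data.len()
    }
}
```

   In this sketch the second variant keeps every value that was ever a 
minimum, so for a group whose inputs arrive in descending order it retains 
all of them. Bounding memory would presumably need some form of periodic 
compaction, which is the open question above.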


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
