Dandandan commented on PR #6800: URL: https://github.com/apache/arrow-datafusion/pull/6800#issuecomment-1622236683
> > MIN/MAX are going to be a bit trickier than the other ones as they also support non-primitive types (e.g. strings). I will first implement the primitive version of those if that sounds fair @alamb.
>
> Sounds great! Figuring out how to extend the model for strings will be a good exercise, I think.

The easiest approach would be to store the elements in a `Vec<String>` (or similar, as it may need to grow). We can mutate the existing strings in place (instead of creating new ones when a value is replaced) to keep the number of allocations lower.

An alternative approach would be to keep a number of buffers, each holding variable-sized data up to a certain maximum size (in buckets of, say, 10, 20, 30 bytes), plus a list of "free" slots whose items have been moved. I think this might be a fast approach for small strings, but it would also introduce some complexity.
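A minimal sketch of the first idea (per-group string state that is overwritten in place) might look like the following. The type `StringMinAccumulator` and its methods are hypothetical names used only for illustration, not part of DataFusion, and the state is assumed to be a `Vec<Option<String>>` so that groups with no value yet can be represented.

```rust
/// Hypothetical per-group MIN state for string values (illustration only).
/// `mins[g]` is the current minimum for group `g`, or `None` if the group
/// has not seen a value yet.
struct StringMinAccumulator {
    mins: Vec<Option<String>>,
}

impl StringMinAccumulator {
    fn new() -> Self {
        Self { mins: Vec::new() }
    }

    /// Offer `value` as a candidate minimum for group `group_idx`.
    fn update(&mut self, group_idx: usize, value: &str) {
        // Grow the state vector when a new group index shows up.
        if group_idx >= self.mins.len() {
            self.mins.resize(group_idx + 1, None);
        }
        match self.mins[group_idx].as_mut() {
            Some(current) => {
                if value < current.as_str() {
                    // Overwrite the existing String instead of allocating a
                    // new one; its buffer is reused when capacity allows.
                    current.clear();
                    current.push_str(value);
                }
            }
            // First value for this group: one allocation is unavoidable.
            None => self.mins[group_idx] = Some(value.to_owned()),
        }
    }

    /// Current minimum for `group_idx`, if any value has been seen.
    fn get(&self, group_idx: usize) -> Option<&str> {
        self.mins.get(group_idx).and_then(|m| m.as_deref())
    }
}
```

Calling `update(0, "pear")` followed by `update(0, "apple")` leaves `get(0)` at `Some("apple")`, with the second call writing into the string allocated by the first. A MAX variant would only flip the comparison.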
