Rachelint commented on issue #6906:
URL: https://github.com/apache/datafusion/issues/6906#issuecomment-2356428850

   I think 
   https://github.com/apache/datafusion/pull/6800#issuecomment-1622236683
   
   > > I am not familiar enough with StringViewArray, is it ok to do that? And will it lead to an extremely bad performance?
   > 
   > I think using a single `Buffer` for each string will be bad for performance (likely worse than storing as `String` and copying them at the end). `StringViewArray` is really optimized for a small number of buffers (even though in theory it could have 2B of them as it is indexed on `i32`)
   
   OK, for `StringView min/max`, it seems we can just start by using `Vec<u128>` (the views) to store the inlined state (<= 12 bytes) and `Vec<String>` to store the non-inlined state.
   
   And when converting it to `StringViewArray`, we just copy the `Vec<String>` to create the buffer (`GroupsAccumulatorAdapter` copies the states too).
   
   For the short strings (<= 12 bytes), it can avoid allocating a `String`, and for the long ones, it just does the same thing as `GroupsAccumulatorAdapter`. It seems it could have better performance (due to the optimization for short strings)?
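
   A rough, std-only sketch of the idea above, not DataFusion or arrow-rs code. Everything here is an assumption made for illustration: the names `MinState`, `inline_view`, `update` and `get`, the per-group `seen` flag, and the simplified view layout (length in bytes 0..4, string bytes in bytes 4..16 when the length is <= 12). Long strings get one `String` slot per group, which is only one possible way to pair the two vectors.

```rust
// Hypothetical sketch: per-group `min` state for strings, keeping short
// values inline in a u128 and long values in a side Vec<String>.
const MAX_INLINE: usize = 12;

/// Pack a short string into a u128 (little-endian bytes):
/// bytes 0..4 hold the length, bytes 4..16 hold the data.
fn inline_view(s: &str) -> u128 {
    debug_assert!(s.len() <= MAX_INLINE);
    let mut bytes = [0u8; 16];
    bytes[..4].copy_from_slice(&(s.len() as u32).to_le_bytes());
    bytes[4..4 + s.len()].copy_from_slice(s.as_bytes());
    u128::from_le_bytes(bytes)
}

struct MinState {
    views: Vec<u128>,  // one view per group; inline data if len <= 12
    long: Vec<String>, // one slot per group; used only if len > 12
    seen: Vec<bool>,   // whether the group has received any value yet
}

impl MinState {
    fn new(num_groups: usize) -> Self {
        Self {
            views: vec![0; num_groups],
            // An empty String does not allocate.
            long: vec![String::new(); num_groups],
            seen: vec![false; num_groups],
        }
    }

    /// Keep `value` for `group` if it is smaller than the current minimum.
    fn update(&mut self, group: usize, value: &str) {
        if self.seen[group] {
            let bytes = self.views[group].to_le_bytes();
            let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
            let current: &[u8] = if len <= MAX_INLINE {
                &bytes[4..4 + len]
            } else {
                self.long[group].as_bytes()
            };
            // Byte-wise comparison matches `str` ordering for valid UTF-8.
            if value.as_bytes() >= current {
                return;
            }
        }
        self.seen[group] = true;
        self.long[group].clear();
        if value.len() <= MAX_INLINE {
            // Short value: no String allocation needed.
            self.views[group] = inline_view(value);
        } else {
            // Long value: the view records only the length; data goes to `long`.
            self.views[group] = value.len() as u128;
            self.long[group].push_str(value);
        }
    }

    /// Read back the current minimum for `group`, if any.
    fn get(&self, group: usize) -> Option<String> {
        if !self.seen[group] {
            return None;
        }
        let bytes = self.views[group].to_le_bytes();
        let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
        if len <= MAX_INLINE {
            Some(String::from_utf8_lossy(&bytes[4..4 + len]).into_owned())
        } else {
            Some(self.long[group].clone())
        }
    }
}
```

   When emitting the final array, the long values would be copied from `long` into a shared buffer, as described above; that step is left out of the sketch.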
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

