Omega359 commented on PR #12027: URL: https://github.com/apache/datafusion/pull/12027#issuecomment-2308528161
> @Omega359 and @XiangpengHao -- what do you think we should do with the conversation above? [#12027 (comment)](https://github.com/apache/datafusion/pull/12027#issuecomment-2295332991) I can't tell if we are suggesting we go with the `StringArrays` approach or if the `StringArrayType` trait is ok?
>
> My "gut" feel is the same as @XiangpengHao that implementing using generics (not `dyn StringArrayType` but a function that is generic over `StringArrayType`) should be at least as fast as the "dynamic dispatch" mechanism of `StringArrays` (because the compiler gets a chance to build special code for it)
>
> The downside of the generics approach is that now we'll end up with 3 copies of most functions and the extra performance, if any, may not justify the binary overhead 🤔

Sorry, I missed the notification for this comment until now.

`StringArrays` may be the nicest approach API-wise, but it incurs unavoidable overhead for anything but `.iter()` (or at least I couldn't find a way to make that approach faster). I would say `StringArrayType` should be used in DataFusion for most use cases, except where one wants specialized handling or needs object safety (which that trait can't provide).

I'll have another go at the `to_timestamp` UDF using the `StringArrayType` approach, but I suspect I may need to refactor it quite a bit to make that work. Hopefully I'll have a pull request for that next week.
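To make the trade-off discussed above concrete, here is a minimal sketch using only the standard library. Everything in it is a hypothetical stand-in: `StringLike`, `OffsetStrings`, `OwnedStrings`, `AnyStrings`, and both `total_bytes` functions are illustrative names, not the actual `StringArrayType` trait or `StringArrays` type from this PR. It only contrasts a function that is generic over a trait (compiled once per concrete type) with an enum wrapper that picks the variant on every element access.

```rust
/// Hypothetical stand-in for a trait in the spirit of `StringArrayType`.
trait StringLike {
    fn len(&self) -> usize;
    fn value(&self, i: usize) -> &str;
}

/// Offset-based layout: one buffer plus `n + 1` offsets for `n` values.
struct OffsetStrings {
    data: String,
    offsets: Vec<usize>,
}

/// One allocation per value; stands in for a second physical layout.
struct OwnedStrings {
    values: Vec<String>,
}

impl StringLike for OffsetStrings {
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }
    fn value(&self, i: usize) -> &str {
        &self.data[self.offsets[i]..self.offsets[i + 1]]
    }
}

impl StringLike for OwnedStrings {
    fn len(&self) -> usize {
        self.values.len()
    }
    fn value(&self, i: usize) -> &str {
        &self.values[i]
    }
}

/// Generics approach: the compiler emits one specialized copy of this
/// function for each concrete `A` it is called with.
fn total_bytes<A: StringLike>(array: &A) -> usize {
    (0..array.len()).map(|i| array.value(i).len()).sum()
}

/// Enum-wrapper approach: a single type covering every layout.
enum AnyStrings {
    Offset(OffsetStrings),
    Owned(OwnedStrings),
}

impl AnyStrings {
    fn len(&self) -> usize {
        match self {
            AnyStrings::Offset(a) => a.len(),
            AnyStrings::Owned(a) => a.len(),
        }
    }
    /// Every call matches on the variant before reading the value.
    fn value(&self, i: usize) -> &str {
        match self {
            AnyStrings::Offset(a) => a.value(i),
            AnyStrings::Owned(a) => a.value(i),
        }
    }
}

/// Only one copy of this function exists, at the cost of a match per access.
fn total_bytes_any(array: &AnyStrings) -> usize {
    (0..array.len()).map(|i| array.value(i).len()).sum()
}

fn main() {
    let offset = OffsetStrings { data: "abcde".to_string(), offsets: vec![0, 2, 5] };
    let owned = OwnedStrings { values: vec!["ab".to_string(), "cde".to_string()] };
    println!("generic: {} {}", total_bytes(&offset), total_bytes(&owned));
    println!("enum:    {}", total_bytes_any(&AnyStrings::Offset(offset)));
}
```

Whether the per-access match is measurable in practice depends on the function body and on what the optimizer manages to hoist, so this should be read as an illustration of the shape of the two designs rather than as a benchmark.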