adriangb commented on PR #18449: URL: https://github.com/apache/datafusion/pull/18449#issuecomment-3508566030
It looks like there are indeed some regressions. I propose we do two things:

1. Add a `create_hashes_unbuffered(…) -> &[u64]` that uses a thread local to re-use the buffer. I think this will be helpful in other contexts as well.
2. Create a `make_typed_comparator` that returns an enum that is typed for non-recursive types and delegates to a dynamically typed fallback variant for recursive types. I'll implement it here for now but make a note that it would be good to upstream into arrow. Once it is upstreamed into arrow, we can re-implement the current version in terms of their new version and deprecate the current function.

I think that will get us the broader type support and code re-use while avoiding any slowdown. Once we do the upstreaming into arrow it won't even be any more code than it is now (a bit more code in arrow, but not much). And we should be able to do it all in one PR here.
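
To make the thread-local buffer idea concrete, here is a minimal std-only sketch. It is not the proposed `create_hashes_unbuffered` itself: the names `HASH_BUFFER` and `with_hashes` are hypothetical, `DefaultHasher` stands in for whatever hashing DataFusion actually uses, and plain slices stand in for Arrow arrays. One assumption worth flagging: safe Rust only exposes a thread local inside the closure passed to `LocalKey::with`, so this sketch passes the `&[u64]` to a callback rather than returning it.

```rust
use std::cell::RefCell;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

thread_local! {
    // Hypothetical per-thread scratch buffer; its allocation is kept
    // between calls, so repeated hashing does not reallocate once it has grown.
    static HASH_BUFFER: RefCell<Vec<u64>> = RefCell::new(Vec::new());
}

/// Hypothetical helper: hashes every element of `values` into the
/// thread-local buffer and lends the resulting slice to `f`.
///
/// Not re-entrant: calling `with_hashes` from inside `f` on the same
/// thread would panic on the second `borrow_mut`.
fn with_hashes<T: Hash, R>(values: &[T], f: impl FnOnce(&[u64]) -> R) -> R {
    HASH_BUFFER.with(|cell| {
        let mut buf = cell.borrow_mut();
        // `clear` drops the old contents but keeps the capacity.
        buf.clear();
        buf.extend(values.iter().map(|v| {
            let mut hasher = DefaultHasher::new();
            v.hash(&mut hasher);
            hasher.finish()
        }));
        f(buf.as_slice())
    })
}
```

A caller would use it as `with_hashes(&keys, |hashes| probe(hashes))`, where `probe` is whatever consumes the hashes; the slice must not escape the closure.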

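The second proposal, an enum that is statically typed for flat types and falls back to a boxed closure for nested ones, could look roughly like the sketch below. Everything here is illustrative: `Column` is a toy stand-in for Arrow arrays, the variant set is an assumption, and returning `Option` instead of a `Result` is a simplification. Only the name `make_typed_comparator` comes from the comment above.

```rust
use std::cmp::Ordering;

/// Toy stand-in for a column of data (real Arrow arrays are not used here).
enum Column {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
    /// A nested type, used to exercise the dynamic fallback.
    List(Vec<Vec<i64>>),
}

type DynComparator<'a> = Box<dyn Fn(usize, usize) -> Ordering + 'a>;

/// Typed variants compare through a `match`, with no boxed-closure
/// call; nested types go through the boxed closure in `Dynamic`.
enum TypedComparator<'a> {
    Int64(&'a [i64], &'a [i64]),
    Utf8(&'a [String], &'a [String]),
    Dynamic(DynComparator<'a>),
}

impl<'a> TypedComparator<'a> {
    /// Compares row `i` of the left column with row `j` of the right one.
    #[inline]
    fn compare(&self, i: usize, j: usize) -> Ordering {
        match self {
            Self::Int64(l, r) => l[i].cmp(&r[j]),
            Self::Utf8(l, r) => l[i].cmp(&r[j]),
            Self::Dynamic(f) => f(i, j),
        }
    }
}

/// Hypothetical constructor: returns `None` when the column types differ.
fn make_typed_comparator<'a>(
    left: &'a Column,
    right: &'a Column,
) -> Option<TypedComparator<'a>> {
    match (left, right) {
        (Column::Int64(l), Column::Int64(r)) => Some(TypedComparator::Int64(l, r)),
        (Column::Utf8(l), Column::Utf8(r)) => Some(TypedComparator::Utf8(l, r)),
        (Column::List(l), Column::List(r)) => Some(TypedComparator::Dynamic(Box::new(
            // Lexicographic comparison of the nested vectors.
            move |i: usize, j: usize| l[i].cmp(&r[j]),
        ))),
        _ => None,
    }
}
```

The design point is that a hot loop over flat types pays for one `match` per comparison instead of an indirect call, while nested types keep working through the same `compare` entry point.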