findepi commented on issue #16677: URL: https://github.com/apache/datafusion/issues/16677#issuecomment-3092338265
How we can solve this: ## Add two new traits ```rust trait UdfHash { fn udf_hash(&self) -> u64; } trait UdfEq { fn udf_eq(&self, other: &dyn Any) -> bool; } ``` with blanket implementations ```rust impl<T: Hash + Any> UdfHash for T { fn udf_hash(&self) -> u64 { let hasher = &mut DefaultHasher::new(); self.type_id().hash(hasher); self.hash(hasher); hasher.finish() } } impl<T: PartialEq + Any> UdfEq for T { fn udf_eq(&self, other: &dyn Any) -> bool { let Some(other) = other.downcast_ref::<Self>() else { return false; }; self == other } } ``` ## Make `ScalarUDFImpl` depend on these ```rust trait ScalarUDFImpl: Send + Sync + UdfHash + UdfEq + Any { ... } ``` ## Usage ```rust #[derive(Debug, Clone, PartialEq, Hash)] struct MyFunc { state: SomeEquitableStateType, } impl ScalarUDFImpl for MyFunc { // nothing eq/hash related needed here } ``` As shown, the end result is that `ScalarUDFImpl` can be hashed and compared for equality and all the implementor needs is `#[derive(PartialEq, Hash)]` to make it work. ## Notes ### `UdfEq` accepts any `Any` `UdfEq` accepts any `Any` so it's possible to call it with instances other than `&dyn ScalarUDFImpl`. I don't know how to avoid this. In particular, this is not allowed in Rust: ```rust // Error: cycle detected when computing the super predicates of `ScalarUDFImpl` [E0391] trait ScalarUDFImpl: PartialEq<dyn ScalarUDFImpl> { /* ... */ } ``` We can cover this with additional eq/hash methods directly in `ScalarUDFImpl` trait that will have default implementation and will not need to be reimplemented. ```rust trait ScalarUDFImpl: Send + Sync + UdfHash + UdfEq + Any { /* ... */ fn eq(&self, other: &dyn ScalarUDFImpl) -> bool { self.udf_eq(other) } fn hash(&self) -> u64 { self.udf_hash() } } ``` ### Compiler errors don't make it obvious As with any trait with blanket implementations (such as `Into`), when you don't implement it yet, the compiler won't tell you how you should implement. For example, if a function lacks `PartialEq` implementation, the compiler will complain about `UdfEq` instead. ```rust // Missing eq implementation #[derive(Debug, Clone, /*PartialEq,*/ Hash)] struct MyFunc { safe: bool, } // Error: the trait bound `MyFunc: UdfEq` is not satisfied impl ScalarUDFImpl for MyFunc { /* ... */ } ``` It can be mitigated by a good documentation for the `UdfEq`, `UdfHash` traits informing about blanket implementation and that they should not be implemented directly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org