findepi commented on issue #16677:
URL: https://github.com/apache/datafusion/issues/16677#issuecomment-3092338265
How we can solve this:
## Add two new traits
```rust
trait UdfHash {
fn udf_hash(&self) -> u64;
}
trait UdfEq {
fn udf_eq(&self, other: &dyn Any) -> bool;
}
```
with blanket implementations
```rust
impl<T: Hash + Any> UdfHash for T {
fn udf_hash(&self) -> u64 {
let hasher = &mut DefaultHasher::new();
self.type_id().hash(hasher);
self.hash(hasher);
hasher.finish()
}
}
impl<T: PartialEq + Any> UdfEq for T {
fn udf_eq(&self, other: &dyn Any) -> bool {
let Some(other) = other.downcast_ref::<Self>() else {
return false;
};
self == other
}
}
```
## Make `ScalarUDFImpl` depend on these
```rust
trait ScalarUDFImpl: Send + Sync + UdfHash + UdfEq + Any {
...
}
```
## Usage
```rust
#[derive(Debug, Clone, PartialEq, Hash)]
struct MyFunc {
state: SomeEquitableStateType,
}
impl ScalarUDFImpl for MyFunc {
// nothing eq/hash related needed here
}
```
As shown, the end result is that `ScalarUDFImpl` can be hashed and compared
for equality and all the implementor needs is `#[derive(PartialEq, Hash)]` to
make it work.
## Notes
### `UdfEq` accepts any `Any`
`UdfEq` accepts any `Any` so it's possible to call it with instances other
than `&dyn ScalarUDFImpl`.
I don't know how to avoid this.
In particular, this is not allowed in Rust:
```rust
// Error: cycle detected when computing the super predicates of
`ScalarUDFImpl` [E0391]
trait ScalarUDFImpl: PartialEq<dyn ScalarUDFImpl> {
/* ... */
}
```
We can cover this with additional eq/hash methods directly in
`ScalarUDFImpl` trait that will have default implementation and will not need
to be reimplemented.
```rust
trait ScalarUDFImpl: Send + Sync + UdfHash + UdfEq + Any {
/* ... */
fn eq(&self, other: &dyn ScalarUDFImpl) -> bool {
self.udf_eq(other)
}
fn hash(&self) -> u64 {
self.udf_hash()
}
}
```
### Compiler errors don't make it obvious
As with any trait with blanket implementations (such as `Into`), when you
don't implement it yet, the compiler won't tell you how you should implement.
For example, if a function lacks `PartialEq` implementation, the compiler
will complain about `UdfEq` instead.
```rust
// Missing eq implementation
#[derive(Debug, Clone, /*PartialEq,*/ Hash)]
struct MyFunc {
safe: bool,
}
// Error: the trait bound `MyFunc: UdfEq` is not satisfied
impl ScalarUDFImpl for MyFunc {
/* ... */
}
```
It can be mitigated by a good documentation for the `UdfEq`, `UdfHash`
traits informing about blanket implementation and that they should not be
implemented directly.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]