andygrove opened a new issue, #4430:
URL: https://github.com/apache/datafusion-comet/issues/4430
## Describe the bug
In `native/spark-expr/src/json_funcs/to_json.rs`, `ToJson` has two equality
impls that compare different sets of fields:
```rust
impl PartialEq for ToJson {
fn eq(&self, other: &Self) -> bool {
self.expr.eq(&other.expr)
&& self.timezone.eq(&other.timezone)
&& self.ignore_null_fields.eq(&other.ignore_null_fields)
}
}
impl PartialEq<dyn Any> for ToJson {
fn eq(&self, other: &dyn Any) -> bool {
if let Some(other) = other.downcast_ref::<ToJson>() {
self.expr.eq(&other.expr) && self.timezone.eq(&other.timezone)
} else {
false
}
}
}
```
The `PartialEq<dyn Any>` impl omits `timezone` historically, and after #3875
also omits `ignore_null_fields`. (Originally only `timezone` was the
discrepancy; the gap widened when `ignore_null_fields` was added.)
This matters because `PartialEq<dyn Any>` is what DataFusion uses to compare
`Arc<dyn PhysicalExpr>` values, so two `ToJson` exprs that differ only in
`timezone` or `ignore_null_fields` will compare equal through the trait-object
path. That can cause incorrect deduplication or caching in the physical planner.
## Steps to reproduce
Construct two `ToJson` exprs with the same child but different `timezone` or
`ignore_null_fields`, compare them via `Arc<dyn PhysicalExpr>` equality (e.g.
through `PhysicalExpr::eq` on the `Any` path), and observe that they compare
equal.
## Expected behavior
Both equality paths should compare the same fields. The `PartialEq<dyn Any>`
impl should include `timezone` and `ignore_null_fields`, matching the regular
`PartialEq` impl. Equivalent fix: derive `PartialEq<dyn Any>` through the
regular `PartialEq` by downcasting and delegating to `self == other`.
## Additional context
Spotted while reviewing #3875. The PR added `ignore_null_fields` to the
regular `PartialEq` but did not touch the `dyn Any` impl, which made an
existing inconsistency slightly worse.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]