jacksonrnewhouse opened a new issue, #9254: URL: https://github.com/apache/arrow-datafusion/issues/9254
### Describe the bug

If you attempt to join two tables on a struct field, the query plans successfully, albeit with the struct equality in the `filter` rather than in the `on` vector. However, execution fails with "Invalid comparison operation". In particular, it triggers this error from arrow-rs: https://github.com/apache/arrow-rs/blob/db811083669df66992008c9409b743a2e365adb0/arrow-ord/src/cmp.rs#L202.

### To Reproduce

I wrote a failing test that does a self join at https://github.com/apache/arrow-datafusion/compare/35.0.0...ArroyoSystems:arrow-datafusion:bug_report/struct_join_fails_at_execution. The failure message is

```
thread 'user_defined::user_defined_aggregates::test_struct_join' panicked at datafusion/core/tests/user_defined/user_defined_aggregates.rs:172:60:
called `Result::unwrap()` on an `Err` value: Execution("Fail to build join indices in NestedLoopJoinExec, error:Arrow error: Invalid argument error: Invalid comparison operation: Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) == Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])")
```

### Expected behavior

Either the join should fail at planning with a clear error that joins on structs are not supported or, preferably, DataFusion should support joins on two structs of the same type.

### Additional context

This comes up with Arroyo, where we want to join on time windows, e.g. sliding and tumbling windows.
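
To make the preferred behavior concrete, here is a minimal, std-only sketch of what equality on two structs of the same type could mean inside a nested loop join. It is not DataFusion or arrow-rs code. The `Window` type, `struct_eq`, and `nested_loop_join` are hypothetical names introduced for illustration, and the null handling (a null field makes the comparison unknown unless another field already differs, so the row does not match) is an assumption modeled on SQL three-valued logic, not a statement of what either library does.

```rust
// Hypothetical row type mirroring the struct in the error message:
// { value: Float64 (nullable), time: Timestamp(Nanosecond) (nullable) }.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Window {
    value: Option<f64>,
    time: Option<i64>, // nanoseconds since the epoch
}

// Compare one nullable field; None means "unknown" because a side is null.
fn field_eq<T: PartialEq>(a: Option<T>, b: Option<T>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

// Assumed semantics: field-wise equality combined with a three-valued AND.
// Any definite mismatch gives Some(false); otherwise any null gives None.
fn struct_eq(a: &Window, b: &Window) -> Option<bool> {
    let value_eq = field_eq(a.value, b.value);
    let time_eq = field_eq(a.time, b.time);
    match (value_eq, time_eq) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

// Naive nested loop join: emit (left index, right index) for every pair
// whose struct comparison is definitely true.
fn nested_loop_join(left: &[Window], right: &[Window]) -> Vec<(usize, usize)> {
    let mut matches = Vec::new();
    for (i, l) in left.iter().enumerate() {
        for (j, r) in right.iter().enumerate() {
            if struct_eq(l, r) == Some(true) {
                matches.push((i, j));
            }
        }
    }
    matches
}
```

Under these assumptions, a self join returns each fully non-null row paired with itself, and rows containing a null field drop out, which matches how an ordinary equality join treats null keys.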
