tobixdev opened a new issue, #15891: URL: https://github.com/apache/datafusion/issues/15891
### Is your feature request related to a problem or challenge?

Anyone building a system that works with graph-like data on DataFusion will run into the need to join the intermediate results of graph patterns. However, null handling in these systems differs from SQL. Usually, two intermediate results are combined based on a notion of compatibility instead of strict equality. In these semantics, `NULL` is compatible with everything. The following table demonstrates this behavior on a single value:

| Lhs | Rhs | Matches? |
|--------|--------|--------|
| `NULL` | `NULL` | Yes |
| `"A"` | `NULL` | Yes |
| `NULL` | `"A"` | Yes |
| `"A"` | `"A"` | Yes |
| `"A"` | `"B"` | No |

Is this something that you'd be interested in having in DF?

### Describe the solution you'd like

I propose addressing this problem in three steps:

1. Replace `Join::null_equals_null` with an enum `JoinNullBehavior` (or similar).
2. Add an additional variant `JoinNullBehavior::NullMatchesEverything` and implement it in the respective join implementations.
3. Extend the join implementations one by one, with the planner checking whether a join implementation is available for the given `JoinNullBehavior`.

### Describe alternatives you've considered

Currently, we use UDFs to check for compatibility. Such a join can be executed by a `NestedLoopJoinExec`, since we do not have a "native" equi-join condition. Having access to DataFusion's HashJoin (and other) implementations would be great, as we would not have to reinvent the join infrastructure.

### Additional context

Definition of Solution Compatibility in SPARQL 1.1:
- https://www.w3.org/TR/sparql11-query/#defn_algCompatibleMapping

This could also be helpful for SQL/PGQ or GQL implementations based on DF.

Related Issues:
- https://github.com/apache/datafusion/issues/13545
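To make the proposal more concrete, here is a minimal sketch of how the enum and the per-value matching rule from the table could look. It is an illustration only, not DataFusion code: the variant names `NullEqualsNothing` and `NullEqualsNull`, the helper `keys_match`, and the use of `Option<&str>` as a stand-in for a nullable column value are all assumptions made for this example. Only `JoinNullBehavior` and `NullMatchesEverything` come from the proposal above.

```rust
/// Hypothetical sketch of the enum proposed in step 1.
/// Only `NullMatchesEverything` is named in the proposal; the other
/// variant names are assumptions made for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinNullBehavior {
    /// Plain SQL equality: a `NULL` key never matches anything.
    NullEqualsNothing,
    /// Assumed to correspond to `null_equals_null = true`:
    /// `NULL` matches `NULL`, but not a concrete value.
    NullEqualsNull,
    /// Compatibility semantics: `NULL` matches any value, including `NULL`.
    NullMatchesEverything,
}

/// Decide whether a single pair of join-key values matches.
/// `None` stands in for `NULL` (hypothetical helper, not a DataFusion API).
pub fn keys_match(
    behavior: JoinNullBehavior,
    lhs: Option<&str>,
    rhs: Option<&str>,
) -> bool {
    match (lhs, rhs) {
        // Two concrete values always compare by equality.
        (Some(l), Some(r)) => l == r,
        // Both sides NULL: a match unless strict SQL equality is requested.
        (None, None) => behavior != JoinNullBehavior::NullEqualsNothing,
        // Exactly one side NULL: only compatibility semantics match.
        _ => behavior == JoinNullBehavior::NullMatchesEverything,
    }
}
```

With `JoinNullBehavior::NullMatchesEverything`, `keys_match` reproduces the five rows of the table above. A real implementation would operate on Arrow arrays inside each join operator rather than on individual `Option` values.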