abellina commented on PR #36505: URL: https://github.com/apache/spark/pull/36505#issuecomment-1129153161
> All other queries in the test are passing, except for the negative case for the multi-column support. It is commented out in my last patch (obviously that's not the solution): https://github.com/apache/spark/commit/baac1e4119f755b3a906f27c4f7324022fc27e85#diff-4279d0074cd860be3eab329b3716ac502a14ae4512c3e371a5f0e68636cec07dR1163-R1166. I believe this also has to do with nullability, in this case of the second column `b`. I am looking into it, but I could use some help.

OK, the reason for this is that the condition:

```
(((key = a) OR isnull((key = a))) AND ((key + 1) = b))
```

is split by `ExtractEquiJoinKeys` into the equi-join part (`(key + 1) = b`), while the rest is turned into a conditional expression (`(key = a) OR isnull((key = a))`). If `b` were nullable, we would wind up with a different condition:

```
(((key = a) OR isnull((key = a))) AND (((key + 1) = b) OR isnull(((key + 1) = b))))
```

This can't be split by `ExtractEquiJoinKeys`, so at that point the join isn't an equi join. It then goes on to `ExtractSingleColumnNullAwareAntiJoin` and passes through, since it can't be matched with the single-column expression that rule expects => this is the negative case for the test. I'll change the table used in the test to have nullable `key`, `value` columns.
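To make the splitting behavior concrete, here is a minimal toy model (in Python, not Spark's actual Scala internals) of what an `ExtractEquiJoinKeys`-style pattern does: walk the top-level AND-ed conjuncts, keep only plain equalities as join keys, and push everything else into a residual condition. All names here (`Expr`, `Eq`, `Or`, `split_equi_join`) are illustrative inventions, not Spark's API.

```python
# Toy sketch: partition a join condition's top-level conjuncts into
# equi-join keys vs. a residual (non-equi) condition, mirroring the
# idea behind ExtractEquiJoinKeys. Names are hypothetical.

from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Opaque leaf expression, e.g. a column or arithmetic term."""
    text: str
    def __str__(self):
        return self.text


@dataclass(frozen=True)
class Eq:
    """A plain equality predicate: left = right."""
    left: Expr
    right: Expr
    def __str__(self):
        return f"({self.left} = {self.right})"


@dataclass(frozen=True)
class Or:
    """Disjunction; cannot serve as an equi-join key."""
    left: object
    right: object
    def __str__(self):
        return f"({self.left} OR {self.right})"


def split_equi_join(conjuncts):
    """Split top-level conjuncts into (equi-join keys, residual predicates)."""
    keys, residual = [], []
    for c in conjuncts:
        (keys if isinstance(c, Eq) else residual).append(c)
    return keys, residual


# Non-nullable b: conjuncts are
#   [(key = a) OR isnull((key = a)),  (key + 1) = b]
# so one equality survives and the plan can stay an equi join.
cond_not_nullable = [
    Or(Eq(Expr("key"), Expr("a")), Expr("isnull((key = a))")),
    Eq(Expr("(key + 1)"), Expr("b")),
]
keys, residual = split_equi_join(cond_not_nullable)
print([str(k) for k in keys])  # one extracted key: ((key + 1) = b)

# Nullable b: the second conjunct is also wrapped in OR isnull(...),
# so no bare equality remains and nothing is extracted as a join key.
cond_nullable = [
    Or(Eq(Expr("key"), Expr("a")), Expr("isnull((key = a))")),
    Or(Eq(Expr("(key + 1)"), Expr("b")), Expr("isnull(((key + 1) = b))")),
]
keys2, residual2 = split_equi_join(cond_nullable)
print(keys2)  # [] -> no equi-join keys, not an equi join
```

Under this toy model, the nullable-`b` variant yields an empty key list, which is exactly the situation where the real planner falls through to rules like `ExtractSingleColumnNullAwareAntiJoin`.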
