abellina commented on PR #36505: URL: https://github.com/apache/spark/pull/36505#issuecomment-1129117318
Update on the SPARK-32290: SingleColumn Null Aware Anti Join Optimize failure:

- The original test used `testData2` in the subquery, which has no nulls, so I added `testData2WithNulls`, which introduces nulls into the `a` column, the column we use in the `not in` predicate. This lets the first query in the test be planned as a null-aware anti join.
- All other queries in the test work, except the negative case for multi-column support. It is commented out in my last patch (obviously that's not the solution): https://github.com/apache/spark/pull/36505/commits/baac1e4119f755b3a906f27c4f7324022fc27e85#diff-4279d0074cd860be3eab329b3716ac502a14ae4512c3e371a5f0e68636cec07dR1163-R1166. I believe this also has to do with nullability, in this case of the second column `b`. I am looking into it, but I could use some help.
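
For anyone following the nullability point, here is a minimal sketch in plain Python (not Spark code, and not taken from the PR) of how SQL's three-valued `NOT IN` behaves once the subquery column contains a null. The helper names `not_in` and `anti_join_filter` and the sample values are hypothetical, chosen only for illustration; `None` stands in for SQL NULL/UNKNOWN.

```python
from typing import List, Optional, Sequence


def not_in(value: Optional[int], subquery: Sequence[Optional[int]]) -> Optional[bool]:
    """Evaluate `value NOT IN (subquery)` under SQL three-valued logic.

    Returns True, False, or None (UNKNOWN). Hypothetical helper for illustration.
    """
    if not subquery:
        return True   # NOT IN over an empty set is TRUE, even for a NULL value
    if value is None:
        return None   # NULL compared with anything is UNKNOWN
    if any(v is not None and v == value for v in subquery):
        return False  # a definite match makes NOT IN FALSE
    if any(v is None for v in subquery):
        return None   # no match, but a NULL in the set makes the result UNKNOWN
    return True


def anti_join_filter(rows: Sequence[Optional[int]],
                     subquery: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Keep only rows whose NOT IN predicate is TRUE, as a WHERE clause would."""
    return [r for r in rows if not_in(r, subquery) is True]


left = [1, 2, 3, None]
without_nulls = [1, 2]       # subquery column with no nulls
with_nulls = [1, 2, None]    # same column with a null added

print(anti_join_filter(left, without_nulls))  # [3]
print(anti_join_filter(left, with_nulls))     # []
```

The single null in the subquery column changes the result for every non-matching row, which is the behaviour a null-aware anti join has to preserve. The sketch covers only the single-column case; the multi-column negative case discussed above is not modelled here.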
