Asif created SPARK-53264:
----------------------------

             Summary: Conversion of correlated scalar subquery to Left Outer Join results in nullability as false for the right-side attribute
                 Key: SPARK-53264
                 URL: https://issues.apache.org/jira/browse/SPARK-53264
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0, 4.1.0
            Reporter: Asif

In RewriteCorrelatedScalarSubquery, when a correlated scalar subquery is converted into a Left Outer Join, the Project node just above the Left Outer Join node marks the attribute coming from the right side of the join as non-nullable (nullability false). Any attribute coming from the right side of a Left Outer Join should have nullability as true, because unmatched left rows produce nulls on that side.

As a result, the following query (from SQLQueryTestSuite):

{quote}select *
from range(1, 3) t1
where (select t2.id c
from range (1, 2) t2 where t1.id = t2.id
) is not null{quote}

is eventually wrongly optimized into an Inner Join.

The bug remains hidden in the current code base because of an inefficiency in the PushDownPredicates rule, which indirectly hides the issue. If PushDownPredicates worked efficiently (i.e., combined and pushed predicates in a single pass), the bug would be exposed. The inefficiency in the PushDownPredicates rule is itself described in [SPARK-53263|https://issues.apache.org/jira/projects/SPARK/issues/SPARK-53263].

I will submit a PR with a bug test in some time.
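To make the nullability rule concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: `Attribute`, `left_outer_join_output`, and `left_outer_join` are hypothetical names introduced only for illustration. It assumes the usual left outer join semantics, under which every attribute from the right side becomes nullable, and evaluates the query from the report by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    # Hypothetical stand-in for an optimizer attribute: a name plus a nullability flag.
    name: str
    nullable: bool


def left_outer_join_output(left, right):
    """Output schema of a left outer join in this toy model.

    Left attributes keep their nullability; every right attribute is
    forced to nullable, because unmatched left rows are padded with nulls.
    """
    return list(left) + [Attribute(a.name, True) for a in right]


def left_outer_join(left_rows, right_rows, matches):
    """Toy nested-loop left outer join; unmatched left rows pair with None."""
    out = []
    for l in left_rows:
        hits = [r for r in right_rows if matches(l, r)]
        if hits:
            out.extend((l, r) for r in hits)
        else:
            out.append((l, None))
    return out


t1 = list(range(1, 3))  # models range(1, 3): ids 1 and 2
t2 = list(range(1, 2))  # models range(1, 2): id 1 only

# Both inputs are non-nullable on their own; the join output must still
# report the right-side attribute `c` as nullable.
schema = left_outer_join_output(
    [Attribute("t1.id", False)], [Attribute("c", False)]
)

joined = left_outer_join(t1, t2, lambda l, r: l == r)

# The outer "is not null" filter from the query in the report.
result = [l for (l, c) in joined if c is not None]
```

In this model the row for `t1.id = 2` carries `None` in column `c`, so a plan that reports `c` as non-nullable misdescribes the data, and any rewrite that relies on that flag can go wrong.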