jon-chuang commented on code in PR #2451:
URL: https://github.com/apache/arrow-datafusion/pull/2451#discussion_r865696413


##########
datafusion/core/src/optimizer/subquery_filter_to_join.rs:
##########
@@ -348,11 +556,36 @@ mod tests {
         let expected = "Projection: #test.b [b:UInt32]\
         \n  Semi Join: #test.b = #sq.a [a:UInt32, b:UInt32, c:UInt32]\
         \n    TableScan: test projection=None [a:UInt32, b:UInt32, c:UInt32]\
-        \n    Projection: #sq.a [a:UInt32]\
-        \n      Semi Join: #sq.a = #sq_nested.c [a:UInt32, b:UInt32, c:UInt32]\
-        \n        TableScan: sq projection=None [a:UInt32, b:UInt32, c:UInt32]\
-        \n        Projection: #sq_nested.c [c:UInt32]\
-        \n          TableScan: sq_nested projection=None [a:UInt32, b:UInt32, c:UInt32]";
+        \n    Semi Join: #sq.a = #sq_nested.c [a:UInt32, b:UInt32, c:UInt32]\

Review Comment:
   One way to remove the quirk is to allow for multi-column `InSubquery`s.
   
   The predicate pullup rule (next PR) will be able to "commute" a projection and a filter as follows:
   ```
   Project([col(a)])
     Filter(col(b) = col(t.b))
   =>
   Filter(col(b) = col(t.b))
     Project([col(a), col(b)])
   ```
   
   This way, one can move the `Filter` to the top of the subquery tree, just like with `Exists`.
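
A minimal sketch of that commute step, to make the rewrite concrete. It uses a toy plan type, not DataFusion's `LogicalPlan`; the names `Plan` and `pull_filter_above_project` are hypothetical, and the filter is assumed to be a single equality between a subquery column and an outer column.

```rust
// Toy logical plan for illustration only; this is NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    // Keeps rows where the subquery column `left` equals the outer column `outer`.
    Filter { left: String, outer: String, input: Box<Plan> },
    Project { columns: Vec<String>, input: Box<Plan> },
}

/// Rewrites `Project(Filter(x))` into `Filter(Project'(x))`, where `Project'`
/// additionally keeps the column the filter reads. Any other plan shape is
/// returned unchanged.
fn pull_filter_above_project(plan: Plan) -> Plan {
    match plan {
        Plan::Project { mut columns, input } => match *input {
            Plan::Filter { left, outer, input: inner } => {
                // Widen the projection so the filter can still see its column.
                if !columns.contains(&left) {
                    columns.push(left.clone());
                }
                Plan::Filter {
                    left,
                    outer,
                    input: Box::new(Plan::Project { columns, input: inner }),
                }
            }
            // Not a filter directly under the projection: rebuild as it was.
            other => Plan::Project { columns, input: Box::new(other) },
        },
        other => other,
    }
}
```

Note that the widened projection now outputs both `a` and `b`, which is where a multi-column `InSubquery` would come in.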



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
