ygf11 commented on code in PR #4826:
URL: https://github.com/apache/arrow-datafusion/pull/4826#discussion_r1068006850


##########
datafusion/optimizer/src/decorrelate_where_in.rs:
##########
@@ -554,14 +636,12 @@ mod tests {
             .project(vec![col("customer.c_custkey")])?
             .build()?;
 
-        // Query will fail, but we can still transform the plan
         let expected = "Projection: customer.c_custkey [c_custkey:Int64]\
-        \n  LeftSemi Join: customer.c_custkey = __correlated_sq_1.o_custkey 
[c_custkey:Int64, c_name:Utf8]\
+        \n  LeftSemi Join:  Filter: customer.c_custkey = 
__correlated_sq_1.o_custkey AND customer.c_custkey = customer.c_custkey 
[c_custkey:Int64, c_name:Utf8]\

Review Comment:
   No, I do not handle this special case, just treat `customer.c_custkey = 
customer.c_custkey` as join filter.
   
   I think it is better to add this to the filter of outer table, then the plan 
will be like:
   ```
   "Projection: customer.c_custkey [c_custkey:Int64]\
      LeftSemi Join: customer.c_custkey = __correlated_sq_1.o_custkey 
[c_custkey:Int64, c_name:Utf8]
        Filter: customer.c_custkey = customer.c_custkey [..]\
           TableScan: customer [c_custkey:Int64, c_name:Utf8]\
        SubqueryAlias: __correlated_sq_1 [o_custkey:Int64]\
           Projection: orders.o_custkey AS o_custkey [o_custkey:Int64]\
             TableScan: orders [o_orderkey:Int64, o_custkey:Int64, 
o_orderstatus:Utf8, o_totalprice:Float64;N]"
   ``` 
   
   Is this the right direction?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to