alamb commented on code in PR #3334:
URL: https://github.com/apache/arrow-datafusion/pull/3334#discussion_r962139573


##########
datafusion/sql/src/planner.rs:
##########
@@ -2453,6 +2453,17 @@ fn remove_join_expressions(
                     (_, Some(rr)) => Ok(Some(rr)),
                     _ => Ok(None),
                 }
+            },
+            // Fix for issue#78 join predicates from inside of OR expr also 
pulled up properly.
+            Operator::Or => {

Review Comment:
   In general, I think this kind of rewrite should be done in an optimizer 
pass so that it applies to any query plan (e.g. one that came from the 
DataFrame API) rather than only to plans built from SQL
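
To make the idea concrete, here is a minimal sketch of the core step such a pass might perform, assuming the rewrite pulls up predicates that appear in every branch of an `OR`. It uses a toy expression type, not DataFusion's `Expr`; the `Expr` enum and the `conjuncts`, `disjuncts`, and `common_conjuncts` helpers are hypothetical names introduced only for illustration.

```rust
// Toy expression tree. This is NOT DataFusion's `Expr`; it is a hypothetical
// stand-in so the example stays self-contained.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // An opaque predicate, identified by its text (e.g. "p_partkey = l_partkey").
    Pred(String),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Flatten a chain of ANDs into its individual conjuncts.
fn conjuncts(e: &Expr, out: &mut Vec<Expr>) {
    match e {
        Expr::And(l, r) => {
            conjuncts(l, out);
            conjuncts(r, out);
        }
        other => out.push(other.clone()),
    }
}

// Flatten a chain of ORs into its individual branches.
fn disjuncts(e: &Expr, out: &mut Vec<Expr>) {
    match e {
        Expr::Or(l, r) => {
            disjuncts(l, out);
            disjuncts(r, out);
        }
        other => out.push(other.clone()),
    }
}

// Return the conjuncts that occur in every OR branch of `e`.
// These can be hoisted above the OR, because
// (a AND b) OR (a AND c) is equivalent to a AND (b OR c).
fn common_conjuncts(e: &Expr) -> Vec<Expr> {
    let mut branches = Vec::new();
    disjuncts(e, &mut branches);

    // `disjuncts` always pushes at least one branch, so `remove(0)` is safe.
    let mut per_branch: Vec<Vec<Expr>> = branches
        .iter()
        .map(|b| {
            let mut v = Vec::new();
            conjuncts(b, &mut v);
            v
        })
        .collect();
    let first = per_branch.remove(0);

    first
        .into_iter()
        .filter(|c| per_branch.iter().all(|b| b.contains(c)))
        .collect()
}
```

Because the sketch works on the expression tree alone, the same logic would apply no matter how the plan was built, which is the point of doing it in the optimizer rather than in the SQL planner.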



##########
datafusion/core/tests/sql/predicates.rs:
##########
@@ -427,11 +427,10 @@ async fn multiple_or_predicates() -> Result<()> {
     let expected =vec![
         "Explain [plan_type:Utf8, plan:Utf8]",
         "  Projection: #lineitem.l_partkey [l_partkey:Int64]",
-        "    Projection: #part.p_partkey = #lineitem.l_partkey AS 
BinaryExpr-=Column-lineitem.l_partkeyColumn-part.p_partkey, 
#lineitem.l_partkey, #lineitem.l_quantity, #part.p_brand, #part.p_size 
[BinaryExpr-=Column-lineitem.l_partkeyColumn-part.p_partkey:Boolean;N, 
l_partkey:Int64, l_quantity:Float64, p_brand:Utf8, p_size:Int32]",
-        "      Filter: #part.p_partkey = #lineitem.l_partkey AND #part.p_brand 
= Utf8(\"Brand#12\") AND #lineitem.l_quantity >= Int64(1) AND 
#lineitem.l_quantity <= Int64(11) AND #part.p_size BETWEEN Int64(1) AND 
Int64(5) OR #part.p_brand = Utf8(\"Brand#23\") AND #lineitem.l_quantity >= 
Int64(10) AND #lineitem.l_quantity <= Int64(20) AND #part.p_size BETWEEN 
Int64(1) AND Int64(10) OR #part.p_brand = Utf8(\"Brand#34\") AND 
#lineitem.l_quantity >= Int64(20) AND #lineitem.l_quantity <= Int64(30) AND 
#part.p_size BETWEEN Int64(1) AND Int64(15) [l_partkey:Int64, 
l_quantity:Float64, p_partkey:Int64, p_brand:Utf8, p_size:Int32]",
-        "        CrossJoin: [l_partkey:Int64, l_quantity:Float64, 
p_partkey:Int64, p_brand:Utf8, p_size:Int32]",
-        "          TableScan: lineitem projection=[l_partkey, l_quantity] 
[l_partkey:Int64, l_quantity:Float64]",
-        "          TableScan: part projection=[p_partkey, p_brand, p_size] 
[p_partkey:Int64, p_brand:Utf8, p_size:Int32]",
+        "    Filter: #part.p_brand = Utf8(\"Brand#12\") AND 
#lineitem.l_quantity >= Int64(1) AND #lineitem.l_quantity <= Int64(11) AND 
#part.p_size BETWEEN Int64(1) AND Int64(5) OR #part.p_brand = 
Utf8(\"Brand#23\") AND #lineitem.l_quantity >= Int64(10) AND 
#lineitem.l_quantity <= Int64(20) AND #part.p_size BETWEEN Int64(1) AND 
Int64(10) OR #part.p_brand = Utf8(\"Brand#34\") AND #lineitem.l_quantity >= 
Int64(20) AND #lineitem.l_quantity <= Int64(30) AND #part.p_size BETWEEN 
Int64(1) AND Int64(15) [l_partkey:Int64, l_quantity:Float64, p_partkey:Int64, 
p_brand:Utf8, p_size:Int32]",

Review Comment:
   This is definitely the better plan 👍 @DhamoPS 


