Dandandan commented on code in PR #3578:
URL: https://github.com/apache/arrow-datafusion/pull/3578#discussion_r996309108


##########
benchmarks/expected-plans/q19.txt:
##########
@@ -3,7 +3,7 @@ Projection: SUM(lineitem.l_extendedprice * Int64(1) - 
lineitem.l_discount) AS re
     Projection: lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AS 
lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")Utf8("DELIVER IN 
PERSON")lineitem.l_shipinstruct, lineitem.l_shipmode IN ([Utf8("AIR"), 
Utf8("AIR REG")]) AS lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR 
REG")])Utf8("AIR REG")Utf8("AIR")lineitem.l_shipmode, part.p_size >= Int32(1) 
AS part.p_size >= Int32(1)Int32(1)part.p_size, lineitem.l_quantity, 
lineitem.l_extendedprice, lineitem.l_discount, part.p_brand, part.p_size, 
part.p_container
       Filter: part.p_brand = Utf8("Brand#12") AND part.p_container IN 
([Utf8("SM CASE"), Utf8("SM BOX"), Utf8("SM PACK"), Utf8("SM PKG")]) AND 
lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(1100),15,2) AND part.p_size <= Int32(5) OR part.p_brand = 
Utf8("Brand#23") AND part.p_container IN ([Utf8("MED BAG"), Utf8("MED BOX"), 
Utf8("MED PKG"), Utf8("MED PACK")]) AND lineitem.l_quantity >= 
Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(2000),15,2) AND part.p_size <= Int32(10) OR part.p_brand = 
Utf8("Brand#34") AND part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), 
Utf8("LG PACK"), Utf8("LG PKG")]) AND lineitem.l_quantity >= 
Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(3000),15,2) AND part.p_size <= Int32(15)
         Inner Join: lineitem.l_partkey = part.p_partkey
-          Filter: lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND 
lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
+          Filter: lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND 
lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AND lineitem.l_quantity >= 
Decimal128(Some(100),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(1100),15,2) OR lineitem.l_quantity >= 
Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(2000),15,2) OR lineitem.l_quantity >= 
Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= 
Decimal128(Some(3000),15,2)

Review Comment:
   I am sometimes surprised when profiling how little CPU the filters consume - 
it is **really** cheap compared to parquet reads / joins / etc. so almost any 
filtering we can do probably helps as long as it filters out _some_ data and a 
subsequent operators can benefit from it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to