askalt commented on code in PR #20003:
URL: https://github.com/apache/datafusion/pull/20003#discussion_r2735141165


##########
datafusion/sqllogictest/test_files/tpch/plans/q22.slt.part:
##########
@@ -64,7 +64,7 @@ logical_plan
 06)----------Inner Join:  Filter: CAST(customer.c_acctbal AS Decimal128(19, 6)) > __scalar_sq_2.avg(customer.c_acctbal)
 07)------------Projection: customer.c_phone, customer.c_acctbal
 08)--------------LeftAnti Join: customer.c_custkey = __correlated_sq_1.o_custkey
-09)----------------Filter: substr(customer.c_phone, Int64(1), Int64(2)) IN ([Utf8View("13"), Utf8View("31"), Utf8View("23"), Utf8View("29"), Utf8View("30"), Utf8View("18"), Utf8View("17")])
+09)----------------Filter: substr(customer.c_phone, Int64(1), Int64(2)) IN ([Utf8View("13"), Utf8View("31"), Utf8View("23"), Utf8View("29"), Utf8View("30"), Utf8View("18"), Utf8View("17")]) AND Boolean(true)

Review Comment:
   It is incorrect behavior for filter push-down to omit this `true` from 
FilterExec (it is only a coincidence that the filter is trivial here).
   
   Currently, on the `main` branch, we have the following:
   
   
https://github.com/apache/datafusion/blob/16368983bdefca40ff0f7fd968ed2a0c6aa21452/datafusion/sqllogictest/test_files/tpch/plans/q22.slt.part#L67-L68
   
   Notice that `, Boolean(true)` appears there in the `partial_filters` section. 
Because this filter is not fully supported, it must be re-checked in the filter 
node. If it were a non-trivial filter (push-down cannot know whether it is), 
omitting the re-check would lead to an incorrect selection result.
   


