SaurabhChawla100 commented on a change in pull request #33232:
URL: https://github.com/apache/spark/pull/33232#discussion_r665152620
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -1442,6 +1442,12 @@ object PushPredicateThroughNonJoin extends
Rule[LogicalPlan] with PredicateHelpe
pushDownPredicate(filter, u.child) { predicate =>
u.withNewChildren(Seq(Filter(predicate, u.child)))
}
+
+ // Push down filter predicates when the child of a Filter is a TypedFilter.
+ // To push the predicates further down, the Filter must be moved beneath
+ // the TypedFilter.
+ case Filter(condition, typeFilter @ TypedFilter(_, _, _, _, _)) =>
+ typeFilter.copy(child = Filter(condition, typeFilter.child))
Review comment:
**BTW, looks like typed filter is separated from normal filter. So it
should be easier to adjust filter, typed filter operator in queries to make
filter pushdown-able even it is not optimized?**
Is this what you mean: writing the query itself as
df.filter("id=1").filter(SomeTypeFilter) instead of
df.filter(SomeTypeFilter).filter("id=1")?
For this we would need to tell users to apply the typed filter after the
normal filter, since the normal filter might be a partition filter and improve
the performance of the query. I thought we could do something in an optimizer
rule to handle such scenarios instead.
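
To make the two orderings concrete, here is a toy sketch of the rewrite. It does not use Spark: `Plan`, `Relation`, `Filter`, `TypedFilter` and `PushFilterBelowTypedFilter` are simplified, hypothetical stand-ins for the Catalyst classes, and conditions are plain strings. It also assumes the swap is always safe, which a real rule would have to check (for example, that the condition is deterministic).

```scala
// Toy stand-ins for Catalyst plan nodes. These are NOT Spark's real classes;
// they only model the shape of the plan discussed above.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class TypedFilter(func: String, child: Plan) extends Plan

object PushFilterBelowTypedFilter {
  // Moves every Filter that sits directly above a TypedFilter beneath it,
  // so the untyped predicate ends up closer to the data source.
  def apply(plan: Plan): Plan = plan match {
    case Filter(cond, TypedFilter(func, grandChild)) =>
      // Swap the two operators, then keep pushing the Filter further down.
      TypedFilter(func, apply(Filter(cond, grandChild)))
    case Filter(cond, child) =>
      val newChild = apply(child)
      // Retry only if the child changed; otherwise the plan is already stable.
      if (newChild == child) plan else apply(Filter(cond, newChild))
    case TypedFilter(func, child) =>
      TypedFilter(func, apply(child))
    case r: Relation => r
  }
}

object Demo extends App {
  // Shape of df.filter(SomeTypeFilter).filter("id=1"): Filter on top.
  val userOrder = Filter("id=1", TypedFilter("SomeTypeFilter", Relation("t")))
  // Shape of df.filter("id=1").filter(SomeTypeFilter): Filter at the bottom.
  val preferred = TypedFilter("SomeTypeFilter", Filter("id=1", Relation("t")))

  println(PushFilterBelowTypedFilter(userOrder) == preferred)
}
```

With a rule along these lines, both ways of writing the query would end up with the same plan shape, so users would not need to know which order to call the filters in.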
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]