Github user srinathshankar commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14912#discussion_r77748668
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala ---
    @@ -171,6 +172,27 @@ class FilterPushdownSuite extends PlanTest {
         comparePlans(optimized, correctAnswer)
       }
     
    +  test("push down filters that are combined") {
    +    // The following predicate ('a === 2 || 'a === 3) && ('c > 10 || 'a === 2)
    +    // will be simplified as ('a == 2) || ('c > 10 && 'a == 3).
    +    // ('a === 2 || 'a === 3) can be pushed down. But the simplified one can't.
    --- End diff --
    
    I agree with you that we should respect the interaction between
    CombineFilters, PushDownPredicates and other rules. I do think it's
    important that CNF conversion runs before any of the push-down / reordering
    rules, and that the simplification rules run afterwards.
    My concern with rolling this into CombineFilters is that it doesn't get
    triggered unless there are adjoining Filter nodes. In the example you have:
    val originalQuery = testRelation
      .select('a, 'b, ('c + 1) as 'cc)
      .groupBy('a)('a, count('cc) as 'c)
      .where('c > 10)
      .where(('a === 2) || ('c > 10 && 'a === 3))
    
    I think that (a == 2 || a == 3) should get pushed down even if you don't
    have ".where('c > 10)", but I'm not sure that it will be, since toCNF is in
    CombineFilters. Could you confirm?
    My suggestion is that toCNF warrants a separate rule. For example, when
    you're doing joins and you have
    select * from A inner join C on (A.a1 = C.c1) where A.a2 = 2 || (C.c2 = 10 && A.a2 = 3),
    you want (A.a2 = 2 || A.a2 = 3) pushed down into A.
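    To make the join case concrete, here is a rough, self-contained sketch of why CNF conversion exposes the pushable conjunct. It uses a toy expression tree, not Catalyst's Expression classes; the names Pred, And, Or, toCNF, conjuncts, references and pushable are all hypothetical and only illustrate the idea, they are not the rule in this PR.

```scala
object CnfSketch {
  // Toy predicate tree (an assumption for illustration, not Catalyst's API).
  sealed trait Expr
  case class Pred(text: String, attrs: Set[String]) extends Expr
  case class And(left: Expr, right: Expr) extends Expr
  case class Or(left: Expr, right: Expr) extends Expr

  // Distribute OR over AND until the tree is a conjunction of disjunctions.
  def toCNF(e: Expr): Expr = e match {
    case And(l, r) => And(toCNF(l), toCNF(r))
    case Or(l, r) =>
      (toCNF(l), toCNF(r)) match {
        case (And(a, b), c) => And(toCNF(Or(a, c)), toCNF(Or(b, c)))
        case (a, And(b, c)) => And(toCNF(Or(a, b)), toCNF(Or(a, c)))
        case (a, b)         => Or(a, b)
      }
    case p: Pred => p
  }

  // Split a predicate on its top-level ANDs.
  def conjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => conjuncts(l) ++ conjuncts(r)
    case other     => Seq(other)
  }

  // All attributes a predicate mentions.
  def references(e: Expr): Set[String] = e match {
    case Pred(_, attrs) => attrs
    case And(l, r)      => references(l) ++ references(r)
    case Or(l, r)       => references(l) ++ references(r)
  }

  // Conjuncts of the CNF form that only touch the attributes of one join side.
  def pushable(e: Expr, available: Set[String]): Seq[Expr] =
    conjuncts(toCNF(e)).filter(c => references(c).subsetOf(available))

  def show(e: Expr): String = e match {
    case Pred(t, _) => t
    case And(l, r)  => s"(${show(l)} && ${show(r)})"
    case Or(l, r)   => s"(${show(l)} || ${show(r)})"
  }

  def main(args: Array[String]): Unit = {
    // where A.a2 = 2 || (C.c2 = 10 && A.a2 = 3)
    val where = Or(
      Pred("A.a2 = 2", Set("A.a2")),
      And(Pred("C.c2 = 10", Set("C.c2")), Pred("A.a2 = 3", Set("A.a2"))))
    val sideA = Set("A.a1", "A.a2")

    // Without CNF the whole predicate is a single conjunct that mentions C.c2.
    val direct = conjuncts(where).filter(c => references(c).subsetOf(sideA))
    println("pushable without CNF: " + direct.map(show).mkString(", "))
    println("pushable with CNF:    " + pushable(where, sideA).map(show).mkString(", "))
  }
}
```

    In this sketch the original predicate yields nothing that can be pushed into A, while its CNF form (A.a2 = 2 || C.c2 = 10) && (A.a2 = 2 || A.a2 = 3) has one conjunct that only references A. Note that no adjoining Filter nodes are involved, which is why I'd rather not tie the conversion to CombineFilters.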

