Xiao Li created SPARK-13919:
-------------------------------

             Summary: Resolving the Conflicts of ColumnPruning and PushPredicateThroughProject
                 Key: SPARK-13919
                 URL: https://issues.apache.org/jira/browse/SPARK-13919
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Xiao Li


Currently, {{ColumnPruning}} and {{PushPredicateThroughProject}} reverse each 
other's effect. Although this does not currently cause the optimizer to reach 
its maximum number of iterations, some queries do not end up with the best plan. 

For example, in the following query, 
{code}
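    // Imports assumed from Spark's Catalyst optimizer test suites (2.0.0 development branch)
    import org.apache.spark.sql.catalyst.dsl.expressions._
    import org.apache.spark.sql.catalyst.dsl.plans._
    import org.apache.spark.sql.catalyst.expressions._
    import org.apache.spark.sql.catalyst.expressions.aggregate._
    import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
    val input = LocalRelation('a.int, 'b.string, 'c.double, 'd.int)
    val originalQuery =
      input.select('a, 'b, 'c, 'd,
        WindowExpression(
          AggregateExpression(Count('b), Complete, isDistinct = false),
          WindowSpecDefinition('a :: Nil,
            SortOrder('b, Ascending) :: Nil,
            UnspecifiedFrame)).as('window))
        .where('window > 1)
        .select('a, 'c)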
{code}

{code}
Project [a#0,c#0]
+- Filter (window#0L > cast(1 as bigint))
   +- Project [a#0,c#0,window#0L]
      +- Window [(count(b#0),mode=Complete,isDistinct=false) windowspecdefinition(a#0, b#0 ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS window#0L], [a#0], [b#0 ASC]
         +- Project [a#0,b#0,c#0]
            +- LocalRelation [a#0,b#0,c#0,d#0]
{code}

{code}
Project [a#0,c#0]
+- Filter (window#0L > cast(1 as bigint))
   +- Project [a#0,c#0,window#0L]
      +- Window [(count(b#0),mode=Complete,isDistinct=false) windowspecdefinition(a#0, b#0 ASC, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS window#0L], [a#0], [b#0 ASC]
         +- LocalRelation [a#0,b#0,c#0,d#0]
{code}
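
To make "reverse each other's effect" concrete, here is a minimal, self-contained toy model in plain Scala. It is only a sketch under stated assumptions: the {{Plan}} types and the three rules ({{prune}}, {{push}}, {{collapse}}) are hypothetical simplifications written for this illustration, not Spark's actual {{ColumnPruning}} or {{PushPredicateThroughProject}} implementations. In this toy, one rule inserts a pruning Project under a Filter, and the other two move the Filter below it and then remove it again, so the plan ends up where it started.

```scala
// Toy model only: these types and rules are hypothetical, not Spark classes.
sealed trait Plan { def output: Set[String] }
final case class Relation(output: Set[String]) extends Plan
final case class Project(cols: Set[String], child: Plan) extends Plan {
  def output: Set[String] = cols
}
final case class Filter(col: String, child: Plan) extends Plan {
  def output: Set[String] = child.output
}

object ReversingRulesDemo {
  type Rule = PartialFunction[Plan, Plan]

  // Apply a rule once at every node, children first.
  def transformUp(plan: Plan)(rule: Rule): Plan = {
    val rewritten = plan match {
      case Project(cols, child) => Project(cols, transformUp(child)(rule))
      case Filter(col, child)   => Filter(col, transformUp(child)(rule))
      case r: Relation          => r
    }
    rule.applyOrElse(rewritten, identity[Plan])
  }

  private def properSubset(a: Set[String], b: Set[String]): Boolean =
    a.subsetOf(b) && a != b

  // Toy "column pruning": insert a Project under a Filter when the
  // Filter's child produces more columns than are needed above it.
  val prune: Rule = {
    case Project(need, Filter(col, child)) if properSubset(need + col, child.output) =>
      Project(need, Filter(col, Project(need + col, child)))
  }

  // Toy "predicate pushdown": move a Filter below a Project that
  // passes the filtered column through unchanged.
  val push: Rule = {
    case Filter(col, Project(cols, child)) if cols.contains(col) =>
      Project(cols, Filter(col, child))
  }

  // Toy "project collapsing": drop an inner Project made redundant
  // by a narrower outer Project.
  val collapse: Rule = {
    case Project(outer, Project(inner, child)) if outer.subsetOf(inner) =>
      Project(outer, child)
  }

  val start: Plan =
    Project(Set("a", "c"), Filter("window", Relation(Set("a", "b", "c", "window"))))
  val pruned: Plan    = transformUp(start)(prune)
  val pushed: Plan    = transformUp(pruned)(push)
  val collapsed: Plan = transformUp(pushed)(collapse)

  def main(args: Array[String]): Unit = {
    println(s"start:     $start")
    println(s"pruned:    $pruned")
    println(s"pushed:    $pushed")
    println(s"collapsed: $collapsed")
    println(s"back to start: ${collapsed == start}")
  }
}
```

In this toy the cycle is exact, which is stronger than what is reported above for Spark (where the iteration limit is not reached), so it should be read only as an illustration of how a pruning Project can be introduced by one rule and undone by others.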



