[ 
https://issues.apache.org/jira/browse/SPARK-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-14939:
----------------------------------
    Description: 
This issue aims to add new FoldablePropagation optimizer that propagates 
foldable expressions up over the logical plan nodes. Currently, ORDER BY, GROUP 
BY, Nested-SELECT clauses are supported, and aliases and ordinal expressions 
are the main target to be transformed after propagation. Other optimizations 
will take advantage of the propagated foldable expressions: e.g. EliminateSorts 
optimizer now can handle Case 2 and 3. (Case 1 is the previous implementation.)

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}

  was:
This issue aims to improve `EliminateSorts` optimizer to handle ordinal (case 
2) and alias (case 3). Case 1 is the current implementation.

1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"

**Before**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
:     +- INPUT
+- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
   +- WholeStageCodegen
      :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
      :     +- INPUT
      +- Scan OneRowRelation[]
{code}

**After**
{code}
scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
== Physical Plan ==
WholeStageCodegen
:  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
:     +- INPUT
+- Scan OneRowRelation[]
{code}

    Component/s:     (was: SQL)
                 Optimizer
        Summary: [SPARK-14939][SQL] Add FoldablePropagation optimizer  (was: 
Improve EliminateSorts optimizer to handle Ordinal/Alias SortOrders)

> [SPARK-14939][SQL] Add FoldablePropagation optimizer
> ----------------------------------------------------
>
>                 Key: SPARK-14939
>                 URL: https://issues.apache.org/jira/browse/SPARK-14939
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>            Reporter: Dongjoon Hyun
>
> This issue aims to add new FoldablePropagation optimizer that propagates 
> foldable expressions up over the logical plan nodes. Currently, ORDER BY, 
> GROUP BY, Nested-SELECT clauses are supported, and aliases and ordinal 
> expressions are the main target to be transformed after propagation. Other 
> optimizations will take advantage of the propagated foldable expressions: 
> e.g. EliminateSorts optimizer now can handle Case 2 and 3. (Case 1 is the 
> previous implementation.)
> 1. Literals and foldable expression, e.g. "ORDER BY 1.0, 'abc', Now()"
> 2. Foldable ordinals, e.g. "SELECT 1.0, 'abc', Now() ORDER BY 1, 2, 3"
> 3. Foldable aliases, e.g. "SELECT 1.0 x, 'abc' y, Now() z ORDER BY x, y, z"
> **Before**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Sort [1.0#5 ASC,x#0 ASC], true, 0
> :     +- INPUT
> +- Exchange rangepartitioning(1.0#5 ASC, x#0 ASC, 200), None
>    +- WholeStageCodegen
>       :  +- Project [1.0 AS 1.0#5,1461873043577000 AS x#0]
>       :     +- INPUT
>       +- Scan OneRowRelation[]
> {code}
> **After**
> {code}
> scala> sql("SELECT 1.0, Now() x ORDER BY 1, x").explain
> == Physical Plan ==
> WholeStageCodegen
> :  +- Project [1.0 AS 1.0#5,1461873079484000 AS x#0]
> :     +- INPUT
> +- Scan OneRowRelation[]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to