GitHub user clockfly opened a pull request:

    https://github.com/apache/spark/pull/13271

    [SPARK-15495][SQL][WIP] Improve the explain output

    ## What changes were proposed in this pull request?
    
    Improve the output of `explain`. After this change, it looks like this:
    ```
    scala> Seq((1,2,3)).toDF("a","b","c").createTempView("src")

    scala> Seq((1,2,3)).toDF("a","b","c").createTempView("src1")

    scala> spark.sql("select src.a,sum(src1.b) from src join src1 where src.a=src1.b group by src.a").explain(true)
    == Parsed Logical Plan ==
    Aggregate groupingExpressions=['src.a], aggregateExpressions=['src.a,unresolvedalias('sum('src1.b))]
    +- Filter ('src.a = 'src1.b)
       +- Join Inner
          :- UnresolvedRelation `src`
          +- UnresolvedRelation `src1`
    
    == Analyzed Logical Plan ==
    a: int, sum(b): bigint
    Aggregate groupingExpressions=[a#25: int], aggregateExpressions=[a#25: int,sum AS sum(b)#45L]
    +- Filter (a#25: int = b#8: int)
       +- Join Inner
          :- SubqueryAlias alias=src
          :  +- Project projectList=[_1#21: int AS a#25,_2#22: int AS b#26,_3#23: int AS c#27]
          :     +- LocalRelation output=[_1#21: int,_2#22: int,_3#23: int]
          +- SubqueryAlias alias=src1
             +- Project projectList=[_1#3: int AS a#7,_2#4: int AS b#8,_3#5: int AS c#9]
                +- LocalRelation output=[_1#3: int,_2#4: int,_3#5: int]
    
    == Optimized Logical Plan ==
    Aggregate groupingExpressions=[a#25: int], aggregateExpressions=[a#25: int,sum AS sum(b)#45L]
    +- Join Inner, Some((a#25: int = b#8: int))
       :- LocalRelation output=[a#25: int]
       +- LocalRelation output=[b#8: int]
    
    == Physical Plan ==
    *TungstenAggregate key=[a#25: int], functions=[sum(Final)], output=[a#25: int,sum(b)#45: bigint]
    +- Exchange hashpartitioning(expressions=[a#25: int], numPartitions=200), output=[a#25: int,sum#47: bigint]
       +- *TungstenAggregate key=[a#25: int], functions=[sum(Partial)], output=[a#25: int,sum#47: bigint]
          +- *BroadcastHashJoin leftKeys=[a#25: int], rightKeys=[b#8: int], Inner, BuildRight, output=[a#25: int,b#8: int]
             :- LocalTableScan rows=[[1]], output=[a#25: int]
             +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint))), output=[b#8: int]
                +- LocalTableScan rows=[[2]], output=[b#8: int]
    
    scala> spark.sql("select src.a,sum(src1.b) from src join src1 where src.a=src1.b group by src.a").explain()
    == Physical Plan ==
    *TungstenAggregate [a#25: int], [sum]
    +- Exchange hashpartitioning([a#25: int], 200)
       +- *TungstenAggregate [a#25: int], [sum]
          +- *BroadcastHashJoin [a#25: int], [b#8: int], Inner, BuildRight
             :- LocalTableScan [[1]]
             +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
                +- LocalTableScan [[2]]
    ```
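
    For context, the plans that `explain(true)` prints come from `Dataset.queryExecution`, so they can also be inspected programmatically; the `*` prefix in the physical plan marks operators covered by whole-stage code generation. A minimal sketch, assuming the `src` and `src1` temp views registered above:

    ```scala
    // Sketch (not part of this patch): read the individual plans that
    // explain(true) prints directly off Dataset.queryExecution.
    // Assumes the `src` and `src1` temp views created above exist.
    val df = spark.sql(
      "select src.a, sum(src1.b) from src join src1 " +
      "where src.a = src1.b group by src.a")

    println(df.queryExecution.logical)       // == Parsed Logical Plan ==
    println(df.queryExecution.analyzed)      // == Analyzed Logical Plan ==
    println(df.queryExecution.optimizedPlan) // == Optimized Logical Plan ==
    println(df.queryExecution.executedPlan)  // == Physical Plan ==
    ```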
    
    
    ## How was this patch tested?
    Manual test.
    
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/clockfly/spark verbose3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13271.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13271
    
----
commit d48c35b9d30359d9eae21ec1c0ee6ed917498d28
Author: Sean Zhong <[email protected]>
Date:   2016-05-11T18:06:52Z

    Improve the explain output

----

