Github user maropu commented on the issue:
https://github.com/apache/spark/pull/20174
(This is unrelated to this PR and fairly trivial, but I'll leave a comment
here anyway.) `PropagateEmptyRelation` does not collapse
`spark.emptyDataFrame.dropDuplicates` because `spark.emptyDataFrame` is backed
by an `ExistingRDD` rather than an empty `LocalRelation`;
```
scala> spark.emptyDataFrame.dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate
+- AnalysisBarrier LogicalRDD false
== Analyzed Logical Plan ==
Deduplicate
+- LogicalRDD false
== Optimized Logical Plan ==
Aggregate
+- LogicalRDD false
== Physical Plan ==
*HashAggregate(keys=[], functions=[], output=[])
+- Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[], output=[])
+- Scan ExistingRDD[]
scala> Seq.empty[Tuple2[Int, Int]].toDF("a",
"b").dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate [a#8, b#9]
+- AnalysisBarrier Project [_1#5 AS a#8, _2#6 AS b#9]
== Analyzed Logical Plan ==
a: int, b: int
Deduplicate [a#8, b#9]
+- Project [_1#5 AS a#8, _2#6 AS b#9]
+- LocalRelation <empty>, [_1#5, _2#6]
== Optimized Logical Plan ==
LocalRelation <empty>, [a#8, b#9]
== Physical Plan ==
LocalTableScan <empty>, [a#8, b#9]
```
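The gist of why the rule fires in the second case but not the first can be sketched with a tiny stand-in plan tree. All names below (`Plan`, `EmptyLocalRelation`, `LogicalRDDScan`, `Deduplicate`, `propagateEmpty`) are illustrative placeholders, not the real Catalyst classes:

```scala
// Hypothetical miniature of the idea behind PropagateEmptyRelation.
sealed trait Plan
case object EmptyLocalRelation extends Plan        // provably empty relation
case object LogicalRDDScan extends Plan            // opaque RDD scan; emptiness unknown
case class Deduplicate(child: Plan) extends Plan

// The rule only fires when the child is *provably* empty. An RDD-backed
// scan gives the optimizer no such proof, so Deduplicate survives on it.
def propagateEmpty(plan: Plan): Plan = plan match {
  case Deduplicate(EmptyLocalRelation) => EmptyLocalRelation
  case other                           => other
}
```

Under this sketch, `Deduplicate(EmptyLocalRelation)` collapses to `EmptyLocalRelation`, while `Deduplicate(LogicalRDDScan)` is left untouched, mirroring the two plans above.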