[GitHub] spark pull request #23128: [SPARK-26142][SQL] Support passing shuffle metric...

xuanyuanking Tue, 27 Nov 2018 04:45:01 -0800

GitHub user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23128#discussion_r236646423
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ---
    @@ -154,7 +156,14 @@ class ShuffledRowRDD(
     
       override def compute(split: Partition, context: TaskContext): Iterator[InternalRow] = {
         val shuffledRowPartition = split.asInstanceOf[ShuffledRowRDDPartition]
    -    val metrics = context.taskMetrics().createTempShuffleReadMetrics()
    +    val tempMetrics = context.taskMetrics().createTempShuffleReadMetrics()
    +    // metrics here could be empty cause user can use ShuffledRowRDD directly,
    --- End diff --
    
    ```
    do you mean we may leave the metrics empty when creating ShuffledRowRDD in tests?
    ```
    Yes, like we did in `UnsafeRowSerializerSuite`.
    ```
    I don't think we need to consider this case since ShuffledRowRDD is a private API
    ```
    Got it. After searching for `new ShuffledRowRDD` across the source code, `UnsafeRowSerializerSuite` is the only place, so I'll change the test and delete the default value of `metrics` in this commit.
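
    To illustrate what deleting the default value means for callers, here is a minimal sketch. `Metric`, `ReaderWithDefault`, and `ReaderRequired` are hypothetical stand-ins invented for this example, not the real Spark classes, and the `"recordsRead"` key is likewise an assumption.

```scala
// Hypothetical stand-in for a SQL metric; not the real Spark class.
final class Metric(val name: String) {
  var value: Long = 0L
  def add(n: Long): Unit = value += n
}

// With a default value, a caller (for example a test) can construct the
// reader without any metrics, so updates may silently go nowhere.
class ReaderWithDefault(metrics: Map[String, Metric] = Map.empty) {
  def recordRead(n: Long): Unit = metrics.get("recordsRead").foreach(_.add(n))
}

// Without a default value, `new ReaderRequired()` no longer compiles:
// every call site has to decide explicitly which metrics to pass.
class ReaderRequired(metrics: Map[String, Metric]) {
  def recordRead(n: Long): Unit = metrics.get("recordsRead").foreach(_.add(n))
}
```

    In this sketch a test that previously relied on the default would have to pass a map explicitly, which is the kind of change described above for `UnsafeRowSerializerSuite`.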



---

