Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22232
I'm not sure we can test this case easily, though. For example, how about the sequence below?
```
import org.apache.spark.TaskContext

spark.range(10).selectExpr("id AS c0", "rand() AS c1").write.parquet("/tmp/t1")
val df = spark.read.parquet("/tmp/t1")
val fileScanRdd =
  df.repartition(1).queryExecution.executedPlan.children(0).children(0).execute()
fileScanRdd.mapPartitions { part =>
  println(s"Initial bytesRead=${TaskContext.get.taskMetrics().inputMetrics.bytesRead}")
  TaskContext.get.addTaskCompletionListener[Unit] { taskCtx =>
    // Check if the metric is correct
    println(s"Total bytesRead=${taskCtx.taskMetrics().inputMetrics.bytesRead}")
  }
  part
}.collect
```
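For context on why the completion listener is the right place to check the final value: listeners registered via `addTaskCompletionListener` run only after the task body has finished, so any counters updated during the scan are fully accumulated by the time the listener fires. A minimal sketch of that ordering, using a hypothetical stand-in context (not Spark's actual `TaskContext` implementation):

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in: completion listeners registered during the task
// body are invoked only once the task is marked complete.
class MiniTaskContext {
  private val listeners = ArrayBuffer.empty[MiniTaskContext => Unit]
  var bytesRead: Long = 0L

  def addTaskCompletionListener(f: MiniTaskContext => Unit): Unit =
    listeners += f

  def markTaskCompleted(): Unit = listeners.foreach(_(this))
}

val ctx = new MiniTaskContext
val seen = ArrayBuffer.empty[Long]

// Task body: register the listener first, then "read" some bytes.
ctx.addTaskCompletionListener(c => seen += c.bytesRead)
ctx.bytesRead += 1024L
ctx.markTaskCompleted()

// The listener observed the fully accumulated counter, not the value
// at registration time.
println(seen)
```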