Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22232
I'm not sure we can test this case easily, though. For example, how about the sequence below?
```
import org.apache.spark.TaskContext

spark.range(10).selectExpr("id AS c0", "rand() AS c1").write.parquet("/tmp/t1")
val df = spark.read.parquet("/tmp/t1")
val fileScanRdd =
  df.repartition(1).queryExecution.executedPlan.children(0).children(0).execute()
fileScanRdd.mapPartitions { part =>
  println(s"Initial bytesRead=${TaskContext.get.taskMetrics().inputMetrics.bytesRead}")
  TaskContext.get.addTaskCompletionListener[Unit] { taskCtx =>
    // Check if the metric is correct
    println(s"Total bytesRead=${taskCtx.taskMetrics().inputMetrics.bytesRead}")
  }
  part
}.collect
```
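For context on why the completion listener is the right place to check the final value: listeners registered via `addTaskCompletionListener` run only after the task body has finished, so any counters updated during the scan are fully accumulated by the time the listener fires. A minimal sketch of that ordering, using a hypothetical stand-in context (not Spark's actual `TaskContext` implementation):

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in: completion listeners registered during the task
// body are invoked only once the task is marked complete.
class MiniTaskContext {
  private val listeners = ArrayBuffer.empty[MiniTaskContext => Unit]
  var bytesRead: Long = 0L

  def addTaskCompletionListener(f: MiniTaskContext => Unit): Unit =
    listeners += f

  def markTaskCompleted(): Unit = listeners.foreach(_(this))
}

val ctx = new MiniTaskContext
val seen = ArrayBuffer.empty[Long]

// Task body: register the listener first, then "read" some bytes.
ctx.addTaskCompletionListener(c => seen += c.bytesRead)
ctx.bytesRead += 1024L
ctx.markTaskCompleted()

// The listener observed the fully accumulated counter, not the value
// at registration time.
println(seen)
```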