cloud-fan commented on a change in pull request #27021: [SPARK-30362][Core]
Update InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#discussion_r361587858
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala
##########
@@ -56,6 +58,15 @@ class DataSourceRDD(
context.addTaskCompletionListener[Unit](_ => reader.close())
val iter = new Iterator[Any] {
private[this] var valuePrepared = false
+ private val inputMetrics = context.taskMetrics().inputMetrics
+ private val existingBytesRead = inputMetrics.bytesRead
Review comment:
I'm not sure we can support this if the data source doesn't report size
metrics. AFAIK DS v1 doesn't support it either; we only support it in the
file source.
I think we can only support the "recordsRead" metric for now. We need to
design a general API for data sources to report metrics. cc @rdblue @brkyvz
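To illustrate the "recordsRead"-only approach, here is a minimal sketch of an iterator wrapper that counts records without needing any size information from the source. `SimpleInputMetrics` and `CountingIterator` are hypothetical stand-ins written for this example; they are not Spark classes, and the real change would update the task's `inputMetrics` inside `DataSourceRDD` instead.

```scala
// Hypothetical stand-in for Spark's InputMetrics; only the record counter is modeled.
final class SimpleInputMetrics {
  private var _recordsRead: Long = 0L
  def recordsRead: Long = _recordsRead
  def incRecordsRead(n: Long): Unit = _recordsRead += n
}

// Hypothetical wrapper: counts each element as it is handed to the caller.
// It needs no byte-size information from the underlying source.
final class CountingIterator[T](underlying: Iterator[T], metrics: SimpleInputMetrics)
    extends Iterator[T] {
  override def hasNext: Boolean = underlying.hasNext
  override def next(): T = {
    val value = underlying.next()
    metrics.incRecordsRead(1L)
    value
  }
}

object CountingIteratorExample {
  def main(args: Array[String]): Unit = {
    val metrics = new SimpleInputMetrics
    val rows = new CountingIterator(Iterator("a", "b", "c"), metrics)
    rows.foreach(_ => ()) // drain the iterator, as a task would
    println(metrics.recordsRead) // prints 3
  }
}
```

Counting in `next()` works for any reader, which is why it seems like the safe subset to support until sources have a general way to report bytes read.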