[
https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16716592#comment-16716592
]
ASF GitHub Bot commented on SPARK-26327:
----------------------------------------
xuanyuanking commented on a change in pull request #23277: [SPARK-26327][SQL]
Bug fix for `FileSourceScanExec` metrics update and name changing
URL: https://github.com/apache/spark/pull/23277#discussion_r240518885
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
##########
@@ -316,7 +313,7 @@ case class FileSourceScanExec(
override lazy val metrics =
Map("numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
"numFiles" -> SQLMetrics.createMetric(sparkContext, "number of files"),
- "metadataTime" -> SQLMetrics.createMetric(sparkContext, "metadata time (ms)"),
+ "fileListingTime" -> SQLMetrics.createMetric(sparkContext, "file listing time (ms)"),
Review comment:
Thanks, the PR description and title have been updated.
> Metrics in FileSourceScanExec not update correctly while
> relation.partitionSchema is set
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-26327
> URL: https://issues.apache.org/jira/browse/SPARK-26327
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Yuanjian Li
> Priority: Major
>
> With the current approach in `FileSourceScanExec`, the "numFiles" and
> "metadataTime" (fileListingTime) metrics are updated when the lazy val
> `selectedPartitions` is initialized, in the case where
> relation.partitionSchema is set. However, `selectedPartitions` is first
> initialized by `metadata`, which is called by `queryExecution.toString` in
> `SQLExecution.withNewExecutionId`. So when
> `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding
> entry in liveExecutions in SQLAppStatusListener yet, and the metrics update
> has no effect.
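
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not Spark's actual code: `ToyListener`, `ToyScan` and `LazyMetricDemo` are hypothetical names that stand in for the listener, the scan node and the execution wrapper. It assumes the listener silently drops updates for executions it has not yet registered as live.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the status listener: it only accepts driver
// metric updates for executions that were already registered as live.
class ToyListener {
  private val liveExecutions = mutable.Set.empty[Long]
  val recorded = mutable.Map.empty[String, Long]

  def onExecutionStart(id: Long): Unit = liveExecutions += id

  // Updates for unknown executions are silently dropped.
  def onDriverMetricUpdate(id: Long, name: String, value: Long): Unit =
    if (liveExecutions.contains(id)) recorded(name) = value
}

// Hypothetical stand-in for a scan node whose lazy val posts a metric as a
// side effect of its first (and only) evaluation.
class ToyScan(listener: ToyListener, executionId: Long) {
  lazy val selectedPartitions: Seq[String] = {
    val files = Seq("part-0", "part-1", "part-2")
    listener.onDriverMetricUpdate(executionId, "numFiles", files.size.toLong)
    files
  }

  // Building the description forces selectedPartitions, much like `metadata`.
  def describe: String = s"ToyScan(${selectedPartitions.size} files)"
}

object LazyMetricDemo {
  // Returns the metrics the listener ended up recording.
  def run(describeBeforeStart: Boolean): Map[String, Long] = {
    val listener = new ToyListener
    val scan = new ToyScan(listener, executionId = 1L)
    if (describeBeforeStart) scan.describe // forces the lazy val too early
    listener.onExecutionStart(1L)
    val files = scan.selectedPartitions    // body does not rerun if already forced
    listener.recorded.toMap
  }
}
```

In this sketch, `LazyMetricDemo.run(describeBeforeStart = true)` returns an empty map because the update was posted before the execution was live, while `LazyMetricDemo.run(describeBeforeStart = false)` returns `Map("numFiles" -> 3)`.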