AngersZhuuuu commented on a change in pull request #31522:
URL: https://github.com/apache/spark/pull/31522#discussion_r582610913
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
##########
@@ -221,7 +221,7 @@ object FileFormatWriter extends Logging {
       val (_, duration) = Utils.timeTakenMs { committer.commitJob(job, commitMsgs) }
       logInfo(s"Write Job ${description.uuid} committed. Elapsed time: $duration ms.")
-      processStats(description.statsTrackers, ret.map(_.summary.stats))
+      processStats(description.statsTrackers, ret.map(_.summary.stats), duration)
Review comment:
Just now, a friend asked me why his job took another 80 s to commit after it had finished.
```
21/02/25 15:42:12 INFO DAGScheduler: ResultStage 1 (run at AccessController.java:0) finished in 82.189 s
21/02/25 15:42:12 INFO DAGScheduler: Job 1 finished: run at AccessController.java:0, took 84.330846 s
21/02/25 15:43:38 INFO FileFormatWriter: Job null committed.
21/02/25 15:43:38 WARN DFSClient: Slow ReadProcessor read fields took 41202ms (threshold=30000ms); ack: seqno: 140 status: SUCCESS downstreamAckTimeNanos: 33201980 4: "\000", targets: [172.16.1.71:9866, 172.16.1.104:9866, 172.16.1.18:9866, 172.16.1.33:9866]
```
His SQL task ran for 80 s, the job commit took another 80 s, and the Hive metadata load took another 100 s.
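To illustrate the pattern the diff relies on, here is a minimal, self-contained sketch (not the actual Spark code) of how `Utils.timeTakenMs` works: it runs a block and returns the result together with the elapsed milliseconds, so the measured commit duration can then be forwarded to the stats trackers. `CommitTiming` and its `timeTakenMs` are hypothetical stand-ins written for this example.

```scala
// Sketch of the timing pattern used in FileFormatWriter: wrap a block,
// return (result, elapsedMs). In Spark this is Utils.timeTakenMs; here a
// simplified stand-in so the example is runnable on its own.
object CommitTiming {
  def timeTakenMs[T](body: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    // Simulate a slow commitJob call; in the PR the measured duration is
    // additionally passed to processStats(...), so slow commits (like the
    // 80 s gap in the log above) become visible to the stats trackers.
    val (_, duration) = timeTakenMs { Thread.sleep(50) }
    println(s"Write Job committed. Elapsed time: $duration ms.")
  }
}
```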