AngersZhuuuu commented on a change in pull request #31522:
URL: https://github.com/apache/spark/pull/31522#discussion_r582610913
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
##########
@@ -221,7 +221,7 @@ object FileFormatWriter extends Logging {
       val (_, duration) = Utils.timeTakenMs { committer.commitJob(job, commitMsgs) }
       logInfo(s"Write Job ${description.uuid} committed. Elapsed time: $duration ms.")
-      processStats(description.statsTrackers, ret.map(_.summary.stats))
+      processStats(description.statsTrackers, ret.map(_.summary.stats), duration)
Review comment:
Just now, a friend asked me why his job took another 80 s to commit after it had finished.
```
21/02/25 15:42:12 INFO DAGScheduler: ResultStage 1 (run at AccessController.java:0) finished in 82.189 s
21/02/25 15:42:12 INFO DAGScheduler: Job 1 finished: run at AccessController.java:0, took 84.330846 s
21/02/25 15:43:38 INFO FileFormatWriter: Job null committed.
21/02/25 15:43:38 WARN DFSClient: Slow ReadProcessor read fields took 41202ms (threshold=30000ms); ack: seqno: 140 status: SUCCESS downstreamAckTimeNanos: 33201980 4: "\000", targets: [172.16.1.71:9866, 172.16.1.104:9866, 172.16.1.18:9866, 172.16.1.33:9866]
```
His SQL task ran for 80 s, the job commit took another 80 s, and the Hive metadata load took another 100 s.
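To illustrate the pattern the diff relies on, here is a minimal, self-contained sketch (not the actual Spark code) of how `Utils.timeTakenMs` works: it runs a block and returns the result together with the elapsed milliseconds, so the measured commit duration can then be forwarded to the stats trackers. `CommitTiming` and its `timeTakenMs` are hypothetical stand-ins written for this example.

```scala
// Sketch of the timing pattern used in FileFormatWriter: wrap a block,
// return (result, elapsedMs). In Spark this is Utils.timeTakenMs; here a
// simplified stand-in so the example is runnable on its own.
object CommitTiming {
  def timeTakenMs[T](body: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    // Simulate a slow commitJob call; in the PR the measured duration is
    // additionally passed to processStats(...), so slow commits (like the
    // 80 s gap in the log above) become visible to the stats trackers.
    val (_, duration) = timeTakenMs { Thread.sleep(50) }
    println(s"Write Job committed. Elapsed time: $duration ms.")
  }
}
```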