warrenzhu25 commented on a change in pull request #29082:
URL: https://github.com/apache/spark/pull/29082#discussion_r494468057



##########
File path: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
##########
@@ -597,6 +598,20 @@ private[spark] class AppStatusListener(
           None
       }
       task.errorMessage = errorMessage
+      task.failureReason = event.reason match {

Review comment:
       Yes. In the previous version I passed the error message directly to 
build the failure reason, but that turned out to be complex and error-prone, 
because each failure reason formats its error string differently:
   1. `ExceptionFailure` is the most common and the easiest case. The message 
is just the exception's message, which we can parse from errorString.
   2. `ExecutorLostFailure` has an errorString like `ExecutorLostFailure 
(executor 72 exited caused by one of the running tasks) Reason: Executor 
heartbeat timed out after 123153 ms`. We can split the message out on the 
keyword `Reason:`.
   3. `FetchFailed` has an errorString like `FetchFailed($bmAddressString, 
shuffleId=$shuffleId, mapId=$mapId, reduceId=$reduceId, message=\n$message\n)`. 
We need to parse the message out of this.
   
   In summary, parsing couples the listener to the specific string format of 
each failure reason. If a format changes in the future, the parsing breaks 
easily. So I chose to get the reason directly from `TaskEndReason`.
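
   To make the trade-off concrete, here is a minimal sketch of the two 
approaches. The case classes are simplified stand-ins that only borrow the 
names of Spark's `TaskEndReason` subclasses. Their fields, and the helpers 
`parseExecutorLost` and `failureReason`, are assumptions for illustration 
and not the code in this PR.

```scala
// Simplified, hypothetical stand-ins for Spark's TaskEndReason hierarchy.
// The real Spark classes carry more fields than shown here.
sealed trait TaskEndReason
final case class ExceptionFailure(className: String, description: String) extends TaskEndReason
final case class ExecutorLostFailure(execId: String, reason: Option[String]) extends TaskEndReason
final case class FetchFailed(shuffleId: Int, mapId: Long, reduceId: Int, message: String) extends TaskEndReason
case object Success extends TaskEndReason

object FailureReasonSketch {
  // String-parsing approach: only works while the "Reason:" keyword
  // stays in the error string.
  def parseExecutorLost(errorString: String): Option[String] = {
    val marker = "Reason:"
    val idx = errorString.indexOf(marker)
    if (idx >= 0) Some(errorString.substring(idx + marker.length).trim) else None
  }

  // Structured approach: read the message from the reason object itself,
  // independent of how the error string is rendered.
  def failureReason(reason: TaskEndReason): Option[String] = reason match {
    case ExceptionFailure(_, description) => Some(description)
    case ExecutorLostFailure(_, r)        => r
    case FetchFailed(_, _, _, message)    => Some(message.trim)
    case _                                => None
  }
}
```

   With the structured version, a change to the rendered error string does 
not affect the extracted reason, whereas `parseExecutorLost` silently returns 
`None` as soon as the keyword is renamed.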






