agrawaldevesh commented on a change in pull request #29452:
URL: https://github.com/apache/spark/pull/29452#discussion_r475815059



##########
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##########
@@ -926,18 +926,21 @@ private[spark] class TaskSchedulerImpl(
         // and some of those can have isHostDecommissioned false. We merge them such that
         // if we heard isHostDecommissioned ever true, then we keep that one since it is
         // most likely coming from the cluster manager and thus authoritative
-        val oldDecomInfo = executorsPendingDecommission.get(executorId)
-        if (!oldDecomInfo.exists(_.isHostDecommissioned)) {
-          executorsPendingDecommission(executorId) = decommissionInfo
+        val oldDecomState = executorsPendingDecommission.get(executorId)
+        if (!oldDecomState.exists(_.isHostDecommissioned)) {
+          executorsPendingDecommission(executorId) = ExecutorDecommissionState(
+            decommissionInfo.message,
+            oldDecomState.map(_.startTime).getOrElse(clock.getTimeMillis()),

Review comment:
   Actually, as I was writing this I realized that this logic is more
complicated than it needs to be. The pithiness of using `exists` here makes the
code harder to follow, so I reverted to a plain and simple if-else.

   I also found a bug here: we are keeping the new message but the old
timestamp. We should consistently keep the old message with the old timestamp.
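
   To make the intended behaviour concrete, here is a minimal sketch of the
plain if-else merge. `DecomInfo`, `DecomState` and `DecomMerge` are hypothetical,
simplified stand-ins and not the actual Spark classes, and updating only the
`isHostDecommissioned` flag on a repeat notice is my assumption about what the
fix should look like:

```scala
// Hypothetical, simplified stand-ins for ExecutorDecommissionInfo and
// ExecutorDecommissionState; the real Spark classes may have different fields.
final case class DecomInfo(message: String, isHostDecommissioned: Boolean)
final case class DecomState(message: String, startTime: Long, isHostDecommissioned: Boolean)

object DecomMerge {
  // Merge an incoming decommission notice into the previously recorded state.
  def merge(old: Option[DecomState], incoming: DecomInfo, nowMillis: Long): DecomState = {
    if (old.isEmpty) {
      // First notice for this executor: record its message and the current time.
      DecomState(incoming.message, nowMillis, incoming.isHostDecommissioned)
    } else {
      val prev = old.get
      if (prev.isHostDecommissioned) {
        // The earlier notice is treated as authoritative: keep it untouched.
        prev
      } else {
        // Keep the old message together with the old timestamp; only the
        // host-decommissioned flag is taken from the new notice (assumption).
        prev.copy(isHostDecommissioned = incoming.isHostDecommissioned)
      }
    }
  }
}
```

   The point of the sketch is only that message and start time always travel
together, so a later notice can never pair a new message with an old timestamp.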
   
   Longer term, what do you think of adding a `fromClusterManager` field to
`ExecutorDecommissionInfo` that is set to false when the message comes from the
executor (perhaps in response to a SIGPWR there)? I tried doing this in this PR
but it blew up and I backed it out. We would then use `fromClusterManager` to
keep track of whether the prior decommission state is authoritative. Currently
this is inferred, somewhat hackily, from `isHostDecommissioned`, as stated in
the code comments.
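
   A rough sketch of how the merge might look with such a field. Everything
here is hypothetical: `fromClusterManager` does not exist today, and
`SourcedDecomInfo`, `SourcedDecomState` and `SourcedDecomMerge` are
illustrative names only, not Spark classes:

```scala
// Hypothetical shapes: fromClusterManager is a proposed field, not an existing one.
final case class SourcedDecomInfo(
    message: String,
    isHostDecommissioned: Boolean,
    fromClusterManager: Boolean)

final case class SourcedDecomState(
    message: String,
    startTime: Long,
    isHostDecommissioned: Boolean,
    authoritative: Boolean)

object SourcedDecomMerge {
  def merge(
      old: Option[SourcedDecomState],
      incoming: SourcedDecomInfo,
      nowMillis: Long): SourcedDecomState = old match {
    case None =>
      // First notice: it is authoritative only if the cluster manager sent it.
      SourcedDecomState(
        incoming.message, nowMillis, incoming.isHostDecommissioned, incoming.fromClusterManager)
    case Some(prev) if !prev.authoritative && incoming.fromClusterManager =>
      // A cluster manager notice upgrades an executor-originated state, but the
      // old message and old start time are kept together.
      prev.copy(isHostDecommissioned = incoming.isHostDecommissioned, authoritative = true)
    case Some(prev) =>
      // Either the prior state is already authoritative, or the new notice is
      // not from the cluster manager: keep what we have.
      prev
  }
}
```

   With an explicit source flag, authority no longer has to be inferred from
`isHostDecommissioned`, which would keep the two concerns separate.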






