agrawaldevesh commented on a change in pull request #29422:
URL: https://github.com/apache/spark/pull/29422#discussion_r470889645
##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -2022,16 +2024,25 @@ private[spark] class DAGScheduler(
blockManagerMaster.removeExecutor(execId)
clearCacheLocs()
}
- if (fileLost &&
-     (!shuffleFileLostEpoch.contains(execId) || shuffleFileLostEpoch(execId) < currentEpoch)) {
-   shuffleFileLostEpoch(execId) = currentEpoch
-   hostToUnregisterOutputs match {
-     case Some(host) =>
-       logInfo(s"Shuffle files lost for host: $host (epoch $currentEpoch)")
-       mapOutputTracker.removeOutputsOnHost(host)
-     case None =>
-       logInfo(s"Shuffle files lost for executor: $execId (epoch $currentEpoch)")
-       mapOutputTracker.removeOutputsOnExecutor(execId)
+ if (fileLost) {
Review comment:
Will do, good idea.
I added a comment on the caller. The changes in this function just tweak the
existing logic to honor the newly added flag, so I thought it would be more
useful to describe why the unconditional forcing is required when a host is
decommissioned.
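
To make the intended control flow concrete, here is a minimal, self-contained sketch of the shape this logic converges on. The flag name `ignoreShuffleFileLostEpoch` and the stubbed tracker methods are assumptions for illustration only; they stand in for the real `mapOutputTracker` calls shown in the diff, not the exact code in this PR.

```scala
import scala.collection.mutable

object ShuffleFileLossSketch {
  // Tracks the last epoch at which each executor's shuffle files were unregistered.
  private val shuffleFileLostEpoch = mutable.HashMap[String, Long]()

  // Stubs standing in for the real mapOutputTracker calls.
  private def removeOutputsOnHost(host: String): Unit =
    println(s"unregistering all shuffle outputs on host $host")
  private def removeOutputsOnExecutor(execId: String): Unit =
    println(s"unregistering shuffle outputs on executor $execId")

  // The epoch guard deduplicates repeated loss events for the same executor;
  // the flag (an assumed name) forces unregistration regardless of epoch,
  // which is what a host decommission needs.
  def handleFileLoss(
      execId: String,
      hostToUnregisterOutputs: Option[String],
      currentEpoch: Long,
      ignoreShuffleFileLostEpoch: Boolean): Unit = {
    val remove = if (ignoreShuffleFileLostEpoch) {
      true
    } else if (!shuffleFileLostEpoch.contains(execId) ||
        shuffleFileLostEpoch(execId) < currentEpoch) {
      shuffleFileLostEpoch(execId) = currentEpoch
      true
    } else {
      false
    }
    if (remove) {
      hostToUnregisterOutputs match {
        case Some(host) => removeOutputsOnHost(host)
        case None => removeOutputsOnExecutor(execId)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    handleFileLoss("exec-1", None, currentEpoch = 5L, ignoreShuffleFileLostEpoch = false) // unregisters
    handleFileLoss("exec-1", None, currentEpoch = 5L, ignoreShuffleFileLostEpoch = false) // deduped by epoch
    handleFileLoss("exec-1", Some("host-a"), currentEpoch = 5L, ignoreShuffleFileLostEpoch = true) // forced
  }
}
```

The second call is suppressed because the executor was already handled at that epoch, while the third goes through unconditionally: when a host is decommissioned we know all of its shuffle files are gone, so they must be unregistered no matter what the epoch map says.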