Github user SaintBacchus commented on the pull request:
https://github.com/apache/spark/pull/3771#issuecomment-68321575
What @tgravescs describes is close to the scenario, but it happens while the
RM is recovering after a failure.
```scala
if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
  unregister(finalStatus, finalMsg)
  cleanupStagingDir(fs)
}
```
In this code, `isLastAttempt` is not checked when `finalStatus` is
`FinalApplicationStatus.SUCCEEDED`, because `||` short-circuits.
During RM recovery, `isLastAttempt` is therefore never checked, since
yarn-client has had no chance to change the value of `finalStatus`. The code
goes straight to `unregister`, and the application cannot recover.
So yarn-client cannot support RM HA at the moment (yarn-cluster is OK).
Splitting `finalStatus` into two parts is an easy way to avoid this
problem, and it stays compatible with the previous design.
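
To make the short-circuit behaviour concrete, here is a minimal standalone sketch. It does not use the real Spark or YARN classes: `Status`, `Succeeded`, `Failed`, and `shouldUnregister` are hypothetical stand-ins for illustration, and the by-name parameter exists only so we can observe whether `isLastAttempt` gets evaluated.

```scala
object ShutdownSketch {
  // Hypothetical stand-in for YARN's FinalApplicationStatus (two values only).
  sealed trait Status
  case object Succeeded extends Status
  case object Failed extends Status

  // Same shape as the quoted condition. isLastAttempt is passed by name,
  // so it is evaluated only when the left-hand side of || is false.
  def shouldUnregister(finalStatus: Status, isLastAttempt: => Boolean): Boolean =
    finalStatus == Succeeded || isLastAttempt

  def main(args: Array[String]): Unit = {
    var checked = false
    // Records that it was called, then reports "not the last attempt".
    def lastAttempt: Boolean = { checked = true; false }

    // Status is still Succeeded because nothing changed it.
    val unregisters = shouldUnregister(Succeeded, lastAttempt)
    println(s"unregister=$unregisters, isLastAttempt evaluated=$checked")
  }
}
```

With the status left at `Succeeded`, the condition is true without consulting `isLastAttempt` at all, which matches the unregister-on-recovery behaviour described above.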