GitHub user jerryshao commented on the issue:
https://github.com/apache/spark/pull/19145
> But if we restart the RM, then the lost containers in the NM will be
> reported to the RM as lost again because of recovery

Since you have already enabled RM and NM recovery, IIUC a failure of the RM
or NM will not cause containers to exit. After the RM/NM restarts, it will
recover the persisted container metadata, so I think no lost containers
should be reported. Sorry, I'm not so familiar with this part of YARN.