Github user mxm commented on the issue:
https://github.com/apache/flink/pull/2257
Thank you for the pull request! Looking at the changes, I think this could
have been split into two pull requests and JIRA issues: 1) avoiding duplicate
RegisterTaskManager messages, and 2) changing the core behavior of the
ResourceManager.
Concerning 2, I would like to understand why it was necessary to change so
much code. It seems like it would have sufficed to change one line of code (not
clearing the bookkeeping on leadership change). I'm not saying your changes
don't make sense, but I don't think they are backed by the original JIRA issue.
I'm not sure about the role change of the RM in this PR. The RM should be
the authority for allocating new resources. If those resources are not properly
reported back to the RM (e.g. because of message loss), resource allocation
won't work properly.