Github user mccheah commented on the pull request:
https://github.com/apache/spark/pull/8007#issuecomment-137294983
Accidentally missed one of @andrewor14 's comments but everything else is
addressed I think.
One point to note: I moved the loss-reason-request-reply logic into
YarnAllocator. In order to find if a get-loss-reason-request is valid such that
the request would only be enqueued if the loss reason were found on a
subsequent allocateResources() call, I needed to check executorIdToContainer in
YarnAllocator. But that would require synchronizing on YarnAllocator... and in
my previous patch, I was also synchronizing on pendingLossReasonRequests in
ApplicationMaster, so there could have potentially been nested synchronized
blocks on different objects which is a little scary.
I just moved pendingExecutorLossReasonRequests into YarnAllocator and put
the context replying in processCompletedContainers() instead, so we only need
to synchronize strictly on YarnAllocator's methods.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]