Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/8007#discussion_r38495305
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -590,6 +605,18 @@ private[spark] class ApplicationMaster(
case None => logWarning("Container allocator is not ready to kill executors yet.")
}
context.reply(true)
+
+ case GetExecutorLossReason(eid) =>
+ Option(allocator) match {
+ case Some(a) =>
+ pendingLossReasonRequests.synchronized {
+ pendingLossReasonRequests
--- End diff --
When would multiple requests for the same executor arrive?
If that case is possible, there's a race that would lead to a slow leak.
When the allocator reports the executor's loss reason, you'll reply to the
existing pending requests; but if a new request for that executor's loss
reason arrives after that, it will never be replied to, and will live
forever in `pendingLossReasonRequests`.
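
To make the race concrete, here is a minimal, self-contained sketch of the
scenario I mean (the names, types, and structure are illustrative stand-ins,
not the actual patch): requests are buffered per executor id and answered when
the allocator reports a loss reason, but a request that arrives after that
point recreates the entry and is never answered.

```scala
import scala.collection.mutable

// Illustrative sketch only; not the ApplicationMaster code.
object PendingLossReasonSketch {
  type ExecutorId = String
  type Reply = String => Unit // stand-in for RpcCallContext.reply

  private val pending = mutable.Map[ExecutorId, mutable.Buffer[Reply]]()

  // RPC side: a GetExecutorLossReason-style request arrives for `eid`.
  def onRequest(eid: ExecutorId, reply: Reply): Unit = pending.synchronized {
    pending.getOrElseUpdate(eid, mutable.Buffer()) += reply
  }

  // Allocator side: the loss reason for `eid` becomes known.
  def onLossReason(eid: ExecutorId, reason: String): Unit = pending.synchronized {
    // Reply to everything buffered so far and drop the entry.
    pending.remove(eid).foreach(_.foreach(_(reason)))
    // A request for `eid` that arrives *after* this point recreates the
    // entry, but no further loss-reason notification will ever come for
    // that executor, so the new entry is never replied to or cleaned up.
  }
}
```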