Github user squito commented on the issue: https://github.com/apache/spark/pull/21068

I mean that when `YarnAllocatorBlacklistTracker` decides to blacklist because of allocation failures, it doesn't send any message back to the driver -- so the driver has no record in its logs, in the event log, or in the UI. In client mode, the user would need to fetch the AM logs to know what was going on. Attila wanted to do it this way because of `mostRelevantSubsetOfBlacklistedNodes` -- it seemed weird to send an update to the driver when the blacklisting wasn't necessarily even in effect. Now that I'm thinking about it, though, maybe it should just send the update anyway, even though that blacklist may effectively be ignored.

Re: starvation -- I agree, though "eventually" for resources can be so long in practice that to users it all looks the same. Anyway, although you say you're OK with removing the limit, it seems like you feel more strongly about this than I do. So I think we can keep it; I don't think it prevents us from doing something else down the road.

I do think we should add the notification to the driver, including a listener event, which just ignores `mostRelevantSubsetOfBlacklistedNodes`, unless anyone has a reason not to. I suggest @attilapiros does that in a followup.

If that plan sounds OK, then this is probably nearly ready to merge. But it's been a little while since I've looked closely, so I'll do another pass (probably tomorrow).
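To make the proposal concrete, here is a minimal, self-contained sketch of the flow being suggested. All names here (`NodeBlacklistedEvent`, `ListenerBus`, `AllocatorBlacklistTracker`) are simplified stand-ins invented for illustration, not Spark's actual classes or the real `YarnAllocatorBlacklistTracker` API: the tracker posts an event for every node it blacklists, independent of whether `mostRelevantSubsetOfBlacklistedNodes` later keeps that node in the subset actually reported to YARN.

```scala
import scala.collection.mutable

// Hypothetical event type, loosely modeled on Spark's listener events.
case class NodeBlacklistedEvent(time: Long, hostId: String, failures: Int)

trait Listener {
  def onNodeBlacklisted(event: NodeBlacklistedEvent): Unit
}

// Minimal synchronous bus standing in for the driver's listener bus.
class ListenerBus {
  private var listeners = List.empty[Listener]
  def addListener(l: Listener): Unit = listeners ::= l
  def post(event: NodeBlacklistedEvent): Unit =
    listeners.foreach(_.onNodeBlacklisted(event))
}

// Hypothetical tracker: blacklists a node after maxFailures allocation
// failures and ALWAYS posts the event, even if the node might later be
// dropped from the "most relevant subset" sent to YARN.
class AllocatorBlacklistTracker(bus: ListenerBus, maxFailures: Int) {
  private val failureCounts = mutable.Map.empty[String, Int].withDefaultValue(0)
  private val blacklisted = mutable.Set.empty[String]

  def recordAllocationFailure(host: String): Unit = {
    failureCounts(host) += 1
    // Set.add returns true only on first insertion, so each node's
    // blacklisting is reported to the driver exactly once.
    if (failureCounts(host) >= maxFailures && blacklisted.add(host)) {
      bus.post(NodeBlacklistedEvent(System.currentTimeMillis(), host,
        failureCounts(host)))
    }
  }

  def currentBlacklist: Set[String] = blacklisted.toSet
}
```

A driver-side listener subscribed to this bus could then log the event, feed the event log, and update the UI, which is exactly the visibility that is missing today in client mode.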