Github user squito commented on the issue:

    https://github.com/apache/spark/pull/21068
  
    I mean that when `YarnAllocatorBlacklistTracker` decides to blacklist a node 
because of allocation failures, it doesn't send any message back to the driver -- 
so the driver doesn't have a message in its logs, nor in the event log, nor a UI 
update.  So in client mode, the user would need to get the AM logs to know what 
was going on.
    
    Attila wanted to do it this way because of 
`mostRelevantSubsetOfBlacklistedNodes` -- it seemed weird to send an update to 
the driver when the blacklisting wasn't necessarily even in effect.  Though now 
that I'm thinking about it, maybe it should just send the update anyway, even 
though that blacklist may effectively be ignored.
    
    Re: starvation -- I agree, though "eventually" for resources can be so long 
in practice that to users it all looks the same.
    
    Anyway, though you say you're OK with removing the limit, it seems like you 
feel more strongly about this than I do.  So I think we can keep it; I don't 
think it prevents us from doing something else down the road.
    
    I do think we should add the notification to the driver, including a 
listener event, which just ignores `mostRelevantSubsetOfBlacklistedNodes`, 
unless anyone has a reason not to.  I suggest @attilapiros does that in a 
follow-up.
    
    If that plan sounds OK, then this is probably nearly ready to merge.  But 
it's been a little while since I've looked closely, so I'll do another pass 
(probably tomorrow).

