Github user squito commented on the issue:

    https://github.com/apache/spark/pull/15249
  
    @mridulm on the questions about expiry from blacklists, you are not missing 
anything -- this explicitly does not apply any timeouts at the taskset level 
(this is mentioned in the design doc).  The timeout code you see is mostly just 
incremental groundwork toward https://github.com/apache/spark/pull/14079, but 
it doesn't actually add any value here.
    
    The primary motivation for blacklisting that I've seen is actually quite 
different from the use case you are describing -- it's not to help deal w/ 
resource contention, but to deal w/ truly broken resources (a bad disk in all 
the cases I can think of).  In fact, for those cases 1 hour is really short -- 
users probably want something more like 6-12 hours.  But 1 hr really isn't so 
bad; it just means that the bad resources need to be "rediscovered" that often, 
with a scheduling hiccup while that happens.
    
    Your use case is really different -- it's a form of backoff to deal w/ 
resource contention.  I have actually talked to a couple of different folks 
about doing something like that recently and think it would be great, though I 
see problems with using blacklist timeouts for it, since other tasks can still 
be scheduled on those executors, and the timeout isn't relative to the task 
runtime, etc.
    
    Nonetheless, an issue here might be that the old option serves some purpose 
which is no longer supported.  Do we need to add it back in?  Just adding the 
logic for the timeouts again is pretty easy, though
    (a) I need to figure out the right place to do it so that it doesn't impact 
scheduling performance
    
    and more importantly
    
    (b) I really worry about being able to configure things so that 
blacklisting can actually handle totally broken resources.  E.g., say that you 
set the timeout to 10s.  If your tasks take 1 minute each, then your one bad 
executor might cycle through the leftover tasks, fail them all, pass the 
timeout, and repeat that cycle a few times till you go over 
spark.task.maxFailures.  I don't see a good way to deal w/ that while also 
setting a sensible timeout for the entire application.
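    To make the arithmetic concrete, here's a quick back-of-the-envelope sketch 
(the numbers are purely illustrative, assuming the default 
spark.task.maxFailures of 4):

```scala
// Back-of-the-envelope only, not Spark code: with a short per-task blacklist
// timeout and much longer task runtimes, a single bad executor can push a
// task over spark.task.maxFailures all by itself.
val blacklistTimeoutSec = 10  // the hypothetical 10s timeout from above
val taskRuntimeSec      = 60  // how long a healthy executor is busy per task
val maxFailures         = 4   // spark.task.maxFailures default

// Failures on the bad executor are near-instant (bad disk), so it can retry
// the same leftover task every time its blacklist entry expires.  Before any
// healthy executor frees up a slot, it gets roughly:
val attemptsBeforeHealthySlotFrees = taskRuntimeSec / blacklistTimeoutSec  // 6
println(attemptsBeforeHealthySlotFrees > maxFailures)  // true -> taskset aborted
```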
    
    Two other workarounds:
    
    (2) Just enable the per-task timeout when the legacy configuration is used, 
and leave it undocumented.  We don't change behavior then, but the 
configuration is kind of a mess, and it'll be a headache to continue 
maintaining this.
    
    (3) Add a timeout just to *taskset*-level blacklisting.  That is a behavior 
change from the existing blacklisting, which has a timeout per *task*, but it 
removes the interaction w/ spark.task.maxFailures that we've always got to 
tiptoe around.  I also think it might satisfy your use case even better.  I 
still don't think it's a great solution to the problem, and we need something 
else for handling this sort of backoff better, so I don't feel great about it 
getting shoved into this feature.
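    To be concrete about what I mean by (3), a rough sketch -- the names and 
structure are just illustrative, this isn't meant to be the real scheduler 
code:

```scala
import scala.collection.mutable

// Illustrative only: taskset-level blacklisting with an expiry time per
// executor, instead of the per-(task, executor) timeout the old behavior had.
class TasksetLevelBlacklist(timeoutMs: Long,
                            now: () => Long = () => System.currentTimeMillis()) {
  private val expiryByExecutor = mutable.HashMap.empty[String, Long]

  // Called once an executor crosses the failure threshold for this taskset.
  def blacklistExecutor(execId: String): Unit =
    expiryByExecutor(execId) = now() + timeoutMs

  // Checked at offer time; expired entries are dropped lazily so the check
  // stays cheap and doesn't need any separate cleanup pass.
  def isExecutorBlacklisted(execId: String): Boolean =
    expiryByExecutor.get(execId) match {
      case Some(expiry) if now() < expiry => true
      case Some(_) => expiryByExecutor.remove(execId); false
      case None => false
    }
}
```

    If the expiry check stays a lazy map lookup at offer time like this, it 
also keeps the cost on the scheduling path small, which is the worry in (a).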
    
    I'm thinking (3) is the best, but I'll give it a bit more thought.  Also 
pinging @kayousterhout @tgravescs @markhamstra for opinions, since this is a 
bigger design point to consider.

