Github user sebastienrainville commented on the pull request:

    https://github.com/apache/spark/pull/10924#issuecomment-216688931
  
    @mgummelt the problem we were seeing is that when running many Spark apps (~20),
most of them small (long-lived small streaming apps), the bigger apps only get
allocated a small number of cores even though the cluster still has plenty of
available cores. In that scenario the big apps stop receiving offers from Mesos,
because the small apps have a much smaller "max share" and therefore get the
offers first. With a low number of apps this is fine: the default `refuse_seconds`
value of 5 seconds gives Mesos enough time to cycle through every app and send
offers to each of them. But as the number of apps increases it becomes more and
more problematic, to the point where Mesos stops sending offers to the apps
ranked lowest by DRF, i.e. the big apps.
    
    The solution implemented in this PR is to refuse offers for a long period of
time once we know we don't need them anymore because the app has already
acquired `spark.cores.max`. The only case where we would need to acquire more
cores is if we lost an executor, so a value of `120s` for `refuse_seconds`
seems like a good tradeoff.
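
    For illustration, here is a minimal Scala sketch of that decline-with-filter
idea against the Mesos scheduler driver API. The surrounding state and the names
`maxCores`, `totalCoresAcquired`, `rejectOfferDurationForReachedMaxCores`, and
`handleOffer` are assumptions for the sketch, not the exact code in this PR:

```scala
import org.apache.mesos.Protos.{Filters, Offer}
import org.apache.mesos.SchedulerDriver

object OfferDeclineSketch {
  // Hypothetical stand-ins for the scheduler backend's real state.
  val maxCores: Int = 48                                     // spark.cores.max
  var totalCoresAcquired: Int = 0                            // cores already claimed by this app
  val rejectOfferDurationForReachedMaxCores: Double = 120.0  // seconds

  def handleOffer(driver: SchedulerDriver, offer: Offer): Unit = {
    if (totalCoresAcquired >= maxCores) {
      // The app already holds spark.cores.max, so ask Mesos not to re-offer
      // these resources for 120s instead of the default ~5s refuse_seconds.
      val filters = Filters.newBuilder()
        .setRefuseSeconds(rejectOfferDurationForReachedMaxCores)
        .build()
      driver.declineOffer(offer.getId, filters)
    } else {
      // Normal path: launch tasks on this offer and update totalCoresAcquired.
    }
  }
}
```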


