Allocation algorithm

Hans van den Bogert Mon, 24 Aug 2015 08:43:13 -0700

Can anyone explain how the Mesos allocation algorithm works?
Does Mesos offer every free resource it has to one framework at a time? Or does 
the allocator divide the maximum offer size by the number of active/registered 
frameworks?

Also, consider this case:
  FW1 has a high dominant resource fraction (>50%), which it does not release. 
FW2 and FW3 have a lot of task churn; both have outstanding short-lived tasks 
in their queues (shorter than the Mesos allocation interval), and both accept 
all resources Mesos has to offer, if they get the offer.
 
Reading the DRF paper and presentation, am I to assume that the online DRF 
algorithm would always favour FW2 and FW3 over FW1? One of the two (FW2/FW3) 
will always, or at least very likely, have a lower dominant resource fraction 
than FW1. According to the presentation on DRF, the framework with the lowest 
dominant resource fraction gets the offer. But this can lead to starvation, 
e.g. if a framework has already allocated memory but needs a new offer with 
CPUs to actually do something. You might wonder why a framework wouldn’t use 
memory AND CPU from the same offer, but Spark, for example, does exactly that: 
it holds memory and later needs separate offers for CPUs.
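
To make the scenario concrete, here is a minimal sketch of the selection rule 
as the DRF presentation describes it (lowest dominant share gets the next 
offer). It is not the actual Mesos allocator code; the function names, the 
cluster size and the per-framework allocations are made-up assumptions for 
illustration only.

```python
from typing import Dict

Resources = Dict[str, float]


def dominant_share(allocated: Resources, total: Resources) -> float:
    """Largest fraction of any single cluster resource held by a framework."""
    return max(allocated.get(r, 0.0) / total[r] for r in total)


def pick_next_framework(allocations: Dict[str, Resources], total: Resources) -> str:
    """Framework with the lowest dominant share; ties broken by name."""
    return min(
        sorted(allocations),
        key=lambda fw: dominant_share(allocations[fw], total),
    )


# Hypothetical cluster and allocations (assumed numbers, not measurements).
total = {"cpus": 100.0, "mem": 1000.0}
allocations = {
    "FW1": {"cpus": 5.0, "mem": 600.0},   # holds a lot of memory, few CPUs
    "FW2": {"cpus": 20.0, "mem": 50.0},   # short-lived tasks, high churn
    "FW3": {"cpus": 10.0, "mem": 50.0},   # short-lived tasks, high churn
}

shares = {fw: dominant_share(a, total) for fw, a in allocations.items()}
next_fw = pick_next_framework(allocations, total)
```

Under these assumed numbers FW1's dominant share is its memory (0.6), so it 
sorts behind FW2 (0.2) and FW3 (0.1) even though it holds the fewest CPUs. As 
long as FW2 and FW3 stay below 0.6, this rule never puts FW1 first.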


To give some context, I think I’m seeing this behaviour with Spark in 
fine-grained mode. I have 4 long-lived Spark instances, emulating interactive 
queries. The first Spark instance to get an offer “installs” executors (with 
high memory demand) on every slave node it sees. The next framework tries to 
do the same, but for these later instances there’s not always enough executor 
memory. As a result, the instance that was first to get the offer ends up 
holding a lot of memory that it doesn’t let go, but it also gets far fewer 
CPU offers afterwards. In contrast, the later Spark instances, with fewer 
long-living executors, do not have a high memory usage and get relatively 
more CPU offers. 
Of course, setting a maximum number of Spark executors per framework instance 
would mitigate this, but then I’m basically back to static allocation of 
resources.

Thanks in advance, 

Hans
