Re: Loadbalancer Improvements

Michael Marth Tue, 13 Jun 2017 01:43:11 -0700

Hi Markus,

I like the approach you have taken. Here’s a more general comment (relevant not 
only to this PR):
The stated motivation for the PR is to improve performance. I have no doubt 
that the (average) performance of action invocation is increased by this PR for 
most relevant traffic patterns one would see in real life. However, I think you 
make some implicit assumptions on what these traffic patterns look like (and 
also what the deployed topology looks like).


Which brings me to the actual comment :) - it would be great if there were also 
a performance test case that simulates the traffic patterns you have in mind. 
That would make it easier to discuss the improvement (e.g. some test like the 
one in your repo [1]).

(again, I have no doubt in this case that the PR will help performance)

my2c
Michael

[1] https://github.com/markusthoemmes/openwhisk-performance/pull/1


From: Markus Thömmes <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday 12 June 2017 20:20
To: "[email protected]" <[email protected]>
Subject: Loadbalancer Improvements

Hey folks,

it's me again with the latest news on performance :).

As some of you probably know: our current loadbalancer strategy is quite 
"simple" and doesn't take load in the system into account at all. It hops to 
the next available invoker after you've invoked an action X times (where X is a 
fixed value defined at deployment time). In many cases that's suboptimal 
behavior and induces lots of cold starts, even in a fairly unused system.
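
To make the hopping behavior concrete, here is a minimal sketch in Python. It 
is an illustration under assumptions, not the actual OpenWhisk code: the class 
name, the per-action counter, and the fixed start at invoker 0 are all made up 
for the example, and invoker health ("available") is left out.

```python
from collections import defaultdict


class HoppingBalancer:
    """Hypothetical model of the current strategy: hop every X invocations."""

    def __init__(self, num_invokers, invocations_per_hop):
        self.num_invokers = num_invokers
        self.invocations_per_hop = invocations_per_hop  # the fixed "X"
        self.counts = defaultdict(int)  # invocations seen so far, per action

    def pick(self, action):
        count = self.counts[action]
        self.counts[action] += 1
        # Assumption: every action starts at invoker 0. Load on the
        # invokers is never consulted, only the invocation count.
        return (count // self.invocations_per_hop) % self.num_invokers


balancer = HoppingBalancer(num_invokers=3, invocations_per_hop=2)
print([balancer.pick("hello") for _ in range(8)])
```

With three invokers and X = 2, a single action is moved to a new invoker on 
every third call even though nothing else is running, which is where the 
cold starts come from.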

To improve on this, here is a proposal to put the loadbalancer state we already 
have to good use.

In a nutshell, the plan is: Before you schedule to an invoker, take into 
account how much load is on that invoker. If it already seems full (determined 
by the number of outstanding active-ack responses), search for another invoker.
Via hashing, we define a home invoker for every subject/action combination. 
That is the invoker with the highest probability of having a warm container for 
that action. If that invoker is already busy, choose another invoker. 
"Stepping" through the invokers should be stable as well, as in: for a given 
subject/action, it should always try the invokers in the same order. That way, 
the probability of getting a warm container is higher than if we chose 
randomly, but of course it gets lower the more "hops" you need to make.
The step width is determined by hashing into a list of numbers that are coprime 
to the number of invokers in the system, to minimize collisions and chasing. 
Because the step is coprime to the number of invokers, stepping visits every 
invoker once before it repeats.
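
The scheme above can be sketched roughly as follows. This is a minimal Python 
illustration under assumptions, not the code from the pull request: the 
function names, the SHA-256 based hash, the `capacity` threshold, and 
returning `None` when every invoker is full are all choices made for the 
example.

```python
import hashlib
from math import gcd


def stable_hash(key):
    # Deterministic across processes (Python's built-in hash() of str is salted).
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def coprime_steps(num_invokers):
    # All step widths in 1..num_invokers that share no factor with num_invokers.
    return [s for s in range(1, num_invokers + 1) if gcd(s, num_invokers) == 1]


def probe_order(subject, action, num_invokers):
    """Stable order in which invokers are tried for one subject/action."""
    h = stable_hash(subject + "/" + action)
    home = h % num_invokers
    steps = coprime_steps(num_invokers)
    step = steps[(h // num_invokers) % len(steps)]
    # A step coprime to num_invokers visits every invoker exactly once.
    return [(home + i * step) % num_invokers for i in range(num_invokers)]


def schedule(subject, action, outstanding, capacity):
    """outstanding[i]: activations sent to invoker i with no active-ack yet."""
    for invoker in probe_order(subject, action, len(outstanding)):
        if outstanding[invoker] < capacity:
            return invoker
    return None  # assumption: the proposal does not say what happens here


order = probe_order("guest", "hello", 10)
print("probe order:", order)
print("chosen:", schedule("guest", "hello", [0] * 10, capacity=16))
```

On an idle system the home invoker (the first entry of the probe order) is 
always chosen. Once it has `capacity` outstanding activations, the next entry 
in the same fixed order is used.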

The proposal is expected to lead to a more stable warm-container rate and to 
better utilization of the system as a whole.

I already took a stab at implementing the proposal above. The pull-request can 
be found here: https://github.com/apache/incubator-openwhisk/pull/2360

As always: comments, objections, praise, all feedback is very welcome :)

Cheers,
Markus
