Perfect. Thanks. SJF
> On Feb 1, 2018, at 10:12 AM, Markus Thoemmes <[email protected]> > wrote: > > Hi Steve, > > fair point and that's exactly what we're trying to exploit here: While the > loadbalancer only sees 8 slots for example on that invoker, the invoker's > logic remains unchanged. It always sees the full-picture of its state and > doesn't care where the load is coming from. So OpenWhisk would (given the > load is slow enough) only provision one container for this action. Both > loadbalancers would operate using the same hashes and and stepSizes and thus > choose the very same invoker (given no other load in the system). > > Cheers. > > -----Stephen Fink <[email protected]> wrote: ----- > To: [email protected] > From: Stephen Fink <[email protected]> > Date: 02/01/2018 03:57PM > Subject: Re: Introducing sharding as an alternative for state sharing > > Hi Markus, > > Suppose I have a client which fires one action repeatedly in a loop. So the > server handles one invocation of this action at a time. > > Under the current system, OW will provision one container for this action, > and it will get reused for every invocation. > > Under horizontal sharding with factor N, it seems OW will provision N > containers — which is sub-optimal for several reasons (cold starts, memory > footprint, consumption of stem cells). > > Do you agree? > > If so — I wonder if there’s a hybrid strategy — suppose the load balancer > operates via horizontal sharding, but the invoke-side logic remains > unchanged. Then an invoker can continue to use all slots available to > maximize container reuse, while the load balancer avoids shared state. > > Make sense? > > SJF > > >> On Feb 1, 2018, at 9:25 AM, Markus Thoemmes <[email protected]> >> wrote: >> >> Hi folks, >> >> we (Christian Bickel and I) just opened a pull-request for comments on >> introducing a notion of sharding instead of sharing the state between our >> controllers (loadbalancers). It also addresses a few deficiencies of the old >> loadbalancer to remove any kinds of bottlenecks there and make it as fast as >> possible. >> >> Commit message for posterity: >> >> The current ContainerPoolBalancer suffers a couple of problems and >> bottlenecks: >> >> 1. Inconsistent state: The data-structures keeping the state for that >> loadbalancer are not thread-safely handled, meaning there can be queuing to >> some invokers even though there is free capacity on other invokers. >> 2. Asynchronously shared state: Sharing the state is needed for a >> high-available deployment of multiple controllers and for horizontal scale >> in those. Said state-sharing makes point 1 even worse and isn't anywhere >> fast enough to be able to efficiently schedule quick bursts. >> 3. Bottlenecks: Getting the state from the outside (like for the >> ActivationThrottle) is a very costly operation (at least in the shared state >> case) and actually bottlenecks the whole invocation path. Getting the >> current state of the invokers is a second bottleneck, where one request is >> made to the corresponding actor for each invocation. >> This new implementation aims to solve the problems mentioned above as >> follows: >> >> 1. All state is local: There is no shared state. Resources are managed >> through horizontal sharding. Horizontal sharding means: The invokers' slots >> are evenly divided between the loadbalancers in existence. If we deploy 2 >> loadbalancers and each invoker has 16 slots, each of the loadbalancers will >> have access to 8 slots on each invoker. >> 2. Slots are given away atomically: When scheduling an activation, the slot >> is immediately assigned to that activation (implemented through Semaphores). >> That means: Even in concurrent schedules, there will not be an overload on >> an invoker as long as there is capacity left on that invoker. >> 3. Asynchronous updates of slow data: Slowly changing data, like a change in >> the invoker's state, is asynchronously handled and updated to a local >> version of the state. Querying the state is as cheap as it can be. >> >> A few words on the implementation details: >> >> We chose to use horizontal sharding (evenly dividing the capacity of each >> invoker) vs. vertical sharding (evenly dividing the invokers as a whole) for >> the sake of staging these changes mainly. Once we divide vertically, we'll >> need a good loadbalancing strategy in front of our controllers themselves, >> to keep unnecessary cold-starts to a minimum and maximize container reuse. >> By dividing horizontally, we maintain the same reuse policies as today and >> can even keep the same loadbalancing strategies intact. Horizontal sharding >> of course only scales so far (maybe to 4 controllers, assuming 16 slots on >> each invoker) but it will give us time to figure out good strategies for >> vertical sharding and learn along the way. For vertical sharding to work, it >> will also be crucial to have the centralized overflow queue to be able to >> offload work between shards through workstealing. All in all: Vertical >> sharding is a much bigger change than horizontal sharding. >> >> We tried to implement everything in a single actor first, but that seemed to >> impose a bottleneck again. Note that this is very frequented code, it needs >> to be very tight. That might not match the actor model too well. >> >> Subsuming everything: This keeps all proposed changes intact (most notably >> Tyson's parallel executions and overloading queue). >> >> A note on the gains made by this: Our non-blocking invoke performance is now >> quite close to the raw Kafka produce performance that we have in the system >> (not that it's good in itself, but that's the next theoretical bottleneck). >> Before the changes, this was roughly bottlenecked to 4k requests per second >> on a sufficiently powerful machine. Blocking invoke performance was roughly >> doubled. >> >> Any comments, thoughts? >> >> Cheers, >> Markus >> > >
