Re: Introducing sharding as an alternative for state sharing

Stephen Fink Thu, 01 Feb 2018 06:57:52 -0800

Hi Markus,

Suppose I have a client which fires one action repeatedly in a loop.   So the 
server handles one invocation of this action at a time.


Under the current system,  OW will provision one container for this action, and 
it will get reused for every invocation.

Under horizontal sharding with factor N,  it seems OW will provision N 
containers — which is sub-optimal for several reasons (cold starts, memory 
footprint, consumption of stem cells).

Do you agree?

If so — I wonder if there’s a hybrid strategy — suppose the load balancer 
operates via horizontal sharding, but the invoke-side logic remains unchanged.  
 Then an invoker can continue to use all slots available to maximize container 
reuse, while the load balancer avoids shared state.

Make sense?

SJF


> On Feb 1, 2018, at 9:25 AM, Markus Thoemmes <[email protected]> 
> wrote:
> 
> Hi folks,
> 
> we (Christian Bickel and I) just opened a pull-request for comments on 
> introducing a notion of sharding instead of sharing the state between our 
> controllers (loadbalancers). It also addresses a few deficiencies of the old 
> loadbalancer to remove any kinds of bottlenecks there and make it as fast as 
> possible.
> 
> Commit message for posterity:
> 
> The current ContainerPoolBalancer suffers a couple of problems and 
> bottlenecks:
> 
> 1. Inconsistent state: The data-structures keeping the state for that 
> loadbalancer are not thread-safely handled, meaning there can be queuing to 
> some invokers even though there is free capacity on other invokers.
> 2. Asynchronously shared state: Sharing the state is needed for a 
> high-available deployment of multiple controllers and for horizontal scale in 
> those. Said state-sharing makes point 1 even worse and isn't anywhere fast 
> enough to be able to efficiently schedule quick bursts.
> 3. Bottlenecks: Getting the state from the outside (like for the 
> ActivationThrottle) is a very costly operation (at least in the shared state 
> case) and actually bottlenecks the whole invocation path. Getting the current 
> state of the invokers is a second bottleneck, where one request is made to 
> the corresponding actor for each invocation.
> This new implementation aims to solve the problems mentioned above as follows:
> 
> 1. All state is local: There is no shared state. Resources are managed 
> through horizontal sharding. Horizontal sharding means: The invokers' slots 
> are evenly divided between the loadbalancers in existence. If we deploy 2 
> loadbalancers and each invoker has 16 slots, each of the loadbalancers will 
> have access to 8 slots on each invoker.
> 2. Slots are given away atomically: When scheduling an activation, the slot 
> is immediately assigned to that activation (implemented through Semaphores). 
> That means: Even in concurrent schedules, there will not be an overload on an 
> invoker as long as there is capacity left on that invoker.
> 3. Asynchronous updates of slow data: Slowly changing data, like a change in 
> the invoker's state, is asynchronously handled and updated to a local version 
> of the state. Querying the state is as cheap as it can be.
> 
> A few words on the implementation details:
> 
> We chose to use horizontal sharding (evenly dividing the capacity of each 
> invoker) vs. vertical sharding (evenly dividing the invokers as a whole) for 
> the sake of staging these changes mainly. Once we divide vertically, we'll 
> need a good loadbalancing strategy in front of our controllers themselves, to 
> keep unnecessary cold-starts to a minimum and maximize container reuse. By 
> dividing horizontally, we maintain the same reuse policies as today and can 
> even keep the same loadbalancing strategies intact. Horizontal sharding of 
> course only scales so far (maybe to 4 controllers, assuming 16 slots on each 
> invoker) but it will give us time to figure out good strategies for vertical 
> sharding and learn along the way. For vertical sharding to work, it will also 
> be crucial to have the centralized overflow queue to be able to offload work 
> between shards through workstealing. All in all: Vertical sharding is a much 
> bigger change than horizontal sharding.
> 
> We tried to implement everything in a single actor first, but that seemed to 
> impose a bottleneck again. Note that this is very frequented code, it needs 
> to be very tight. That might not match the actor model too well.
> 
> Subsuming everything: This keeps all proposed changes intact (most notably 
> Tyson's parallel executions and overloading queue).
> 
> A note on the gains made by this: Our non-blocking invoke performance is now 
> quite close to the raw Kafka produce performance that we have in the system 
> (not that it's good in itself, but that's the next theoretical bottleneck). 
> Before the changes, this was roughly bottlenecked to 4k requests per second 
> on a sufficiently powerful machine. Blocking invoke performance was roughly 
> doubled.
> 
> Any comments, thoughts?
> 
> Cheers,
> Markus
>

Re: Introducing sharding as an alternative for state sharing

Reply via email to