Re: Introducing sharding as an alternative for state sharing

Stephen Fink Thu, 01 Feb 2018 07:17:31 -0800

Perfect.  Thanks.

SJF


> On Feb 1, 2018, at 10:12 AM, Markus Thoemmes <[email protected]> 
> wrote:
> 
> Hi Steve,
> 
> fair point and that's exactly what we're trying to exploit here: While the 
> loadbalancer only sees 8 slots for example on that invoker, the invoker's 
> logic remains unchanged. It always sees the full-picture of its state and 
> doesn't care where the load is coming from. So OpenWhisk would (given the 
> load is slow enough) only provision one container for this action. Both 
> loadbalancers would operate using the same hashes and and stepSizes and thus 
> choose the very same invoker (given no other load in the system).
> 
> Cheers.                        
> 
> -----Stephen Fink <[email protected]> wrote: -----
> To: [email protected]
> From: Stephen Fink <[email protected]>
> Date: 02/01/2018 03:57PM
> Subject: Re: Introducing sharding as an alternative for state sharing
> 
> Hi Markus,
> 
> Suppose I have a client which fires one action repeatedly in a loop.   So the 
> server handles one invocation of this action at a time.
> 
> Under the current system,  OW will provision one container for this action, 
> and it will get reused for every invocation.
> 
> Under horizontal sharding with factor N,  it seems OW will provision N 
> containers — which is sub-optimal for several reasons (cold starts, memory 
> footprint, consumption of stem cells).
> 
> Do you agree?
> 
> If so — I wonder if there’s a hybrid strategy — suppose the load balancer 
> operates via horizontal sharding, but the invoke-side logic remains 
> unchanged.   Then an invoker can continue to use all slots available to 
> maximize container reuse, while the load balancer avoids shared state.
> 
> Make sense?
> 
> SJF
> 
> 
>> On Feb 1, 2018, at 9:25 AM, Markus Thoemmes <[email protected]> 
>> wrote:
>> 
>> Hi folks,
>> 
>> we (Christian Bickel and I) just opened a pull-request for comments on 
>> introducing a notion of sharding instead of sharing the state between our 
>> controllers (loadbalancers). It also addresses a few deficiencies of the old 
>> loadbalancer to remove any kinds of bottlenecks there and make it as fast as 
>> possible.
>> 
>> Commit message for posterity:
>> 
>> The current ContainerPoolBalancer suffers a couple of problems and 
>> bottlenecks:
>> 
>> 1. Inconsistent state: The data-structures keeping the state for that 
>> loadbalancer are not thread-safely handled, meaning there can be queuing to 
>> some invokers even though there is free capacity on other invokers.
>> 2. Asynchronously shared state: Sharing the state is needed for a 
>> high-available deployment of multiple controllers and for horizontal scale 
>> in those. Said state-sharing makes point 1 even worse and isn't anywhere 
>> fast enough to be able to efficiently schedule quick bursts.
>> 3. Bottlenecks: Getting the state from the outside (like for the 
>> ActivationThrottle) is a very costly operation (at least in the shared state 
>> case) and actually bottlenecks the whole invocation path. Getting the 
>> current state of the invokers is a second bottleneck, where one request is 
>> made to the corresponding actor for each invocation.
>> This new implementation aims to solve the problems mentioned above as 
>> follows:
>> 
>> 1. All state is local: There is no shared state. Resources are managed 
>> through horizontal sharding. Horizontal sharding means: The invokers' slots 
>> are evenly divided between the loadbalancers in existence. If we deploy 2 
>> loadbalancers and each invoker has 16 slots, each of the loadbalancers will 
>> have access to 8 slots on each invoker.
>> 2. Slots are given away atomically: When scheduling an activation, the slot 
>> is immediately assigned to that activation (implemented through Semaphores). 
>> That means: Even in concurrent schedules, there will not be an overload on 
>> an invoker as long as there is capacity left on that invoker.
>> 3. Asynchronous updates of slow data: Slowly changing data, like a change in 
>> the invoker's state, is asynchronously handled and updated to a local 
>> version of the state. Querying the state is as cheap as it can be.
>> 
>> A few words on the implementation details:
>> 
>> We chose to use horizontal sharding (evenly dividing the capacity of each 
>> invoker) vs. vertical sharding (evenly dividing the invokers as a whole) for 
>> the sake of staging these changes mainly. Once we divide vertically, we'll 
>> need a good loadbalancing strategy in front of our controllers themselves, 
>> to keep unnecessary cold-starts to a minimum and maximize container reuse. 
>> By dividing horizontally, we maintain the same reuse policies as today and 
>> can even keep the same loadbalancing strategies intact. Horizontal sharding 
>> of course only scales so far (maybe to 4 controllers, assuming 16 slots on 
>> each invoker) but it will give us time to figure out good strategies for 
>> vertical sharding and learn along the way. For vertical sharding to work, it 
>> will also be crucial to have the centralized overflow queue to be able to 
>> offload work between shards through workstealing. All in all: Vertical 
>> sharding is a much bigger change than horizontal sharding.
>> 
>> We tried to implement everything in a single actor first, but that seemed to 
>> impose a bottleneck again. Note that this is very frequented code, it needs 
>> to be very tight. That might not match the actor model too well.
>> 
>> Subsuming everything: This keeps all proposed changes intact (most notably 
>> Tyson's parallel executions and overloading queue).
>> 
>> A note on the gains made by this: Our non-blocking invoke performance is now 
>> quite close to the raw Kafka produce performance that we have in the system 
>> (not that it's good in itself, but that's the next theoretical bottleneck). 
>> Before the changes, this was roughly bottlenecked to 4k requests per second 
>> on a sufficiently powerful machine. Blocking invoke performance was roughly 
>> doubled.
>> 
>> Any comments, thoughts?
>> 
>> Cheers,
>> Markus
>> 
> 
>

Re: Introducing sharding as an alternative for state sharing

Reply via email to