mdeuser commented on a change in pull request #3778: Add documentation to the loadbalancer.
URL: https://github.com/apache/incubator-openwhisk/pull/3778#discussion_r197457667
 
 

 ##########
 File path: core/controller/src/main/scala/whisk/core/loadBalancer/ShardingContainerPoolBalancer.scala
 ##########
 @@ -45,10 +45,77 @@ import scala.concurrent.{ExecutionContext, Future, Promise}
 import scala.util.{Failure, Success}
 
 /**
- * A loadbalancer that uses "horizontal" sharding to not collide with fellow loadbalancers.
+ * A loadbalancer that schedules workload based on a hashing algorithm.
+ *
+ * ## Algorithm
+ *
+ * The loadbalancer schedules workload based on a hashing algorithm. First, a hash is calculated for every
+ * namespace + action pair and bounded to the number of available invokers. The resulting index is the so-called
+ * "home invoker": the invoker where the following progression will **always** start. If this invoker is healthy
+ * and has capacity, the request is scheduled to it. If either prerequisite is not met, the index is incremented
+ * by a step size. The available step sizes are the coprime numbers smaller than the number of available invokers
+ * (coprime, to minimize collisions while progressing through the invokers). The step size is picked using the
+ * same hash calculated above, bounded to the number of available step sizes. The home-invoker index is then
+ * incremented by the step size and the checks (healthy + capacity) are performed on the invoker landed on. This
+ * procedure is repeated until all invokers have been checked, at which point the "overload" strategy is
+ * employed: a healthy invoker is chosen at random.
+ *
+ * An example:
+ * - availableInvokers: 10 (all healthy)
+ * - stepSizes: 1, 3, 7 (note how 2 and 5 are not part of this because they are not coprime to 10)
+ * - hash: 13
+ * - homeInvoker: 13 % 10 = 3
+ * - stepSizeIndex: 13 % 3 = 1 => stepSize = 3
+ *
+ * Progression to check the invokers: 3, 6, 9, 2, 5, 8, 1, 4, 7, 0 --> done
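+ *
+ * A rough sketch of this progression (illustrative only; `schedule` and `isUsable` are hypothetical names, not
+ * the actual API):
+ *
+ * {{{
+ * // Walk the invokers starting at the home invoker, advancing by a step size
+ * // coprime to the invoker count, so that every invoker is visited exactly once.
+ * def schedule(hash: Int, numInvokers: Int, stepSizes: Seq[Int], isUsable: Int => Boolean): Option[Int] = {
+ *   val homeInvoker = hash % numInvokers
+ *   val stepSize = stepSizes(hash % stepSizes.size)
+ *   (0 until numInvokers).iterator
+ *     .map(i => (homeInvoker + i * stepSize) % numInvokers)
+ *     .find(isUsable)
+ * }
+ *
+ * // With the values above: schedule(13, 10, Seq(1, 3, 7), isUsable) visits
+ * // invokers 3, 6, 9, 2, 5, 8, 1, 4, 7, 0 until one is healthy and has capacity.
+ * }}}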
+ *
+ * This heuristic is based on the assumption that the chance of getting a warm container is highest on the home
+ * invoker and degrades with every further hop. The hashing ensures that all loadbalancers in a cluster always
+ * pick the same home invoker and follow the same progression for a given action.
+ *
+ * Known caveats:
+ * - This assumption is not always true. For instance, two heavy workloads landing on the same invoker can evict
+ *   each other's containers, resulting in many cold starts, because the invoker keeps evicting one workload's
+ *   containers to make space for the other's. Future work could keep a buffer of the invokers last scheduled to
+ *   for each action and prefer picking the most recent one, then the second most recent one, and so forth.
+ *
+ * ## Capacity checking
+ *
+ * Capacity is determined by what the loadbalancer thinks it has scheduled to each invoker. Upon scheduling, an
+ * entry is made to update the books and a slot in a Semaphore is taken. That slot is only released after the
+ * response from the invoker (active-ack) arrives **or** after the active-ack times out.
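+ *
+ * A minimal sketch of this bookkeeping (hypothetical; `InvokerCapacity` and its method names are illustrative,
+ * not the actual implementation):
+ *
+ * {{{
+ * import java.util.concurrent.Semaphore
+ *
+ * class InvokerCapacity(slots: Int) {
+ *   private val semaphore = new Semaphore(slots)
+ *
+ *   // Called upon scheduling: returns false if the invoker is at capacity.
+ *   def tryAcquireSlot(): Boolean = semaphore.tryAcquire()
+ *
+ *   // Called when the active-ack arrives or its timeout fires.
+ *   def releaseSlot(): Unit = semaphore.release()
+ * }
+ * }}}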
+ *
+ * Known caveats:
+ * - In an overload scenario, activations are queued directly to the invokers, which makes the active-ack timeout
+ *   unpredictable. Timing out active-acks in that case can cause the loadbalancer to prematurely assign new load
+ *   to an overloaded invoker, which can cause uneven queues.
+ * - The same is true if an invoker gets slow for whatever reason (Docker getting slow, machine overloaded): the
+ *   queue on this invoker will slowly rise if it is slow to the point of still sending pings but handling the
+ *   load so slowly that the active-acks time out. The loadbalancer will again think there is capacity when there
+ *   is none. Both caveats could be solved in future work by no queueing to invoker topics on overload, but to
+ *   queue on a
 
 Review comment:
   "... in future work by ~no~ not queuing to invoker..."
