Revisiting this, from the starting point of:
- ignoring controller
- ignoring partitioning case

Can we provide a checkout system for ids? e.g.:
- on start, an invoker acquires its id
- on stop, it releases its id

In a simple impl, we can leave the id as an int. That is a bit arbitrary from the invoker's point of view, but conveniently works with the existing load balancer.

For a more detailed example, use a counter in zk:
- invokera start, counter is 0
- invokera increments the counter (0 -> 1)
- invokera id is 0
- invokerb start, counter is 1
- invokerb increments the counter (1 -> 2)
- invokerb id is 1
...

Later, we want to restart invokerb:
- invokerb stop
- invokerb decrements the counter (2 -> 1)
- invokerb start, counter is 1
- invokerb increments the counter (1 -> 2)
- invokerb id is 1

Now b has restarted and is again id=1, but no name is associated with this id.

The ramifications are:
* during restart, a stop must be processed before restarting
* during restart, if multiple instances are restarting, the id previously used may end up on a different host after restart is complete
* during a hard crash, the counter must be manually adjusted

The last point is problematic, but may be solvable with a more complicated approach that checks cluster status (for kube/mesos) to detect cases where the counter grows beyond the number of instances, etc.

Alternatively, if we can avoid tracking invokers by int, we can more easily use some other id, so this may be a temporary solution.

[ Full content available at: https://github.com/apache/incubator-openwhisk/issues/3858 ]
This message was relayed via gitbox.apache.org for [email protected]
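
To make the counter walkthrough above concrete, here is a rough sketch of the same sequence. It is only an illustration: `IdCounter` is a hypothetical in-memory stand-in for the zk counter (no real ZooKeeper client is used), and the `acquire`/`release` names are assumptions, not existing OpenWhisk code. It replays the start and clean-restart steps, then shows the hard-crash case where the counter drifts because no decrement ever happens.

```python
import threading


class IdCounter:
    """Hypothetical in-memory stand-in for the shared counter kept in zk."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def acquire(self):
        """On start: take the current counter value as the id, then increment."""
        with self._lock:
            assigned = self._value
            self._value += 1
            return assigned

    def release(self):
        """On stop: decrement the counter. No specific id or name is recorded."""
        with self._lock:
            if self._value == 0:
                raise RuntimeError("release without a matching acquire")
            self._value -= 1

    @property
    def value(self):
        with self._lock:
            return self._value


counter = IdCounter()
ids = {}

# Initial start of both invokers.
ids["invokera"] = counter.acquire()  # counter 0 -> 1, id 0
ids["invokerb"] = counter.acquire()  # counter 1 -> 2, id 1

# Clean restart of invokerb: the stop is processed before the start.
counter.release()                    # counter 2 -> 1
ids["invokerb"] = counter.acquire()  # counter 1 -> 2, id 1 again

# Hard crash of invokerb: it never gets to release, so the counter stays at 2
# and the restarted instance is handed a new id instead of its old one.
crashed_restart_id = counter.acquire()  # counter 2 -> 3, id 2
```

After the crash the counter is larger than the number of live instances, which is the situation that would need either a manual adjustment or a check against cluster status.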
