Hi Dave - 
I think there should be a distinction between persistent stores and transient 
stores. 
This is slightly confusing because I think all of these technologies provide 
some notion of both, to varying extents.

I would propose some guidelines around usage, like:
- use zookeeper for remote durable persistent metadata (will survive restart of 
invoker cluster AND the store); not used anywhere afaik
- use redis for remote non-durable transient metadata (will survive restart of 
invoker cluster, but NOT the store); used by dynamic invoker id and apigateway
- use akka distributed data for local (but replicated) transient metadata (will 
survive restart IFF at least one node is available at all times); used by 
controller state tracking
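
To make the guidelines above a bit more concrete, here is a rough sketch in 
Python. It is illustrative only: `Store`, `choose_store`, and the two boolean 
flags are names made up for this example, not OpenWhisk code. It just encodes 
the three bullets as a decision on what the metadata has to survive.

```python
from enum import Enum


class Store(Enum):
    ZOOKEEPER = "zookeeper"                # remote, durable
    REDIS = "redis"                        # remote, non-durable
    AKKA_DISTRIBUTED_DATA = "akka-ddata"   # local, replicated


def choose_store(survives_cluster_restart: bool, survives_store_restart: bool) -> Store:
    """Hypothetical helper: pick a store from what the metadata must survive."""
    if survives_store_restart:
        # Durable: must outlive both the invoker cluster and the store itself.
        return Store.ZOOKEEPER
    if survives_cluster_restart:
        # Transient but remote: outlives the invoker cluster, not the store.
        return Store.REDIS
    # Transient and local: survives only while at least one node stays up.
    return Store.AKKA_DISTRIBUTED_DATA
```

Under this reading, the dynamic invoker id mapping would be 
`choose_store(True, False)`, and controller state tracking would be 
`choose_store(False, False)`.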

With that said, I’m not sure there is a way to reduce these (unless kafka goes 
away), but there should definitely be clear reasoning for why one is used over 
another. 

I’m not sure of the impact of clustering the invokers (other than the same 
seed-node-related issues), but if there is none, it might be worth considering 
(instead of redis), unless there is a special requirement to restart all 
invokers at once without losing state, or similar. I would favor solving that 
restart problem (for akka clustering) over requiring the extra storage system 
(redis).

Regarding the PR, I’m ok with it as long as it’s not required to use redis; I 
think it currently allows the existing behavior of assigned invoker id, 
skipping the redis usage. If/when invoker clustering is enabled, it may be good 
to revisit this.
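
For what it is worth, the behavior I have in mind looks roughly like the 
sketch below. This is not the code in the PR: `resolve_invoker_id` is a 
hypothetical name, a plain dict stands in for redis, and handing out the next 
unused integer is only an assumed assignment policy. The point is that an 
assigned id bypasses the store entirely.

```python
from typing import Dict, Optional


def resolve_invoker_id(name: str,
                       assigned_id: Optional[int],
                       id_store: Optional[Dict[str, int]]) -> int:
    """Hypothetical sketch: prefer an assigned id, else use the shared mapping."""
    if assigned_id is not None:
        # Existing behavior: the id is assigned up front, so no store is needed.
        return assigned_id
    if id_store is None:
        raise ValueError("dynamic id assignment needs a store for the name-to-id mapping")
    if name not in id_store:
        # Assumed policy: give a new invoker name the next unused integer.
        id_store[name] = len(id_store)
    return id_store[name]
```

With an assigned id the store argument can be `None`; only the dynamic path 
touches the mapping.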

Tyson


> On Oct 4, 2017, at 2:17 PM, David P Grove <[email protected]> wrote:
> 
> 
> This was a topic of discussion on last week's technical interchange call
> that we decided should be moved to the dev list for further discussion.
> 
> OpenWhisk uses multiple persistent data stores as part of its control
> plane. In addition to the main database for activations/actions/etc, we
> also rely on:
>       (1) zookeeper (required for kafka)
>       (2) Akka cluster with distributed data (used to replicate load
> balancer state if controller clustering is enabled).
>       (3) Redis (being used by apigateway;  In pending PR#2689, if dynamic
> invokerId assignment is enabled then Redis is used to store the mapping
> from invoker names to invoker ids).
> 
> I think the main unresolved question was if it was possible to
> control/reduce the number of persistent services being used in the
> OpenWhisk implementation.  Each one adds operational complexity.
> 
> A secondary question was whether there were objections to proceeding with
> the merge of PR#2689 as-is while we ponder the eventual overall persistence
> architecture for the control plane.
> 
> --dave
