GitHub user lhotari added a comment to the discussion: Thousands of cluster 
geo-replication for fan-in aggregation ?

> These c1..cN are pulsar-standalone instances running on a smaller footprint 
> devices (1core-1GBmem) with very few topics & very small byte rate (kb/sec) 
> to the cAgg cluster.

Ok, now I see the use case. I think geo-replication makes sense for this type 
of use case, although I don't have first-hand experience with it.

> When we create a tenant/namespace/topic with replication-clusters, it'd 
> internally use geo-config-store, right ?

I guess it depends on what you mean by "geo-config-store". Pulsar has the 
concepts of a "local configuration store" and a "global configuration store", 
and I guess the "global configuration store" is what many would call the 
"geo configuration store". A shared "global configuration store" is not 
mandatory. Without one, each cluster is managed completely independently, and 
each cluster's local configuration store also serves as its global 
configuration store.
In this case, I also wouldn't use a shared global configuration store, for 
security reasons.

> If without geo-config store solution is possible, does it look something like 
> this. ?

Yes, those steps look about right.

> Any sweet spot for number of c1..cN grouping ? Online presentations talk 
> about upto 100 clusters geo-replication

You'd have to test the limits yourself to gain confidence. Scalable designs 
usually rely on sharding. Since each remote cluster is configured 
individually, which aggregation cluster a remote cluster replicates to is 
purely a matter of configuration. One possible approach is to start with a few 
aggregation clusters (so that the scalable design is present from the 
beginning) and "assign" the remote clusters in a round-robin fashion across 
the aggregation clusters.
In this case, it's about having [a cell-based architecture](https://docs.aws.amazon.com/wellarchitected/latest/reducing-scope-of-impact-with-cell-based-architecture/what-is-a-cell-based-architecture.html) 
from the beginning.
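
To make the round-robin idea concrete, here is a minimal sketch in Python. It 
only computes the assignment and does not call any Pulsar API. The function 
names and the cluster names (`c1`..`c7`, `cAgg1`..`cAgg3`) are hypothetical 
and chosen purely for illustration.

```python
from collections import defaultdict
from itertools import cycle


def assign_round_robin(remote_clusters, aggregation_clusters):
    """Map each remote cluster name to an aggregation cluster, round robin.

    Hypothetical helper for illustration; not part of Pulsar.
    """
    if not aggregation_clusters:
        raise ValueError("at least one aggregation cluster is required")
    targets = cycle(aggregation_clusters)
    # Dicts preserve insertion order, so remotes are assigned in list order.
    return {remote: next(targets) for remote in remote_clusters}


def group_by_aggregation_cluster(assignment):
    """Invert the mapping: aggregation cluster -> list of remote clusters."""
    cells = defaultdict(list)
    for remote, aggregator in assignment.items():
        cells[aggregator].append(remote)
    return dict(cells)


# Assumed example names: seven edge clusters, three aggregation clusters.
remotes = [f"c{i}" for i in range(1, 8)]
aggregators = ["cAgg1", "cAgg2", "cAgg3"]

assignment = assign_round_robin(remotes, aggregators)
cells = group_by_aggregation_cluster(assignment)

for aggregator, members in cells.items():
    print(aggregator, members)
```

Each aggregation cluster together with its assigned remote clusters forms one 
"cell". In practice you would likely persist the assignment rather than 
recompute it, since recomputing after adding an aggregation cluster would 
move existing remote clusters to different cells.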




GitHub link: 
https://github.com/apache/pulsar/discussions/22438#discussioncomment-9018791
