On Fri, Oct 14, 2016 at 2:16 AM, Jay Juma <jayk.j...@gmail.com> wrote:
> Based on my understanding, a global cluster can be set up to span
> multiple datacenters, and the data placement policy will place the data
> across those datacenters.
> My question is: is it possible to create a log stream on a global cluster
> that writes only to bookies within the same datacenter?
In theory, yes. However currently the data placement policy is configured
We can consider pushing down the data placement policy as part of the log
segment metadata, so that bookie auto-recovery will be aware of what data
placement policy will be used for re-replicating/recovering a bookkeeper
> I have two use cases. The first is database replication. For example,
> suppose there are 2 datacenters, A and B, and a global DL cluster is set
> up over both. The database cluster in A writes updates into one or more
> global log streams, and the database in B tail-reads those streams and
> applies the changes. I think this is a very typical use case of
> DistributedLog, right?
> The second use case is replication within a single datacenter only. It
> doesn't need to replicate to the other datacenter. We want to share the
> DL cluster between these two use cases. Is there a way to achieve that,
> for example by tuning the data placement policy for individual streams?
Typically we don't mix globally replicated logs with locally replicated
logs. But it seems that making the data placement policy configurable per
stream is a good option for your use case.
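To make the per-stream idea concrete, here is a rough sketch in Python. It
is not DistributedLog or BookKeeper code: the names Bookie, pick_ensemble,
and the "local"/"global" policy labels are all hypothetical, and the real
placement logic would live in the BookKeeper client. It only illustrates
how a policy chosen per stream could restrict or spread bookie selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookie:
    # Hypothetical model of a storage node; not a BookKeeper class.
    name: str
    datacenter: str


def pick_ensemble(bookies, ensemble_size, policy, local_dc=None):
    """Pick bookies for one log segment under a per-stream policy.

    policy == "local":  use only bookies located in local_dc.
    policy == "global": take bookies round-robin across datacenters so
                        the ensemble spans as many of them as possible.
    """
    if policy == "local":
        candidates = [b for b in bookies if b.datacenter == local_dc]
        if len(candidates) < ensemble_size:
            raise ValueError("not enough bookies in datacenter %r" % local_dc)
        return candidates[:ensemble_size]

    if policy == "global":
        if len(bookies) < ensemble_size:
            raise ValueError("not enough bookies in the cluster")
        # Group bookies by datacenter, preserving input order.
        by_dc = {}
        for b in bookies:
            by_dc.setdefault(b.datacenter, []).append(b)
        queues = [list(group) for group in by_dc.values()]
        chosen = []
        # Take one bookie from each datacenter in turn until full.
        while len(chosen) < ensemble_size:
            for q in queues:
                if q and len(chosen) < ensemble_size:
                    chosen.append(q.pop(0))
        return chosen

    raise ValueError("unknown policy %r" % policy)


# Hypothetical per-stream configuration: each stream names its own policy.
STREAM_POLICIES = {
    "db-replication": ("global", None),
    "local-changelog": ("local", "A"),
}
```

With three bookies in each of A and B, the "local-changelog" stream would
only ever get bookies from A, while "db-replication" would get an ensemble
that covers both datacenters. Auto-recovery would need to read the same
per-stream setting to re-replicate correctly.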
Do you mind creating a JIRA for us to track this use case?
> Let me know if you need more information. Appreciate your help.