Re: A global cluster for both global streams and local streams

2016-10-19 Thread Jay Juma
Sijie,

Thank you so much for answering my question. I've created a JIRA for the
discussion here: https://issues.apache.org/jira/browse/DL-58

When do you think this task will be prioritized?

- Jay

On Tue, Oct 18, 2016 at 2:00 PM, Sijie Guo wrote:

> On Fri, Oct 14, 2016 at 2:16 AM, Jay Juma  wrote:
>
> > Based on my understanding, a global cluster can be set up to span
> > multiple datacenters, and the data placement policy will place the data
> > across multiple datacenters.
> >
> > My question is: is it possible to create a log stream on a global cluster
> > that writes only to bookies within the same datacenter?
> >
>
> In theory, yes. However, the data placement policy is currently configured
> per cluster.
>
> We can consider pushing the data placement policy down into the log
> segment metadata, so that bookie auto-recovery knows which data placement
> policy to use when re-replicating or recovering a BookKeeper ledger.
>
>
> >
> > I have two use cases. One is database replication. For example, suppose
> > there are 2 datacenters, A and B, and a global DL cluster is set up over
> > A and B. The database cluster in A will write updates into one or more
> > global log streams, and the database in B will tail-read those streams
> > and apply the changes. I think this is a very typical use case of
> > DistributedLog, right?
> >
> > The other use case is replication within one datacenter only. It doesn't
> > need to replicate to the other datacenter. We want to share the DL
> > cluster between these two use cases. Is there a way to achieve that,
> > for example by tuning the data placement policy for individual streams?
> >
>
> Typically we don't mix globally replicated logs with locally replicated
> logs. But making the data placement policy configurable per stream seems
> like a good option for your use case.
>
>
> Do you mind creating a JIRA for us to track this use case?
>
>
> >
> > Let me know if you need more information. Appreciate your help.
> >
> > Thanks,
> > Jay
> >
>


Re: A global cluster for both global streams and local streams

2016-10-18 Thread Sijie Guo
On Fri, Oct 14, 2016 at 2:16 AM, Jay Juma  wrote:

> Based on my understanding, a global cluster can be set up to span
> multiple datacenters, and the data placement policy will place the data
> across multiple datacenters.
>
> My question is: is it possible to create a log stream on a global cluster
> that writes only to bookies within the same datacenter?
>

In theory, yes. However, the data placement policy is currently configured
per cluster.

We can consider pushing the data placement policy down into the log
segment metadata, so that bookie auto-recovery knows which data placement
policy to use when re-replicating or recovering a BookKeeper ledger.
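
A rough sketch of that idea, as a toy Python model rather than DistributedLog
or BookKeeper code. Every name here (Placement, LogSegmentMetadata,
replacement_candidates, the bookie ids) is hypothetical. It only illustrates
that if each log segment records its own placement policy, recovery can pick
replacement bookies without consulting a cluster-wide setting.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    """Hypothetical policy names, not DistributedLog or BookKeeper identifiers."""
    GLOBAL = "global"   # replicas may live in any datacenter
    LOCAL = "local"     # replicas stay in the segment's home datacenter


@dataclass(frozen=True)
class LogSegmentMetadata:
    """Hypothetical log segment record that carries its own placement policy."""
    ledger_id: int
    placement: Placement
    home_dc: str


def replacement_candidates(segment, bookies_by_dc, failed_bookie):
    """Return the bookies that recovery may re-replicate this segment to."""
    if segment.placement is Placement.LOCAL:
        # Local segments must be recovered inside their home datacenter.
        pool = bookies_by_dc.get(segment.home_dc, [])
    else:
        # Global segments may use bookies from every datacenter.
        pool = [b for dc in sorted(bookies_by_dc) for b in bookies_by_dc[dc]]
    return [b for b in pool if b != failed_bookie]


bookies = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2"]}
local_seg = LogSegmentMetadata(ledger_id=1, placement=Placement.LOCAL, home_dc="A")
global_seg = LogSegmentMetadata(ledger_id=2, placement=Placement.GLOBAL, home_dc="A")

print(replacement_candidates(local_seg, bookies, "a2"))   # ['a1', 'a3']
print(replacement_candidates(global_seg, bookies, "a2"))  # ['a1', 'a3', 'b1', 'b2']
```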


>
> I have two use cases. One is database replication. For example, suppose
> there are 2 datacenters, A and B, and a global DL cluster is set up over
> A and B. The database cluster in A will write updates into one or more
> global log streams, and the database in B will tail-read those streams
> and apply the changes. I think this is a very typical use case of
> DistributedLog, right?
>
> The other use case is replication within one datacenter only. It doesn't
> need to replicate to the other datacenter. We want to share the DL
> cluster between these two use cases. Is there a way to achieve that,
> for example by tuning the data placement policy for individual streams?
>

Typically we don't mix globally replicated logs with locally replicated
logs. But making the data placement policy configurable per stream seems
like a good option for your use case.
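
To make the per-stream option concrete, here is a minimal sketch, again a toy
model and not an existing DistributedLog feature. The function
select_ensemble, the overrides mapping, and the stream names are all
assumptions made for illustration. A stream without an override falls back to
the cluster-wide default, which matches how the policy is configured today.

```python
from itertools import zip_longest


def select_ensemble(stream, writer_dc, bookies_by_dc, size, overrides,
                    default="global"):
    """Pick `size` bookies for a stream, honoring a per-stream policy override."""
    policy = overrides.get(stream, default)
    if policy == "local":
        # Only bookies in the writer's datacenter are eligible.
        pool = list(bookies_by_dc[writer_dc])
    else:
        # Interleave datacenters so the replicas spread across them.
        columns = zip_longest(*(bookies_by_dc[dc] for dc in sorted(bookies_by_dc)))
        pool = [b for column in columns for b in column if b is not None]
    if len(pool) < size:
        raise ValueError("not enough bookies for the requested ensemble size")
    return pool[:size]


bookies = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"]}
overrides = {"local-replication": "local"}  # hypothetical stream name

print(select_ensemble("db-updates", "A", bookies, 3, overrides))
print(select_ensemble("local-replication", "A", bookies, 3, overrides))
```

With these inputs, the first stream uses the global default and gets bookies
from both A and B, while the overridden stream stays entirely inside A.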


Do you mind creating a JIRA for us to track this use case?


>
> Let me know if you need more information. Appreciate your help.
>
> Thanks,
> Jay
>