On Tue, Aug 14, 2012 at 4:59 AM, Flavio Junqueira <[email protected]> wrote:
> On Aug 3, 2012, at 10:01 PM, Aniruddha Laud wrote:
>
> > Hi Flavio,
> >
> > Please find my responses inline.
> >
> > On Fri, Aug 3, 2012 at 3:28 AM, Flavio Junqueira <[email protected]> wrote:
> >
> >> Hi Aniruddha, We had an offline discussion about your question and we
> >> have a few clarification questions back:
> >>
> >> On Aug 1, 2012, at 3:48 AM, Aniruddha Laud wrote:
> >>
> >>> Currently, all cross-region subscriptions have to succeed for a
> >>> subscription to succeed locally. I understand that we might lose
> >>> messages published in another region if this doesn't succeed the very
> >>> first time,
> >>
> >> Very first time means receiving at least confirmation from at least one
> >> region? How do we decide that it is ok to acknowledge locally that a
> >> subscription went through?
> >>
> > Say we get a subscription in region ABC from subscriber id SUB-1 for
> > topic FOO-1, as of now, the hub responsible for FOO-1 will try to
> > remotely subscribe to all regions for that topic and the local
> > subscription will succeed only after all remote subscriptions succeed.
> > Say, the regions are ABC, PQR and XYZ, region XYZ will have a
> > subscription for topic FOO-1 from subscriber id __ABC. Same with region
> > PQR. If any of the remote subscriptions fail, this local subscription
> > should fail because we cannot guarantee that topics published in all
> > remote regions will be delivered.
>
> Sounds right. I think you're essentially saying that it is sufficient to
> have one live subscription between any pair of regions.

Yes. That's correct.

> > Let's assume that all remote subscriptions succeeded. Now, region ABC
> > goes down temporarily because of a network partition or because the hubs
> > are being restarted. If at this point, we get a subscription from
> > subscriber SUB-2 for topic FOO-1 to the same hub, we need not subscribe
> > to the remote regions as a subscription already exists and is being
> > delivered to the HubClient on this hub. Moreover, any subsequent
> > subscription to that topic need not wait for acknowledgements from
> > remote regions because as far as regions PQR and XYZ go, they have a
> > subscription from subscriber id __ABC already and all messages published
> > in those regions are guaranteed to be delivered to this subscription.
>
> I can't remember now, do we keep information about consumed messages
> locally to a subscriber even for messages published remotely? If so, then
> your observation seems right to me.

I believe we do. Will confirm.

> >>> but subsequent subscribe requests should not be affected as the other
> >>> regions are already aware of the remote subscription (that succeeded
> >>> the first time).
> >>
> >> Subscribe requests here means requests from clients local of the same
> >> region, correct?
> >>
> > Yes.
> >
> >>> A possible fix would be to store information in the local
> >>> zookeeper about existing remote subscriptions.
> >>
> >> Store what information exactly? Does this information prevent
> >> concurrently published messages from being lost? If we don't wait for
> >> all regions to acknowledge, then we might end up violating the contract
> >> that says that once a client successfully subscribes to a topic, then
> >> delivery from that point on is guaranteed.
> >>
> > For every topic node in ZK, we have a child node named remote which
> > stores the remote regions for which subscriptions have succeeded. We can
> > make use of the onLastLocalUnsubscribe function in RegionManager to
> > clear the state and the onFirstLocalSubscribe to check-and-set it. We
> > need to store this in ZK because this information has to be made
> > available to all Hubs even as topics move around locally. I don't think
> > it would violate any existing contracts. You are still guaranteed
> > delivery of messages in all regions.
>
> Got it, makes sense. I believe false suspicions of hub crashes could cause
> concurrent accesses to the same znode. If so, then we need to check that
> znode version matches the one read to prevent concurrent accesses from
> causing inconsistencies. Is this right?

Yes, that sounds right. Thanks :)

Here's the ticket tracking this
https://issues.apache.org/jira/browse/BOOKKEEPER-362

> -Flavio
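
To make the check-and-set idea from the thread concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not the actual RegionManager code: FakeStore is a hypothetical in-memory stand-in for the local ZooKeeper, the path layout "<topic>/remote" and the function names are made up to mirror onFirstLocalSubscribe and onLastLocalUnsubscribe, and subscribe_remote is a caller-supplied callback assumed to be safe to repeat. With a real ZooKeeper client, the conditional write would presumably map onto a setData call that passes the version returned by the earlier read.

```python
from dataclasses import dataclass


class BadVersionError(Exception):
    """Raised when a conditional write is attempted with a stale version."""


@dataclass
class VersionedNode:
    """In-memory stand-in for a znode: recorded regions plus a version."""
    regions: frozenset = frozenset()
    version: int = 0


class FakeStore:
    """Hypothetical stand-in for the local ZooKeeper (not the real client)."""

    def __init__(self):
        self._nodes = {}

    def read(self, path):
        node = self._nodes.get(path, VersionedNode())
        return node.regions, node.version

    def write(self, path, regions, expected_version):
        # Conditional update: only succeeds if nobody wrote since our read.
        current = self._nodes.get(path, VersionedNode())
        if current.version != expected_version:
            raise BadVersionError(path)
        self._nodes[path] = VersionedNode(frozenset(regions),
                                          current.version + 1)


def on_first_local_subscribe(store, topic, remote_regions, subscribe_remote):
    """Subscribe only to regions not yet recorded, then record them.

    Any exception from subscribe_remote propagates, so the local
    subscription fails if a needed remote subscription fails.
    """
    path = f"{topic}/remote"
    recorded, version = store.read(path)
    missing = set(remote_regions) - recorded
    for region in sorted(missing):
        subscribe_remote(region, topic)
    if missing:
        # Raises BadVersionError if another hub updated the node meanwhile.
        store.write(path, recorded | missing, version)
    return missing


def on_last_local_unsubscribe(store, topic):
    """Clear the recorded regions (remote unsubscription itself is omitted)."""
    path = f"{topic}/remote"
    _, version = store.read(path)
    store.write(path, frozenset(), version)


if __name__ == "__main__":
    store = FakeStore()
    log = []
    record = lambda region, topic: log.append((region, topic))
    # First local subscriber (e.g. SUB-1): contacts PQR and XYZ.
    on_first_local_subscribe(store, "FOO-1", ["PQR", "XYZ"], record)
    # Later subscriber (e.g. SUB-2): nothing missing, no remote calls.
    on_first_local_subscribe(store, "FOO-1", ["PQR", "XYZ"], record)
    print(log)
```

A hub that loses the race gets BadVersionError and would have to re-read the node and retry, which is the inconsistency guard Flavio asks about.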
