Just bumping this thread. I'm still interested to know if I'm misunderstanding expected behaviour, or if something else is happening. Thanks.
On Mon, 21 Aug 2017 at 10:18 Paul Carey <[email protected]> wrote:
> Hi
>
> In order to simplify handling of data center failover, I wanted to create
> a ZooKeeper ensemble where writes were synchronously replicated to at least
> one node in each DC before returning to the client.
>
> Quoting from the section on hierarchical quorums [1]:
> "we are able to form a quorum once we have a majority of votes from a
> majority of non-zero-weight groups"
> I understand from this that if I have 2 non-zero-weight groups of 3 nodes,
> then a quorum must be formed from 2 groups of at least 2 nodes. That is at
> least 4 nodes, and hence at least one node in each DC must be part of the
> quorum, thus ensuring each write is replicated to at least one node in
> each DC.
>
> I also understand from this line in the Programmer's Guide [2], and
> various other places in the docs, that the transaction log will reflect
> every change applied to the znode tree.
> "The most performance-critical part of ZooKeeper is the transaction log.
> ZooKeeper must sync transactions to media before it returns a response. "
>
> Given the two points above, I would expect to see every zxid in at least
> four transaction logs. But under failure conditions, this is not what I
> see. I used `tc netem` to simulate a network split by progressively:
> - increasing inter-DC latency from an average of 0.7ms (the DCs are 30km
>   apart) to 10ms
> - dropping 50% of packets between DCs
> I see the last zxid before total failure of the ensemble, 0x1b0002dfa5, in
> only 3 of the 6 transaction logs, suggesting to me that the hierarchical
> quorum was not correctly established before the write was accepted.
>
> But maybe I'm misunderstanding:
> - maybe the presence of an entry in the transaction log is not the same
>   as saying that change will be applied to the in-memory state
> - the zxids refer to createSession, perhaps quorum rules are not
>   enforced for such calls
>
> Anyway, I'd be very grateful if someone could help me understand what I'm
> seeing here. Log snippets and config follow below. I'm running ZooKeeper
> 3.4.6 on RHEL 6.8.
>
> Many thanks
>
> Paul
>
> == Transaction Logs ==
>
> Host 3a
> 8/17/17 6:01:44 AM UTC session 0x35dee3497560039 cxid 0x0 zxid 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM UTC session 0x35deec8e21b0000 cxid 0x0 zxid 0x1c00000001 createSession 10000
>
> Host 3b
> 8/17/17 6:00:54 AM GMT session 0x15dee2f698e000d cxid 0x0 zxid 0x1b0002df44 closeSession null
> EOF reached after 10556 txns.
>
> Host 4a
> 8/17/17 6:01:44 AM GMT session 0x35dee3497560039 cxid 0x0 zxid 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid 0x1c00000001 createSession 10000
>
> Host 4b
> 8/17/17 6:01:25 AM GMT session 0x35dee3497560032 cxid 0x0 zxid 0x1b0002df88 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid 0x1c00000001 createSession 10000
> EOF reached after 6619 txns.
>
> Host 7a
> 8/17/17 6:01:44 AM GMT session 0x35dee3497560039 cxid 0x0 zxid 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid 0x1c00000001 createSession 10000
>
> Host 7b
> 8/17/17 6:01:25 AM GMT session 0x35dee3497560032 cxid 0x0 zxid 0x1b0002df88 createSession 10000
> EOF reached after 49071 txns.
>
> == Config ==
>
> server.1=3a:2888:3888
> server.2=3b:2888:3888
> server.3=4a:2888:3888
> server.4=4b:2888:3888
> server.5=7a:2888:3888
> server.6=7b:2888:3888
>
> group.1=1:2:4
> group.2=3:5:6
>
> weight.1=1
> weight.2=1
> weight.3=1
> weight.4=1
> weight.5=1
> weight.6=1
>
> [1] http://zookeeper.apache.org/doc/r3.4.6/zookeeperHierarchicalQuorums.html
> [2] http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html
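
As a rough check of the rule quoted from [1] ("a majority of votes from a majority of non-zero-weight groups"), here is a minimal Python sketch that applies it to the server, group and weight lines in the config above. The function name `is_hierarchical_quorum` and the dictionaries are hypothetical, written only for illustration; this is not ZooKeeper's own implementation and may differ from it in detail. Going by the config, group 1 is hosts 3a, 3b, 4b and group 2 is hosts 4a, 7a, 7b, and the three logs containing 0x1b0002dfa5 belong to servers 1, 3 and 5.

```python
# Illustrative sketch of the quoted quorum rule; names are hypothetical,
# and this is not ZooKeeper's actual quorum verifier.

# server id -> host, taken from the server.N lines in the config
SERVERS = {1: "3a", 2: "3b", 3: "4a", 4: "4b", 5: "7a", 6: "7b"}
# group id -> member server ids, taken from the group.N lines
GROUPS = {1: {1, 2, 4}, 2: {3, 5, 6}}
# every server has weight 1 in the config
WEIGHTS = {sid: 1 for sid in SERVERS}


def is_hierarchical_quorum(acked, groups=GROUPS, weights=WEIGHTS):
    """True if `acked` holds a strict majority of the weight in a strict
    majority of the groups whose total weight is non-zero."""
    nonzero_groups = 0
    groups_won = 0
    for members in groups.values():
        total = sum(weights[s] for s in members)
        if total == 0:
            continue  # zero-weight groups do not count towards the quorum
        nonzero_groups += 1
        got = sum(weights[s] for s in members if s in acked)
        if 2 * got > total:
            groups_won += 1
    return 2 * groups_won > nonzero_groups


# Servers whose logs contain zxid 0x1b0002dfa5: 3a, 4a, 7a
print(is_hierarchical_quorum({1, 3, 5}))     # False
# The same set plus 3b
print(is_hierarchical_quorum({1, 2, 3, 5}))  # True
```

Under this reading, the set {3a, 4a, 7a} holds a majority in group 2 only (4a and 7a), with a single vote from group 1 (3a), so by itself it would not satisfy the quoted rule; adding 3b would.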
