[
https://issues.apache.org/jira/browse/ZOOKEEPER-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893710#comment-16893710
]
Jan-Philip Gehrcke commented on ZOOKEEPER-3466:
-----------------------------------------------
Thank you [~hanm] for another round of feedback. Much appreciated!
{quote}From the server log, it looks like DEBUG is enabled.
{quote}
Yes. Well, it's the TRACE log level.
{quote}Did you see any log statements related to the establishment of a session?
{quote}
In the bad state, *no*.
Just to confirm: in the good state ("bad state" --> restart leader --> "good
state") we see
{code:java}
DEBUG [CommitProcWorkThread-1:ZooKeeperServer@754] - Established session
0x2000ad110750000 with negotiated timeout 10000 for client /127.0.0.1:42470
{code}
In the bad state these log lines are *missing*. [~hanm] it looks like your
hunch is correct:
{quote}If there is no such statement, then we have a problem on server side
where we can't finish initialize a session.
{quote}
I want to add, for clarity, that there are not log statements like "Invalid
session 0x..." in the bad state.
{quote}On client side, looks like you are using Kazoo. What's the version of
the Kazoo client?
{quote}
The problem was observed with both, Kazoo 2.4 and 2.6. In the description of
the issue above I have pointed to an exact revision that I tried, and the
corresponding error condition:
[https://github.com/python-zk/kazoo/blob/88b657a0977161f3815657878ba48f82a97a3846/kazoo/protocol/connection.py#L249]
(this was the piece of code that told me that the client is actually waiting
for bytes to come in over the TCP connection, but they're seemingly not coming).
{quote}Did you try other client (Curator, ZooKeeper Java client, etc)?
{quote}
Yes (also mentioned in the initial description but no worries :)). I tried
zkCli from the 3.5.5 release. The error message it shows fits the picture:
{code:java}
Client session timed out, have not heard from server in 15003ms for sessionid
0x0, closing socket connection and attempting reconnect
(org.apache.zookeeper.ClientCnxn:main-SendThread(localhost:2181))
{code}
(it then retries, but that does not help).
> ZK cluster converges, but does not properly handle client connections (new in
> 3.5.5)
> ------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3466
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3466
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.5
> Environment: Linux
> Reporter: Jan-Philip Gehrcke
> Priority: Major
>
> Hey, we explore switching from ZooKeeper 3.4.14 to ZooKeeper 3.5.5 in
> [https://github.com/dcos/dcos].
> DC/OS coordinates ZooKeeper via Exhibitor. We are not changing anything
> w.r.t. Exhibitor for now, and are hoping that we can use ZooKeeper 3.5.5 as a
> drop-in replacement for 3.4.14. This seems to work fine when Exhibitor uses a
> so-called static ensemble where the individual ZooKeeper instances are known
> a priori.
> When Exhibitor however discovers individual ZooKeeper instances ("dynamic"
> back-end) then I think we observe a regression where ZooKeeper 3.5.5 can get
> into the following bad state (often, but not always):
> # three ZooKeeper instances find each other, leader election takes place
> (*expected*)
> # leader election succeeds: two followers, one leader (*expected*)
> # all three ZK instances respond IAMOK to RUOK (*expected*)
> # all three ZK instances respond to SRVR (one says "Mode: leader", the other
> two say "Mode: follower") (*expected*)
> # all three ZK instances respond to MNTR and show plausible output
> (*expected*)
> # *{color:#ff0000}Unexpected:{color}* any ZooKeeper client trying to connect
> to any of the three nodes observes a "connection timeout", whereas notably
> this is *not* a TCP connect() timeout. The TCP connect() succeeds, but then
> ZK does not seem to send the expected byte sequence to the TCP connection,
> and the ZK client waits for it via recv() until it hits a timeout condition.
> Examples for two different clients:
> ## In Kazoo we specifically hit _Connection time-out: socket time-out during
> read_
> generated here:
> [https://github.com/python-zk/kazoo/blob/88b657a0977161f3815657878ba48f82a97a3846/kazoo/protocol/connection.py#L249]
> ## In zkCli we see _Client session timed out, have not heard from server in
> 15003ms for sessionid 0x0, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn:main-SendThread(localhost:2181))_
> # This state is stable, it will last forever (well, at least for multiple
> hours and we didn't test longer than that).
> # In our system the ZooKeeper clients are crash-looping. They retry. What I
> have observed is that while they retry the ZK ensemble accumulates
> outstanding requests, here shown from MNTR output (emphasis mine):
> zk_packets_received 2008
> zk_packets_sent 127
> zk_num_alive_connections 18
> zk_outstanding_requests *1880*
> # The leader emits log lines confirming session timeout, example:
> _[myid:3] INFO [SessionTracker:ZooKeeperServer@398] - Expiring session
> 0x2000642b18f0020, timeout of 10000ms exceeded [myid:3] INFO
> [SessionTracker:QuorumZooKeeperServer@157] - Submitting global closeSession
> request for session 0x2000642b18f0020_
> # In this state, restarting any one of the two ZK followers results in the
> same state (clients don't get data from ZK upon connect).
> # In this state, restarting the ZK leader, and therefore triggering a leader
> re-election, almost immediately results in all clients being able to connect
> to all ZK instances successfully.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)