[
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045627#comment-16045627
]
ASF GitHub Bot commented on ZOOKEEPER-2775:
-------------------------------------------
GitHub user arshadmohammad opened a pull request:
https://github.com/apache/zookeeper/pull/278
ZOOKEEPER-2775: ZK Client not able to connect with Xid out of order error
Same as PR https://github.com/apache/zookeeper/pull/254 but this is for
branch-3.4
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arshadmohammad/zookeeper
ZOOKEEPER-2775-branch-3.4
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/zookeeper/pull/278.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #278
----
commit 31580024b18d57eff28acb7099bad0606b51efa3
Author: Mohammad Arshad <[email protected]>
Date: 2017-06-10T17:36:34Z
ZOOKEEPER-2775: ZK Client not able to connect with Xid out of order error
----
> ZK Client not able to connect with Xid out of order error
> ----------------------------------------------------------
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Affects Versions: 3.4.10, 3.5.3, 3.6.0
> Reporter: Bhupendra Kumar Jain
> Assignee: Mohammad Arshad
> Priority: Critical
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During Network unreachable scenario in one of the cluster, we observed Xid
> out of order and Nothing in the queue error continously. And ZK client it
> finally not able to connect successully to ZK server.
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect |
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447)
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53
> for a packet with details: clientPath:null serverPath:null finished:false
> header:: 53,101 replyHeader:: 0,0,-4 request::
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
> response:: null
> at
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect
> java.io.IOException: Nothing in the queue, but got 1
> at
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>
> *Analysis:*
> 1) First time Client fails to do SASL login due to network unreachable
> problem.
> 2017-03-29 10:03:59,377 | WARN | [main-SendThread(192.168.130.8:24002)] |
> SASL configuration failed: javax.security.auth.login.LoginException: Network
> is unreachable (sendto failed) Will continue connection to Zookeeper server
> without SASL authentication, if Zookeeper server allows it. |
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307)
> Here the boolean saslLoginFailed becomes true.
> 2) After some time network connection is recovered and client is successully
> able to login but still the boolean saslLoginFailed is not reset to false.
> 3) Now SASL negotiation between client and server start happening and during
> this time no user request will be sent. ( As the socket channel will be
> closed for write till sasl negotiation complets)
> 4) Now response from server for SASL packet will be processed by the client
> and client assumes that tunnelAuthInProgress() is finished ( method checks
> for saslLoginFailed boolean Since the boolean is true it assumes its done.)
> and tries to process the packet as a other packet and will result in above
> errors.
> *Solution:* Reset the saslLoginFailed boolean every time before client login
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)