[ 
https://issues.apache.org/jira/browse/SPARK-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659548#comment-14659548
 ] 

zengqiuyang commented on SPARK-9629:
------------------------------------

ZK is running well.
What I have verified: if I change the ZK server's system time by more than
the negotiated session timeout, the session is shut down.
My ZK servers had clock fluctuations before. The clocks are now stable and I
am checking whether the problem persists. Waiting for the exception to appear
again.
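
A minimal sketch of why a clock step could look like a session timeout,
assuming elapsed time is computed from wall-clock readings. This is not
ZooKeeper's code: the isExpired helper and the timestamps are hypothetical,
and only the 40000 ms timeout is taken from the log in the issue description.

```java
class SessionExpiryCheck {
    // Hypothetical helper, not a ZooKeeper API: a session counts as expired
    // when more than timeoutMs has elapsed since the last heartbeat.
    static boolean isExpired(long lastHeardMs, long nowMs, long timeoutMs) {
        return nowMs - lastHeardMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 40_000;         // negotiated timeout from the log
        long lastHeardWall = 1_000_000;  // made-up wall-clock reading (ms)
        long lastHeardMono = 5_000;      // made-up monotonic reading (ms)

        // One second of real time passes, but the wall clock is stepped
        // forward by 60 seconds in the meantime.
        long nowWall = lastHeardWall + 1_000 + 60_000;
        long nowMono = lastHeardMono + 1_000;

        System.out.println("wall clock: expired="
                + isExpired(lastHeardWall, nowWall, timeoutMs));
        System.out.println("monotonic:  expired="
                + isExpired(lastHeardMono, nowMono, timeoutMs));
    }
}
```

In a real process the two readings would come from System.currentTimeMillis()
and System.nanoTime(). Only the former is affected when the system time is
changed.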

Another thing: the two sessions are always lost at the same time.
I would like it to try to reconnect (RECONNECTED) first, rather than change
to SUSPENDED and lose leadership immediately.
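
The behaviour I am asking for could look roughly like the sketch below. It is
not Spark's or Curator's implementation: LeadershipGuard, its grace period and
the tick method are hypothetical, and only the state names (SUSPENDED,
RECONNECTED, LOST) come from the logs. Leadership is kept while the connection
is SUSPENDED and given up only on LOST or when no reconnect happens within
the grace period.

```java
class LeadershipGuard {
    // State names mirror the ones printed by ConnectionStateManager in the log.
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final long graceMs;        // hypothetical tolerance for SUSPENDED
    private boolean leader = true;
    private long suspendedAtMs = -1;   // -1 means "not currently suspended"

    LeadershipGuard(long graceMs) {
        this.graceMs = graceMs;
    }

    // Feed a connection state change together with a monotonic timestamp (ms).
    void onStateChange(State state, long nowMs) {
        switch (state) {
            case SUSPENDED:
                // Remember when the suspension started, but keep leadership.
                if (suspendedAtMs < 0) {
                    suspendedAtMs = nowMs;
                }
                break;
            case CONNECTED:
            case RECONNECTED:
                // A late reconnect still counts as a missed grace period.
                tick(nowMs);
                suspendedAtMs = -1;
                break;
            case LOST:
                leader = false;
                suspendedAtMs = -1;
                break;
        }
    }

    // Call periodically: revokes leadership once the grace period has passed.
    void tick(long nowMs) {
        if (suspendedAtMs >= 0 && nowMs - suspendedAtMs > graceMs) {
            leader = false;
            suspendedAtMs = -1;
        }
    }

    boolean isLeader() {
        return leader;
    }

    public static void main(String[] args) {
        LeadershipGuard guard = new LeadershipGuard(10_000);
        guard.onStateChange(State.SUSPENDED, 0);
        guard.onStateChange(State.RECONNECTED, 200);
        System.out.println("leader after quick reconnect: " + guard.isLeader());
    }
}
```

The grace period would have to stay well below the negotiated session timeout
(40000 ms here). Otherwise the old master could still believe it is leader
after the session has expired and another master has been elected.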

The ZK server log:

2015-08-03 20:18:59,828 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x34ee39684b70002, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-08-03 20:18:59,829 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.0.146:37829 which had sessionid 0x34ee39684b70002
2015-08-03 20:19:00,252 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x34ee39684b70001, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)
2015-08-03 20:19:00,253 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.0.146:37828 which had sessionid 0x34ee39684b70001





>  Client session timed out, have not heard from server in
> --------------------------------------------------------
>
>                 Key: SPARK-9629
>                 URL: https://issues.apache.org/jira/browse/SPARK-9629
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.4.0, 1.4.1
>         Environment: spark1.4.1    ./make-distribution.sh --tgz 
> -Dhadoop.version=2.5.2 -Dyarn.version=2.5.2 -Phive -Phive-thriftserver  
> -Pyarn  
> zookeeper-3.4.6.tar.gz 
> standalone HA
> Linux version 2.6.32-358.el6.x86_64 ([email protected]) (gcc 
> version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri Feb 22 00:31:26 
> UTC 2013
>            Reporter: zengqiuyang
>            Priority: Critical
>
> Spark standalone HA runs for a few days, and then "Client session timed out"
> appears.
> The log shows a reconnect, but the master does not recover and shuts down.
> logs:
>  15/08/05 05:32:57 INFO zookeeper.ClientCnxn: Client session timed out, have 
> not heard from server in 37753ms for sessionid 0x34ee39684b70005, closing 
> socket connection and attempting reconnect
> 15/08/05 05:32:57 INFO state.ConnectionStateManager: State change: SUSPENDED
> 15/08/05 05:32:57 WARN state.ConnectionStateManager: There are no 
> ConnectionStateListeners registered.
> 15/08/05 05:32:57 INFO zookeeper.ClientCnxn: Opening socket connection to 
> server h5/192.168.0.18:2181. Will not attempt to authenticate using SASL 
> (unknown error)
> 15/08/05 05:32:57 INFO zookeeper.ClientCnxn: Socket connection established to 
> h5/192.168.0.18:2181, initiating session
> 15/08/05 05:32:57 INFO zookeeper.ClientCnxn: Session establishment complete 
> on server h5/192.168.0.18:2181, sessionid = 0x34ee39684b70005, negotiated 
> timeout = 40000
> 15/08/05 05:32:57 INFO state.ConnectionStateManager: State change: RECONNECTED
> 15/08/05 05:32:57 WARN state.ConnectionStateManager: There are no 
> ConnectionStateListeners registered.
> 15/08/05 05:32:58 INFO zookeeper.ClientCnxn: Client session timed out, have 
> not heard from server in 37753ms for sessionid 0x34ee39684b70006, closing 
> socket connection and attempting reconnect
> 15/08/05 05:32:58 INFO state.ConnectionStateManager: State change: SUSPENDED
> 15/08/05 05:32:58 INFO master.ZooKeeperLeaderElectionAgent: We have lost 
> leadership
> 15/08/05 05:32:58 ERROR master.Master: Leadership has been revoked -- master 
> shutting down.
> 15/08/05 05:32:58 INFO util.Utils: Shutdown hook called



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

