[ https://issues.apache.org/jira/browse/TRAFODION-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388996#comment-16388996 ]
ASF GitHub Bot commented on TRAFODION-2940:
-------------------------------------------

Github user kevinxu021 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1427#discussion_r172737047

    --- Diff: dcs/src/main/java/org/trafodion/dcs/zookeeper/ZkClient.java ---
    @@ -176,7 +208,19 @@ public ZooKeeper getZk() {
         public void process(WatchedEvent event) {
             if(event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                 connectedSignal.countDown();
    -        }
    +        } else if (event.getState() == Watcher.Event.KeeperState.Expired) {
    +            LOG.info("session expired. now rebuilding");
    +            // session expired, may be never happending. but if it happen there
    +            // need to close old client and rebuild new client
    +            try {
    +                connect(true);
    +            } catch (IOException e) {
    +                setSessionRecoverSuccessful(false);
    +                LOG.error("session expired and throw IOException while do reconnect: " + e.getMessage());
    +            } catch (InterruptedException e) {
    +                LOG.error("session expired and throw InterruptedException while do reconnect: " + e.getMessage());
    --- End diff --

    The same as above.


> In HA env, one node lose network, when recover, trafci can't use
> ----------------------------------------------------------------
>
>                 Key: TRAFODION-2940
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2940
>             Project: Apache Trafodion
>          Issue Type: Bug
>    Affects Versions: any
>            Reporter: mashengchen
>            Assignee: mashengchen
>            Priority: Major
>             Fix For: 2.3
>
>
> In HA env, if one node lose network for a long time , once network recover,
> there will have two floating ip, two working dcs master, and trafci can't be
> use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)