[ https://issues.apache.org/jira/browse/TRAFODION-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514290#comment-15514290 ]
ASF GitHub Bot commented on TRAFODION-2236:
-------------------------------------------

GitHub user sbroeder opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/723

    JIRA [TRAFODION-2236] TM crashes during startup

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sbroeder/incubator-trafodion sean_traf3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/723.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #723

----
commit a84e0527571e920a08a577aba50fc4741078fe45
Author: Sean Broeder <sbroeder@edev05.esgyn.local>
Date:   2016-09-22T19:38:16Z

    JIRA [TRAFODION-2236] TM crashes during startup

----

> TM crashes following sqstart
> ----------------------------
>
>                 Key: TRAFODION-2236
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2236
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: dtm
>    Affects Versions: 2.0-incubating
>            Reporter: Sean Broeder
>            Assignee: Sean Broeder
>             Fix For: 2.1-incubating
>
>
> When Trafodion is stopped abruptly while a region server has current recovery
> requests posted in ZooKeeper, the new TMs may be unable to start. This
> happens because the TM recovery thread reads the ZK entries and attempts to
> send the recovery resolution to the region server that posted the entry. It
> gets a connection error because that region server no longer exists.
> The partial solution is to remove the ZK entries as part of startup so the TM
> can start up without error.
> This is safe to do because any region server needing recovery will repost to
> ZooKeeper, and the TM will have no issues connecting to this RS.
> An additional fix will be made to the TM to handle exceptions in trying to
> communicate with region servers during recovery.
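The startup cleanup described in the issue can be illustrated with a minimal sketch. This is not the code from the pull request: the class `RecoveryStartupSketch`, the in-memory `RecoveryStore` that stands in for the ZooKeeper subtree, the method `purgeAtStartup`, and the znode paths are all hypothetical names chosen for illustration. It only shows the idea that entries left behind by region servers that no longer exist are removed before the recovery thread runs, and that a live region server can repost afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RecoveryStartupSketch {

    /** Hypothetical in-memory stand-in for the ZooKeeper subtree of recovery requests. */
    static final class RecoveryStore {
        // Key: znode path (hypothetical layout). Value: region server that posted it.
        private final Map<String, String> entries = new TreeMap<>();

        void post(String path, String regionServer) {
            entries.put(path, regionServer);
        }

        /** Returns every posted path that starts with the given prefix. */
        List<String> list(String prefix) {
            List<String> out = new ArrayList<>();
            for (String path : entries.keySet()) {
                if (path.startsWith(prefix)) {
                    out.add(path);
                }
            }
            return out;
        }

        void delete(String path) {
            entries.remove(path);
        }

        int size() {
            return entries.size();
        }
    }

    /**
     * Removes every recovery entry under the prefix and returns how many were
     * removed. Assumed to run during TM startup, before the recovery thread
     * reads the entries, so it never tries to contact a vanished region server.
     */
    static int purgeAtStartup(RecoveryStore store, String prefix) {
        List<String> stale = store.list(prefix);
        for (String path : stale) {
            store.delete(path);
        }
        return stale.size();
    }

    public static void main(String[] args) {
        RecoveryStore store = new RecoveryStore();
        // Entries left over from an abrupt stop (hypothetical paths and hosts).
        store.post("/recovery/TM0/region-a", "rs-old-1");
        store.post("/recovery/TM0/region-b", "rs-old-2");

        int removed = purgeAtStartup(store, "/recovery/TM0/");
        System.out.println("Removed " + removed + " stale entries at startup");

        // A region server that still needs recovery reposts its request.
        store.post("/recovery/TM0/region-a", "rs-new-1");
        System.out.println("Entries after repost: " + store.size());
    }
}
```

Against a real ZooKeeper ensemble, the list and delete steps would go through the ZooKeeper client instead of a map. The sketch also leaves out the second fix mentioned in the issue, handling connection exceptions in the recovery thread.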