[ https://issues.apache.org/jira/browse/HBASE-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263851#comment-13263851 ]
Nicolas Spiegelberg commented on HBASE-5860:
--------------------------------------------

+1

> splitlogmanager should not unnecessarily resubmit tasks when zk unavailable
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5860
>                 URL: https://issues.apache.org/jira/browse/HBASE-5860
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Prakash Khemani
>            Assignee: Prakash Khemani
>         Attachments: 0001-HBASE-5860-splitlogmanager-should-not-unnecessarily-.patch
>
>
> (Doesn't really impact the run time or correctness of log splitting)
> Say the master has lost its connection to zk. splitlogmanager's timeoutmanager will see that all the tasks that were submitted are still unassigned, and it will resubmit those tasks (i.e. create dummy znodes).
> splitlogmanager should realize that the tasks are unassigned only because their znodes have not been created yet.
> 2012-04-20 13:11:20,516 INFO org.apache.hadoop.hbase.master.SplitLogManager: dead splitlog worker msgstore295.snc4.facebook.com,60020,1334948757026
> 2012-04-20 13:11:20,517 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: Scheduling batch of logs to split
> 2012-04-20 13:11:20,517 INFO org.apache.hadoop.hbase.master.SplitLogManager: started splitting logs in [hdfs://msgstore215.snc4.facebook.com:9000/MSGSTORE215-SNC4-HBASE/.logs/msgstore295.snc4.facebook.com,60020,1334948757026-splitting]
> 2012-04-20 13:11:20,565 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server msgstore235.snc4.facebook.com/10.30.222.186:2181
> 2012-04-20 13:11:20,566 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to msgstore235.snc4.facebook.com/10.30.222.186:2181, initiating session
> 2012-04-20 13:11:20,575 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned = 4
> 2012-04-20 13:11:20,576 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
> 2012-04-20 13:11:21,577 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
> 2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x36ccb0f8010002, likely server has closed socket, closing socket connection and attempting reconnect
> 2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x136ccb0f4890000, likely server has closed socket, closing socket connection and attempting reconnect
> 2012-04-20 13:11:21,786 WARN org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc =CONNECTIONLOSS for /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951586677 retry=3
> 2012-04-20 13:11:21,786 WARN org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc =CONNECTIONLOSS for /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951920332 retry=3
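
For readers following the issue, the distinction the description asks for (a task that is unassigned because no worker has claimed it, versus a task that is unassigned only because its znode was never created) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the attached patch and not the actual SplitLogManager code: the class `SplitTaskTracker`, its `Task` fields, and all method names are hypothetical, and time is passed in explicitly so the logic stays easy to test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model for illustration only.
// It is NOT the HBase SplitLogManager implementation.
final class SplitTaskTracker {

    /** Per-task state kept by the master (illustrative fields only). */
    static final class Task {
        boolean znodeCreated;   // true once the async znode create has succeeded
        String worker;          // null while no worker has claimed the task
        long lastUpdateMillis;  // last time this task's state changed
    }

    private final Map<String, Task> tasks = new LinkedHashMap<>();
    private final long timeoutMillis;

    SplitTaskTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Registers a task whose znode create was issued but not yet acknowledged. */
    void submit(String path, long nowMillis) {
        Task t = new Task();
        t.lastUpdateMillis = nowMillis;
        tasks.put(path, t);
    }

    /** Called when the create callback reports success for this path. */
    void onZnodeCreated(String path, long nowMillis) {
        Task t = tasks.get(path);
        if (t != null) {
            t.znodeCreated = true;
            t.lastUpdateMillis = nowMillis;
        }
    }

    /** Returns the paths a timeout check should resubmit. */
    List<String> tasksToResubmit(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Task> e : tasks.entrySet()) {
            Task t = e.getValue();
            boolean unassigned = (t.worker == null);
            boolean timedOut = nowMillis - t.lastUpdateMillis >= timeoutMillis;
            // A task whose znode does not exist yet (for example because the
            // create is still retrying after a connection loss) cannot have
            // been picked up by any worker, so resubmitting it is pointless.
            if (t.znodeCreated && unassigned && timedOut) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With a check of this shape, a timeout pass that runs while every create is still failing with CONNECTIONLOSS would find nothing to resubmit, instead of logging "resubmitting unassigned task(s) after timeout" once per pass as in the log above.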