[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844592#comment-13844592 ]
Ted Yu commented on HBASE-10000:
--------------------------------

There was one unclosed log, shown below.
{code}
2013-12-10 19:13:22,630 INFO [MASTER_SERVER_OPERATIONS-hor13n02:60000-2] master.SplitLogManager: started splitting 2 logs in [hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/WALs/hor13n05.gq1.ygridcore.net,60020,1386702460286-splitting]
2013-12-10 19:13:22,636 INFO [pool-19-thread-2] util.FSHDFSUtils: recoverLease=false, attempt=0 on file=hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/WALs/hor13n05.gq1.ygridcore.net,60020,1386702460286-splitting/hor13n05.gq1.ygridcore.net%2C60020%2C1386702460286.1386702750923 after 1386702802636ms
2013-12-10 19:13:22,636 INFO [pool-19-thread-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/WALs/hor13n05.gq1.ygridcore.net,60020,1386702460286-splitting/hor13n05.gq1.ygridcore.net%2C60020%2C1386702460286.1386702686049 after 1386702802636ms
2013-12-10 19:13:22,650 INFO [hor13n02.gq1.ygridcore.net,60000,1386702649564.splitLogManagerTimeoutMonitor] master.SplitLogManager: resubmitting task /hbase/splitWAL/WALs%2Fhor13n04.gq1.ygridcore.net%2C60020%2C1386702205712-splitting%2Fhor13n04.gq1.ygridcore.net%252C60020%252C1386702205712.1386702769323
{code}
Log from NN to follow.
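For readers following the log, the two util.FSHDFSUtils lines come from separate pool threads, each issuing recoverLease on a different WAL file at the same moment. A minimal sketch of that pattern follows. It is not the attached patch: the class name, the LeaseRecoverer interface, and recoverAll are hypothetical names chosen for illustration, and LeaseRecoverer only stands in for the real HDFS call.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentLeaseRecoverySketch {

    /** Hypothetical stand-in for the HDFS call; returns true if the file is already closed. */
    interface LeaseRecoverer {
        boolean recoverLease(String walPath) throws Exception;
    }

    /**
     * Submits one recoverLease request per WAL file to a fixed-size pool and
     * returns, per file, whether the file was reported closed on that attempt.
     */
    static Map<String, Boolean> recoverAll(List<String> walPaths,
                                           LeaseRecoverer recoverer,
                                           int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every request before waiting on any of them, so the
            // requests run concurrently rather than one after another.
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            for (String path : walPaths) {
                Callable<Boolean> task = () -> recoverer.recoverLease(path);
                pending.put(path, pool.submit(task));
            }
            // Collect the results in submission order.
            Map<String, Boolean> closed = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> e : pending.entrySet()) {
                try {
                    closed.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for lease recovery", ie);
                } catch (ExecutionException ee) {
                    throw new IllegalStateException("lease recovery failed for " + e.getKey(), ee.getCause());
                }
            }
            return closed;
        } finally {
            pool.shutdown();
        }
    }
}
```

A file that maps to false here corresponds to the recoverLease=false line in the log: it is still open, so a split worker would need to check again before reading it.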
> Initiate lease recovery for outstanding WAL files at the very beginning of recovery
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-10000
>                 URL: https://issues.apache.org/jira/browse/HBASE-10000
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.98.1
>
>         Attachments: 10000-0.96-v5.txt, 10000-0.96-v6.txt, 10000-recover-ts-with-pb-2.txt, 10000-recover-ts-with-pb-3.txt, 10000-recover-ts-with-pb-4.txt, 10000-recover-ts-with-pb-5.txt, 10000-recover-ts-with-pb-6.txt, 10000-v4.txt, 10000-v5.txt, 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is closed.
> Thanks to Nicolas Liochon and Jeffery; discussion with them gave rise to this idea.

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)