[
https://issues.apache.org/jira/browse/HBASE-16056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336363#comment-15336363
]
Stephen Yuan Jiang commented on HBASE-16056:
--------------------------------------------
+1 LGTM
> Procedure v2 - fix master crash for FileNotFound
> ------------------------------------------------
>
> Key: HBASE-16056
> URL: https://issues.apache.org/jira/browse/HBASE-16056
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.5
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6
>
> Attachments: HBASE-16056-v0.patch, HBASE-16056-v1.patch,
> HBASE-16056-v2.patch
>
>
> [~syuanjiang] and [~tedyu] reported a backup master that was unable to start
> because of a FileNotFoundException during proc-v2 lease recovery. (Another
> restart should have solved the problem.)
> {noformat}
> FileNotFoundException: File does not exist: /hbase/MasterProcWALs/state-000001.log
>   namenode.INodeFile.valueOf(INodeFile.java:61)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2877)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:753)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:671)
>
> {noformat}
> This may happen when the previously active master is still alive (e.g. stalled
> in GC) and tries to remove files while the backup master tries to become
> active. The operation is retryable, so the code should be able to handle that.
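
For illustration only, the retry idea described above could be sketched as
follows. This is not the attached patch: the LogLister and LeaseRecoverer
interfaces, the recoverWithRetry name, and the maxAttempts parameter are all
hypothetical stand-ins for the real file-system calls. The sketch assumes that
a FileNotFoundException during lease recovery means another master removed the
log after it was listed, so the logs are listed again and recovery is retried.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class LeaseRecoverySketch {
    /** Hypothetical stand-in for listing the procedure WAL files. */
    interface LogLister {
        List<String> list() throws IOException;
    }

    /** Hypothetical stand-in for the per-file lease recovery call. */
    interface LeaseRecoverer {
        void recoverLease(String path) throws IOException;
    }

    /**
     * Lists the logs and recovers the lease on each one. If a log disappears
     * between listing and recovery, the listing is refreshed and the whole
     * pass is retried, up to maxAttempts passes.
     */
    static List<String> recoverWithRetry(LogLister lister, LeaseRecoverer fs,
                                         int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<String> logs = lister.list();
            try {
                for (String log : logs) {
                    fs.recoverLease(log);
                }
                return logs; // every listed log still existed
            } catch (FileNotFoundException e) {
                // Assumption: the other master removed the file concurrently.
                last = e;
            }
        }
        throw last; // still failing after maxAttempts passes
    }

    public static void main(String[] args) throws IOException {
        int[] listings = {0};
        // First listing contains a log that is already gone; the second does not.
        LogLister lister = () -> listings[0]++ == 0
            ? Arrays.asList("state-000001.log", "state-000002.log")
            : Arrays.asList("state-000002.log");
        LeaseRecoverer fs = path -> {
            if (path.equals("state-000001.log")) {
                throw new FileNotFoundException("File does not exist: " + path);
            }
        };
        System.out.println("Recovered: " + recoverWithRetry(lister, fs, 3));
    }
}
```

Bounding the number of passes keeps a persistently failing store from looping
forever; whether a bound is appropriate for the real master startup path is a
separate design question.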