Daryn Sharp created HDFS-12914:
----------------------------------
Summary: Block report leases cause missing blocks until next report
Key: HDFS-12914
URL: https://issues.apache.org/jira/browse/HDFS-12914
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.8.0
Reporter: Daryn Sharp
Priority: Critical
{{BlockReportLeaseManager#checkLease}} rejects full block reports (FBRs) from
DNs for conditions such as "unknown datanode", "not in pending set", "lease
has expired", a wrong lease id, etc. Lease rejection does not throw an
exception. It returns false, which bubbles up to
{{NameNodeRpcServer#blockReport}} and is interpreted as {{noStaleStorages}}.

A re-registering node whose FBR is rejected because of an invalid lease
becomes active with _no blocks_. A replication storm ensues, possibly causing
DNs to temporarily go dead (HDFS-12645), leading to more FBR lease rejections
on re-registration. The cluster will have many "missing blocks" until each
DN's next FBR is sent and/or forced.
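
The failure mode described above can be illustrated with a small toy model.
This is only a sketch and not the Hadoop code: the class
LeaseRejectionSketch, its maps, and the method signatures are hypothetical
stand-ins chosen for illustration. It assumes that a rejected lease and a
processed report share a single boolean return value, so the caller cannot
tell "report dropped" apart from an ordinary false result.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of this is Hadoop code. */
public class LeaseRejectionSketch {
    // Hypothetical stand-ins for NameNode state, keyed by datanode name.
    private final Map<String, Long> expectedLease = new HashMap<>();
    private final Map<String, Integer> blockCounts = new HashMap<>();

    /** (Re-)registration: the node becomes active with no blocks recorded. */
    public void register(String node, long leaseId) {
        expectedLease.put(node, leaseId);
        blockCounts.put(node, 0);
    }

    /** A bad lease yields false rather than an exception. */
    boolean checkLease(String node, long leaseId) {
        Long expected = expectedLease.get(node);
        return expected != null && expected == leaseId;
    }

    /** The returned flag doubles as "no stale storages" for the caller. */
    public boolean blockReport(String node, long leaseId, int reportedBlocks) {
        if (!checkLease(node, leaseId)) {
            return false; // report dropped; the caller cannot tell why
        }
        blockCounts.put(node, reportedBlocks);
        boolean noStaleStorages = true; // assumed for this sketch
        return noStaleStorages;
    }

    /** Number of blocks the model currently attributes to the node. */
    public int blocksKnown(String node) {
        return blockCounts.getOrDefault(node, 0);
    }

    public static void main(String[] args) {
        LeaseRejectionSketch nn = new LeaseRejectionSketch();
        nn.register("dn1", 42L);

        // FBR sent with a lease id the model does not expect.
        boolean first = nn.blockReport("dn1", 7L, 1000);
        System.out.println("rejected report: result=" + first
            + ", blocks known=" + nn.blocksKnown("dn1"));

        // A later FBR with the expected lease id restores the block count.
        boolean second = nn.blockReport("dn1", 42L, 1000);
        System.out.println("accepted report: result=" + second
            + ", blocks known=" + nn.blocksKnown("dn1"));
    }
}
```

In this model the rejected report leaves dn1 registered with zero known
blocks, and nothing signals the sender that the report was dropped. Only a
later report carrying the expected lease id fills in the block count.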