[
https://issues.apache.org/jira/browse/HDFS-17188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shilun Fan updated HDFS-17188:
------------------------------
Target Version/s: 3.4.1 (was: 3.4.0)
> Data loss in our production clusters due to missing HDFS-16540
> ---------------------------------------------------------------
>
> Key: HDFS-17188
> URL: https://issues.apache.org/jira/browse/HDFS-17188
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Rushabh Shah
> Assignee: Rushabh Shah
> Priority: Major
>
> Recently we saw missing blocks in our production clusters running in dynamic
> environments like AWS. We are running a version of the hadoop-2.10 code line.
> Events that led to data loss:
> # We have a pool of available IP addresses, and whenever a datanode restarts
> it picks any available IP address from that pool.
> # We have seen that, during the lifetime of the namenode process, multiple
> datanodes were restarted and the same datanode ended up using different IP
> addresses.
> # One case that I was debugging was particularly interesting.
> The DN with datanode UUID DN-UUID-1 moved from ip-address-1 --> ip-address-2 -->
> ip-address-3.
> The DN with datanode UUID DN-UUID-2 moved from ip-address-4 --> ip-address-5 -->
> ip-address-1.
> Observe the last IP address change for DN-UUID-2: it is ip-address-1, which was
> the first IP address of DN-UUID-1.
> # There was a bug in our operational script which led to all datanodes
> getting restarted at the same time.
> Just after the restart, we saw the following log lines.
> {noformat}
> 2023-08-26 04:04:41,964 INFO [on default port 9000] namenode.NameNode -
> BLOCK* registerDatanode: 10.x.x.1:50010
> 2023-08-26 04:04:45,720 INFO [on default port 9000] namenode.NameNode -
> BLOCK* registerDatanode: 10.x.x.2:50010
> 2023-08-26 04:04:51,680 INFO [on default port 9000] namenode.NameNode -
> BLOCK* registerDatanode: 10.x.x.3:50010
> 2023-08-26 04:04:55,328 INFO [on default port 9000] namenode.NameNode -
> BLOCK* registerDatanode: 10.x.x.4:50010
> {noformat}
> This line is logged
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1184].
> Snippet below:
> {code:java}
> DatanodeDescriptor nodeS = getDatanode(nodeReg.getDatanodeUuid());
> DatanodeDescriptor nodeN = host2DatanodeMap.getDatanodeByXferAddr(
>     nodeReg.getIpAddr(), nodeReg.getXferPort());
>
> if (nodeN != null && nodeN != nodeS) {
>   NameNode.LOG.info("BLOCK* registerDatanode: " + nodeN);
>   // nodeN previously served a different data storage,
>   // which is not served by anybody anymore.
>   removeDatanode(nodeN);
>   // physically remove node from datanodeMap
>   wipeDatanode(nodeN);
>   nodeN = null;
> }
> {code}
>
> This happens when the DatanodeDescriptor for the same datanode is not the same
> object in datanodeMap and host2DatanodeMap. HDFS-16540 fixed this bug, but the
> symptom there was lost data locality rather than data loss. :)
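> To make the failure mode concrete, here is a minimal, self-contained sketch of
> the two maps. It is a toy model, not the real DatanodeManager: the class,
> field, and method names below are illustrative, and the stale-entry behaviour
> in the re-registration branch is a simplified stand-in for the bookkeeping
> that HDFS-16540 corrected.
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> // Toy model of the namenode's two registration maps:
> //   datanodeMap      : datanode UUID -> descriptor
> //   host2DatanodeMap : ip:port       -> descriptor
> public class RegistrationModel {
>
>   static final class Descriptor {
>     final String uuid;
>     String xferAddr;
>     Descriptor(String uuid, String xferAddr) {
>       this.uuid = uuid;
>       this.xferAddr = xferAddr;
>     }
>     @Override
>     public String toString() {
>       return uuid + " @ " + xferAddr;
>     }
>   }
>
>   private final Map<String, Descriptor> datanodeMap = new HashMap<>();
>   private final Map<String, Descriptor> host2DatanodeMap = new HashMap<>();
>
>   /** Returns the descriptor that the removeDatanode() branch would wipe, or null. */
>   Descriptor register(String uuid, String newAddr) {
>     Descriptor nodeS = datanodeMap.get(uuid);          // lookup by UUID
>     Descriptor nodeN = host2DatanodeMap.get(newAddr);  // lookup by ip:port
>
>     if (nodeN != null && nodeN != nodeS) {
>       // Same branch as the snippet above: the address appears to belong to a
>       // different datanode, so that datanode is removed and wiped, even if it
>       // is still alive under another address.
>       datanodeMap.remove(nodeN.uuid);
>       host2DatanodeMap.remove(nodeN.xferAddr);
>       return nodeN;
>     }
>
>     if (nodeS == null) {                               // brand-new datanode
>       Descriptor d = new Descriptor(uuid, newAddr);
>       datanodeMap.put(uuid, d);
>       host2DatanodeMap.put(newAddr, d);
>       return null;
>     }
>
>     // Re-registration of a known datanode under a new address. The simulated
>     // bug: the entry for the old address is left behind in host2DatanodeMap,
>     // so the two maps no longer agree.
>     nodeS.xferAddr = newAddr;
>     host2DatanodeMap.put(newAddr, nodeS);
>     return null;
>   }
>
>   /** True if every host2DatanodeMap entry points at the descriptor registered for its UUID. */
>   boolean mapsConsistent() {
>     for (Descriptor d : host2DatanodeMap.values()) {
>       if (datanodeMap.get(d.uuid) != d) {
>         return false;
>       }
>     }
>     return true;
>   }
>
>   public static void main(String[] args) {
>     RegistrationModel nn = new RegistrationModel();
>     // DN-UUID-1: ip-address-1 --> ip-address-2 --> ip-address-3
>     nn.register("DN-UUID-1", "ip1:50010");
>     nn.register("DN-UUID-1", "ip2:50010");
>     nn.register("DN-UUID-1", "ip3:50010");
>     // DN-UUID-2: ip-address-4 --> ip-address-5 --> ip-address-1
>     nn.register("DN-UUID-2", "ip4:50010");
>     nn.register("DN-UUID-2", "ip5:50010");
>     Descriptor wiped = nn.register("DN-UUID-2", "ip1:50010");
>     // ip1:50010 is still mapped to DN-UUID-1's descriptor, which is alive at
>     // ip3:50010, so the live DN-UUID-1 is removed and its replicas are no
>     // longer known to the namenode.
>     System.out.println("removed on re-use of ip1: " + wiped);      // DN-UUID-1 @ ip3:50010
>     System.out.println("maps consistent: " + nn.mapsConsistent()); // false
>   }
> }
> {code}
>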
> By filing this jira, I want to discuss the following things:
> # Do we really want to call the removeDatanode method from the namenode
> whenever such a discrepancy between the maps is spotted, or can we rely on the
> first full block report or the periodic full block report from the datanode to
> fix the metadata?
> # Improve logging in the blockmanagement code to debug these issues faster.
> # Add a test case with the exact events that occurred in our environment and
> make sure that datanodeMap and host2DatanodeMap stay consistent (a rough sketch
> of such a test follows this list).
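> For point 3, a test along these lines could replay the exact registration
> sequence and assert that no live datanode is removed and that the two maps
> stay consistent. This sketch drives the toy RegistrationModel above for
> readability; a real test would drive DatanodeManager.registerDatanode()
> instead, and the JUnit wiring here is only illustrative.
> {code:java}
> import static org.junit.jupiter.api.Assertions.assertNull;
> import static org.junit.jupiter.api.Assertions.assertTrue;
>
> import org.junit.jupiter.api.Test;
>
> // Assumes the same package as the toy RegistrationModel above.
> public class TestAddressReuseRegistration {
>
>   @Test
>   public void ipAddressReuseMustNotRemoveLiveDatanode() {
>     RegistrationModel nn = new RegistrationModel();
>
>     // DN-UUID-1 walks through three addresses and stays alive on the last one.
>     nn.register("DN-UUID-1", "ip1:50010");
>     nn.register("DN-UUID-1", "ip2:50010");
>     nn.register("DN-UUID-1", "ip3:50010");
>
>     // DN-UUID-2 ends up on DN-UUID-1's first address.
>     nn.register("DN-UUID-2", "ip4:50010");
>     nn.register("DN-UUID-2", "ip5:50010");
>     RegistrationModel.Descriptor removed = nn.register("DN-UUID-2", "ip1:50010");
>
>     // Desired behaviour; both assertions fail against the buggy toy model,
>     // which is exactly what such a test is meant to catch.
>     assertNull(removed, "a live datanode must not be removed when an IP is re-used");
>     assertTrue(nn.mapsConsistent(), "datanodeMap and host2DatanodeMap must agree");
>   }
> }
> {code}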