ZhangHB created HDFS-16914:
------------------------------
Summary: Add some logs for updateBlockForPipeline RPC.
Key: HDFS-16914
URL: https://issues.apache.org/jira/browse/HDFS-16914
Project: Hadoop HDFS
Issue Type: Improvement
Components: namanode
Affects Versions: 3.3.4
Reporter: ZhangHB
Assignee: ZhangHB
Recently, we received a phone alarm about missing blocks. On one of the
datanodes where the block was placed, we found the following logs:
{code:java}
2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231832415 src:
/clientAddress:44638 dest: /localAddress:50010 of size 45733720
2023-02-09 15:05:10,376 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
Received BP-578784987-x.x.x.x-1667291826362:blk_1305044966_231826462 src:
/upStreamDatanode:60316 dest: /localAddress:50010 of size 45733720 {code}
The datanode received the same block with two different generation stamps
because of a socket timeout exception. blk_1305044966_231826462 was received
from the upstream datanode in a pipeline which has two datanodes.
blk_1305044966_231832415 was received directly from the client.
We searched all log entries for blk_1305044966 on the namenode and on the three
datanodes in the original pipeline, but could not find any helpful message
about the generation stamp 231826462. After diving into the source code, we
found that it was assigned in NameNodeRpcServer#updateBlockForPipeline, which
is invoked from DataStreamer#setupPipelineInternal. The updateBlockForPipeline
RPC does not log anything, so I think we should add some logs to this RPC.
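
As a rough illustration of the kind of log line that would have helped here,
below is a minimal standalone sketch. It does not use the real Hadoop classes:
the class PipelineUpdateLogSketch, the helper formatMessage, the message
wording, and the sample values for the pool id, client name and generation
stamps are all hypothetical. The only point is that the old and new generation
stamps and the client name appear together in a single namenode log entry, so
that a generation stamp can be traced back to the RPC that assigned it.

```java
import java.util.logging.Logger;

// Hypothetical stand-in; this is NOT NameNodeRpcServer or FSNamesystem.
class PipelineUpdateLogSketch {
    private static final Logger LOG =
        Logger.getLogger(PipelineUpdateLogSketch.class.getName());

    // Builds the message an updateBlockForPipeline-style call could log.
    // The wording is an assumption, not the format used by Hadoop.
    static String formatMessage(String poolId, long blockId,
                                long oldGenStamp, long newGenStamp,
                                String clientName) {
        return "updateBlockForPipeline(" + poolId + ":blk_" + blockId
            + "_" + oldGenStamp + ", client=" + clientName
            + ") assigned new generation stamp " + newGenStamp;
    }

    // Simulates the RPC: logs the transition and returns the new stamp.
    static long updateBlockForPipeline(String poolId, long blockId,
                                       long oldGenStamp, long newGenStamp,
                                       String clientName) {
        LOG.info(formatMessage(poolId, blockId, oldGenStamp,
                               newGenStamp, clientName));
        return newGenStamp;
    }

    public static void main(String[] args) {
        // Placeholder values, chosen only for illustration.
        updateBlockForPipeline("BP-example", 1305044966L,
                               1000L, 1001L, "client-1");
    }
}
```

With an entry like this in the namenode log, searching for the block id or for
the new generation stamp would show which client requested the pipeline update
and when.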