[
https://issues.apache.org/jira/browse/HDFS-17154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753114#comment-17753114
]
ASF GitHub Bot commented on HDFS-17154:
---------------------------------------
zhangshuyan0 opened a new pull request, #5941:
URL: https://github.com/apache/hadoop/pull/5941
In the method `updateBlockForPipeline`, NameNode uses the
`BlockUnderConstructionFeature` of a BlockInfo to generate the member
`blockIndices` of `LocatedStripedBlock`.
https://github.com/apache/hadoop/blob/b6edcb9a84ceac340c79cd692637b3e11c997cc5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L5308-L5319
It then uses `blockIndices` to generate block tokens for the client.
https://github.com/apache/hadoop/blob/b6edcb9a84ceac340c79cd692637b3e11c997cc5/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L1618-L1632
However, after a failover, the location info in
`BlockUnderConstructionFeature` may be incomplete, so the block tokens for
the missing indices are absent.
When the client receives this incomplete set of block tokens, it throws an NPE
because `updatedBlks[i]` is null (line 825).
https://github.com/apache/hadoop/blob/b6edcb9a84ceac340c79cd692637b3e11c997cc5/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedOutputStream.java#L820-L828
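The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Hadoop code: `StripedTokenGapSketch`, `Token`, and `tokensByIndex` are hypothetical names, and the 9-block group (as in an RS-6-3 policy) and the particular missing indices are assumptions chosen for illustration.

```java
// Hypothetical sketch: none of these names come from Hadoop.
final class StripedTokenGapSketch {

    /** Stand-in for the token of one internal block of a striped block group. */
    static final class Token {
        final int blockIndex;

        Token(int blockIndex) {
            this.blockIndex = blockIndex;
        }
    }

    /**
     * Places one token in the slot of each reported block index.
     * Slots whose index was not reported stay null.
     */
    static Token[] tokensByIndex(int totalBlocks, byte[] reportedIndices) {
        Token[] slots = new Token[totalBlocks];
        for (byte idx : reportedIndices) {
            slots[idx] = new Token(idx);
        }
        return slots;
    }

    public static void main(String[] args) {
        // Assumption: 9 internal blocks, and after failover only these
        // indices are known on the server side (6 and 8 are missing).
        byte[] reported = {0, 1, 2, 3, 4, 5, 7};
        Token[] updated = tokensByIndex(9, reported);

        for (int i = 0; i < updated.length; i++) {
            try {
                // Dereferencing a null slot is what raises the NPE.
                System.out.println("streamer " + i + " got token for index "
                    + updated[i].blockIndex);
            } catch (NullPointerException e) {
                System.out.println("streamer " + i + " hit an NPE: no token returned");
            }
        }
    }
}
```

In this sketch the exception is caught only to show which slots are empty; in the scenario described in this PR the NPE is what makes the client write fail.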
As a result, the write on the client fails. To fix this bug, the NameNode
should return block tokens for all indices to the client, and the client can
pick whichever ones it needs.
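The idea behind the fix can be sketched as follows. This is an illustration only, not the change made in the pull request: `AllIndicesTokenSketch` and its methods are hypothetical names, and the group size of 9 is an assumption. It contrasts indices derived from known locations with indices covering the whole block group.

```java
// Hypothetical sketch: none of these names come from Hadoop.
final class AllIndicesTokenSketch {

    /** Builds the index list 0..totalBlocks-1, independent of known locations. */
    static byte[] allIndices(int totalBlocks) {
        byte[] indices = new byte[totalBlocks];
        for (int i = 0; i < totalBlocks; i++) {
            indices[i] = (byte) i;
        }
        return indices;
    }

    /** Marks which slots would receive a token for the given index list. */
    static boolean[] slotsWithToken(int totalBlocks, byte[] indices) {
        boolean[] covered = new boolean[totalBlocks];
        for (byte idx : indices) {
            covered[idx] = true;
        }
        return covered;
    }

    /** True only if every slot would receive a token. */
    static boolean allCovered(boolean[] covered) {
        for (boolean c : covered) {
            if (!c) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int total = 9; // assumed group size
        // Assumption: incomplete location info after failover.
        byte[] fromLocations = {0, 1, 2, 3, 4, 5, 7};

        System.out.println("indices from locations cover every slot: "
            + allCovered(slotsWithToken(total, fromLocations)));
        System.out.println("all indices cover every slot: "
            + allCovered(slotsWithToken(total, allIndices(total))));
    }
}
```

Because every slot is populated in the second case, a client that only writes to some of the internal blocks can ignore the tokens it does not need.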
> EC: Fix bug in updateBlockForPipeline after failover
> ----------------------------------------------------
>
> Key: HDFS-17154
> URL: https://issues.apache.org/jira/browse/HDFS-17154
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Shuyan Zhang
> Assignee: Shuyan Zhang
> Priority: Major
>
> In the method `updateBlockForPipeline`, NameNode uses the
> `BlockUnderConstructionFeature` of a BlockInfo to generate the member
> `blockIndices` of `LocatedStripedBlock`.
> The NameNode then uses `blockIndices` to generate block tokens for the client.
> However, after a failover, the location info in
> `BlockUnderConstructionFeature` may be incomplete, so the block tokens for
> the missing indices are absent.
> When the client receives this incomplete set of block tokens, it throws an NPE
> because `updatedBlks[i]` is null.
> The NameNode should return block tokens for all indices to the client, and
> the client can pick whichever ones it needs.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)