[
https://issues.apache.org/jira/browse/HDFS-16598?focusedWorklogId=779073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779073
]
ASF GitHub Bot logged work on HDFS-16598:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Jun/22 12:53
Start Date: 07/Jun/22 12:53
Worklog Time Spent: 10m
Work Description: Hexiaoqiao commented on PR #4366:
URL: https://github.com/apache/hadoop/pull/4366#issuecomment-1148627690
> getReplicaInfo(ExtendedBlock b) will check gs, and getReplicaInfo(String
bpid, long blkid) will not check the gs.
@ZanderXu Thanks for the great catch here.
> I would like to ask a question, after reading your discussion, is it
possible that block GS of client may be smaller than DN appears in all places
where getReplicaInfo(String bpid, long blkid) is called?
It is a good question.
IMO, it is not necessary to compare the GS in any case when acquiring the
fine-grained lock for BLOCK_POOl or VOLUME, because neither of them depends on
the block.
I suggest improving them together in one PR.
Thanks again.
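
To illustrate the difference between the two lookup styles discussed above, here is a minimal sketch. It is a simplified model, not the actual FsDatasetImpl code: the class ReplicaLookupSketch, the Replica type, and both method names are hypothetical. It only shows why a client holding an older GS misses on the GS-checked lookup while the lookup by block pool id and block id still finds the replica.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the two lookup styles.
// It does not reproduce the real DataNode data structures.
class ReplicaLookupSketch {

    // Hypothetical stand-in for a replica stored on the DataNode.
    static final class Replica {
        final long blockId;
        final long generationStamp;

        Replica(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }
    }

    // block pool id -> (block id -> replica)
    private final Map<String, Map<Long, Replica>> replicas = new HashMap<>();

    void add(String bpid, Replica replica) {
        replicas.computeIfAbsent(bpid, k -> new HashMap<>())
                .put(replica.blockId, replica);
    }

    // Lookup by block pool id and block id only: the GS is never compared.
    Replica lookupByBlockId(String bpid, long blockId) {
        Map<Long, Replica> pool = replicas.get(bpid);
        return pool == null ? null : pool.get(blockId);
    }

    // Lookup that also requires the caller's GS to match the stored GS.
    Replica lookupWithGsCheck(String bpid, long blockId, long callerGs) {
        Replica replica = lookupByBlockId(bpid, blockId);
        return (replica != null && replica.generationStamp == callerGs)
                ? replica : null;
    }

    public static void main(String[] args) {
        ReplicaLookupSketch dn = new ReplicaLookupSketch();
        // The DN holds the replica with GS 5 (assumed values for illustration).
        dn.add("BP-1", new Replica(1001L, 5L));

        // The client still holds GS 4, e.g. after a failed pipeline recovery.
        long clientGs = 4L;

        // prints false: the GS-checked lookup misses
        System.out.println(dn.lookupWithGsCheck("BP-1", 1001L, clientGs) != null);
        // prints true: the lookup without the GS check finds the replica
        System.out.println(dn.lookupByBlockId("BP-1", 1001L) != null);
    }
}
```

Under these assumptions, code that only needs the replica to decide which BLOCK_POOl or VOLUME lock to take can use the second style, since the lock target does not change with the GS.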
Issue Time Tracking
-------------------
Worklog Id: (was: 779073)
Time Spent: 2h (was: 1h 50m)
> All datanodes
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
> are bad. Aborting...
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...
> at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
> at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
> at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
> at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> After tracing the root cause, this bug was introduced by
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534], because the
> block GS of the client may be smaller than that of the DN when pipeline
> recovery fails.