[
https://issues.apache.org/jira/browse/HDFS-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928549#comment-16928549
]
Hudson commented on HDFS-14699:
-------------------------------
FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17285 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/17285/])
HDFS-14699. Erasure Coding: Storage not considered in live replica when
(surendralilhore: rev d1c303a49763029fffa5164295034af8e81e74a0)
* (edit)
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* (edit)
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
> Erasure Coding: Storage not considered in live replica when replication
> streams hard limit reached to threshold
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-14699
> URL: https://issues.apache.org/jira/browse/HDFS-14699
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ec
> Affects Versions: 3.2.0, 3.1.1, 3.3.0
> Reporter: Zhao Yi Ming
> Assignee: Zhao Yi Ming
> Priority: Critical
> Labels: patch
> Attachments: HDFS-14699.00.patch, HDFS-14699.01.patch,
> HDFS-14699.02.patch, HDFS-14699.03.patch, HDFS-14699.04.patch,
> HDFS-14699.05.patch, image-2019-08-20-19-58-51-872.png,
> image-2019-09-02-17-51-46-742.png
>
>
> We tried the EC feature on an 80-node cluster running Hadoop 3.1.1 and hit the
> same scenario described in https://issues.apache.org/jira/browse/HDFS-8881.
> Our test steps follow; we hope they are helpful. (The DNs mentioned below hold
> the internal blocks under test.)
> # We defined a custom 10-2-1024k policy and set it on a path. Now we have 12
> internal blocks (12 live blocks).
> # Decommission one DN. After the decommission completes, we have 13
> internal blocks (12 live blocks and 1 decommissioned block).
> # Shut down one DN that does not hold the same block ID as the
> decommissioned block. Now we have 12 internal blocks (11 live blocks and 1
> decommissioned block).
> # Wait about 600s (before the heartbeat comes), then recommission the
> decommissioned DN. Now we have 12 internal blocks (11 live blocks and 1
> duplicate block).
> # EC then does not reconstruct the missing block.
> We think this is a critical issue for using the EC feature in a production
> environment. Could you help? Thanks a lot!
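
The block counts in the quoted steps can be replayed with a small toy model.
This is only a sketch of the arithmetic in the report, not the HDFS
BlockManager logic: the names summarize, run_scenario and dn0..dn12 are
hypothetical, and the model assumes that decommissioning copies the affected
internal block to one extra DataNode.

```python
# Toy replay of the reported steps; NOT the HDFS BlockManager logic.
# All names here (summarize, run_scenario, dn0..dn12) are hypothetical.

DATA_UNITS, PARITY_UNITS = 10, 2          # taken from the 10-2-1024k policy name
TOTAL_UNITS = DATA_UNITS + PARITY_UNITS   # 12 internal blocks per block group


def summarize(replicas):
    """Count internal blocks by state for one block group.

    replicas maps a datanode name to (block_index, state), where state is
    "live" or "decommissioned".
    """
    live_indices = [idx for idx, state in replicas.values() if state == "live"]
    distinct_live = set(live_indices)
    return {
        "internal": len(replicas),
        "live": len(distinct_live),
        "decommissioned": sum(
            1 for _, state in replicas.values() if state == "decommissioned"
        ),
        "duplicate": len(live_indices) - len(distinct_live),
        "missing": sorted(set(range(TOTAL_UNITS)) - distinct_live),
    }


def run_scenario():
    """Return the summary after each of the first four reported steps."""
    summaries = []

    # Step 1: one live internal block per datanode, indices 0..11.
    replicas = {f"dn{i}": (i, "live") for i in range(TOTAL_UNITS)}
    summaries.append(summarize(replicas))

    # Step 2: decommission dn0; assume its block (index 0) is copied to dn12.
    replicas["dn0"] = (0, "decommissioned")
    replicas["dn12"] = (0, "live")
    summaries.append(summarize(replicas))

    # Step 3: shut down a datanode holding a different index (dn5, index 5).
    del replicas["dn5"]
    summaries.append(summarize(replicas))

    # Step 4: recommission dn0; index 0 now has two live copies.
    replicas["dn0"] = (0, "live")
    summaries.append(summarize(replicas))

    return summaries


if __name__ == "__main__":
    for step, summary in enumerate(run_scenario(), start=1):
        print(step, summary)
```

After step 4 the model holds 12 live replicas but only 11 distinct block
indices, so a check that compares the raw replica count against 12 would see
nothing to reconstruct. That is consistent with the reported symptom, though
the model says nothing about the actual cause in BlockManager.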
--
This message was sent by Atlassian Jira
(v8.3.2#803003)