[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934005#comment-16934005
 ] 

Fei Hui commented on HDFS-14849:


[~marvelrock] Thanks for your comments.
Now I am sure this is not the same issue as HDFS-14847.
Maybe it would be better to update the issue description of HDFS-14849.

> Erasure Coding: replicate block infinitely when datanode being decommissioning
> --
>
> Key: HDFS-14849
> URL: https://issues.apache.org/jira/browse/HDFS-14849
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: HuangTao
> Assignee: HuangTao
> Priority: Major
> Labels: EC, HDFS, NameNode
> Attachments: HDFS-14849.001.patch, HDFS-14849.002.patch, fsck-file.png, liveBlockIndices.png, scheduleReconstruction.png
>
>
> When a datanode stays in DECOMMISSION_INPROGRESS status, the EC blocks on 
> that datanode are replicated infinitely.
> // added 2019/09/19
> I reproduced this scenario in a 163-node cluster by decommissioning 100 nodes 
> simultaneously. 
>  !scheduleReconstruction.png! 
>  !fsck-file.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933911#comment-16933911
 ] 

HuangTao commented on HDFS-14849:
-

Yes [~ferhui], there is just more than one copy. Sometimes the number of copies 
of an internal block equals the number of racks.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933506#comment-16933506
 ] 

Fei Hui commented on HDFS-14849:


[~marvelrock] Thanks.
I dug into the code and found the workflow:
# RedundancyMonitor -> computeDatanodeWork -> computeBlockReconstructionWork -> 
get blocks from *neededReconstruction* -> 
computeReconstructionWorkForBlocks (puts the block into pendingReconstruction in 
validateReconstructionWork) -> scheduleReconstruction -> ErasureCodingWork
# RedundancyMonitor -> processPendingReconstructions -> get timedOutItems from 
pendingReconstruction -> add them to neededReconstruction
I think more blocks will be replicated without your fix. Once all the blocks 
on the decommissioning nodes have been replicated successfully, a block will not 
be added to neededReconstruction again, because the live replicas are recomputed 
in processPendingReconstructions. So a block is not replicated infinitely, 
is that right? It is just that some blocks end up with more than one copy?
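
The two paths above can be sketched as a toy model. This is only an illustration under stated assumptions, not the BlockManager code: the class ReconstructionLoopSketch, its fields, and the String block IDs are hypothetical stand-ins, and time is passed in explicitly. It shows a timed-out pending item returning to the needed queue only while the recomputed live count is still below the expected count.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Toy model of the two RedundancyMonitor paths; not the HDFS classes. */
final class ReconstructionLoopSketch {
  // Hypothetical stand-ins for neededReconstruction and pendingReconstruction.
  final Queue<String> neededReconstruction = new ArrayDeque<>();
  final Map<String, Long> pendingReconstruction = new HashMap<>();
  // Hypothetical view of the current live replica count per block.
  final Map<String, Integer> liveReplicas = new HashMap<>();
  final int expectedReplicas;
  final long timeoutMs;

  ReconstructionLoopSketch(int expectedReplicas, long timeoutMs) {
    this.expectedReplicas = expectedReplicas;
    this.timeoutMs = timeoutMs;
  }

  /** Path 1: drain the needed queue, schedule work, record it as pending. */
  List<String> computeBlockReconstructionWork(long now) {
    List<String> scheduled = new ArrayList<>();
    String block;
    while ((block = neededReconstruction.poll()) != null) {
      pendingReconstruction.put(block, now);
      scheduled.add(block);
    }
    return scheduled;
  }

  /** Path 2: re-queue timed-out items only if they are still short of replicas. */
  void processPendingReconstructions(long now) {
    Iterator<Map.Entry<String, Long>> it =
        pendingReconstruction.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (now - entry.getValue() < timeoutMs) {
        continue; // not timed out yet
      }
      it.remove();
      // Recompute the live count before deciding to re-queue.
      int live = liveReplicas.getOrDefault(entry.getKey(), 0);
      if (live < expectedReplicas) {
        neededReconstruction.add(entry.getKey());
      }
    }
  }
}
```

In this model the loop ends as soon as the recomputed live count reaches the expected count, so whether replication repeats depends entirely on how that count is computed.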




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933455#comment-16933455
 ] 

HuangTao commented on HDFS-14849:
-

[~ferhui] I am trying to find the root cause and will sync here.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933429#comment-16933429
 ] 

Fei Hui commented on HDFS-14849:


[~marvelrock] Thanks for your patch. 
Could you please give more detail on why the block is replicated infinitely?
I want to check whether it is the same scenario as HDFS-14847.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933412#comment-16933412
 ] 

Hadoop QA commented on HDFS-14849:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} | {color:red} HDFS-14849 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14849 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27910/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933399#comment-16933399
 ] 

Hadoop QA commented on HDFS-14849:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 44s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}195m 58s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:39e82acc485 |
| JIRA Issue | HDFS-14849 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12980709/HDFS-14849.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f069cb69415f 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28913f7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27909/testReport/ |
| Max. process+thread count | 2743 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27909/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933382#comment-16933382
 ] 

HuangTao commented on HDFS-14849:
-

 !liveBlockIndices.png! 




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933376#comment-16933376
 ] 

HuangTao commented on HDFS-14849:
-

I found a clue:

`chooseSourceDatanodes` gets
{quote}LIVE=2, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=22{quote}
All block indices (0-8) exist. Three internal blocks (3, 4 and 8) have no 
redundant copy. The datanode that stores block 8 is DECOMMISSIONING, and the 
adminState of the other two datanodes is null.

`countNodes(block)` gets
{quote}LIVE=8, READONLY=0, DECOMMISSIONING=7, DECOMMISSIONED=0, MAINTENANCE_NOT_FOR_READ=0, MAINTENANCE_FOR_READ=0, CORRUPT=0, EXCESS=0, STALESTORAGE=0, REDUNDANT=16{quote}

So we need to replicate block 8, but there are no racks left.
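
To make the reasoning about block 8 concrete, here is a minimal sketch that finds the internal block indices with no copy on a normal (non-decommissioning) datanode. It is an illustration only: LiveIndexSketch, AdminState, Copy and indicesWithoutLiveCopy are hypothetical names, not HDFS APIs, and NORMAL is an assumed stand-in for the null adminState mentioned above.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy helper; none of these types exist in HDFS. */
final class LiveIndexSketch {
  /** Hypothetical admin states, reduced to the two discussed here. */
  enum AdminState { NORMAL, DECOMMISSIONING }

  /** One stored copy of an internal block (hypothetical record type). */
  static final class Copy {
    final int blockIndex;
    final AdminState state;

    Copy(int blockIndex, AdminState state) {
      this.blockIndex = blockIndex;
      this.state = state;
    }
  }

  /** Returns the indices in [0, totalIndices) that have no copy on a NORMAL node. */
  static List<Integer> indicesWithoutLiveCopy(List<Copy> copies, int totalIndices) {
    BitSet live = new BitSet(totalIndices);
    for (Copy copy : copies) {
      if (copy.state == AdminState.NORMAL) {
        live.set(copy.blockIndex);
      }
    }
    List<Integer> missing = new ArrayList<>();
    for (int i = live.nextClearBit(0); i < totalIndices; i = live.nextClearBit(i + 1)) {
      missing.add(i);
    }
    return missing;
  }
}
```

With indices 0 to 7 each held by at least one normal node and index 8 held only by a decommissioning node, the helper returns just index 8, which matches the block that still needs a new copy.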





[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread HuangTao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933266#comment-16933266
 ] 

HuangTao commented on HDFS-14849:
-

[~ferhui] We have the same scenario, but our fixes can't pass each other's unit tests.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933237#comment-16933237
 ] 

Hadoop QA commented on HDFS-14849:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 10s{color} | {color:red} HDFS-14849 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14849 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27908/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-15 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930192#comment-16930192
 ] 

Fei Hui commented on HDFS-14849:


Is this issue the same as HDFS-14847? From the description of the issue I 
think it is. 
BTW, the unit test of HDFS-14847 still times out with this fix.




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-15 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929915#comment-16929915
 ] 

Hadoop QA commented on HDFS-14849:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 163 unchanged - 0 fixed = 166 total (was 163) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 48s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}140m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:39e82acc485 |
| JIRA Issue | HDFS-14849 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12980339/HDFS-14849.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e48ccfaa8f6c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e04b8a4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/27876/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27876/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27876/testReport/ |
| Max. process+thread count | 4414 (vs. ulimit of 5500) |
| modules | C: 

[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-15 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929912#comment-16929912
 ] 

Ayush Saxena commented on HDFS-14849:
-

Thanks [~marvelrock] for the patch, I understand the change. The decommissioning 
replica gets added first, and then when the live one comes it is marked as 
redundant. It seems fair enough to check the state first. Fix LGTM.
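
A minimal sketch of that ordering effect, assuming a simplified counter: the class StateFirstCountingSketch and both counting methods are hypothetical and are not taken from the patch. The first method lets whichever copy of a block index is seen first claim the index, and the second checks the replica state before deciding whether a copy is redundant.

```java
import java.util.BitSet;

/** Toy counters; not the NumberReplicas logic from HDFS. */
final class StateFirstCountingSketch {
  enum State { LIVE, DECOMMISSIONING }

  static final class Counts {
    int live;
    int decommissioning;
    int redundant;
  }

  /** Order-sensitive: any later copy of an already seen index is redundant. */
  static Counts countIndexFirst(int[] indices, State[] states) {
    Counts counts = new Counts();
    BitSet seen = new BitSet();
    for (int i = 0; i < indices.length; i++) {
      if (seen.get(indices[i])) {
        counts.redundant++;
        continue;
      }
      seen.set(indices[i]);
      if (states[i] == State.LIVE) {
        counts.live++;
      } else {
        counts.decommissioning++;
      }
    }
    return counts;
  }

  /** State-first: only a second LIVE copy of the same index is redundant. */
  static Counts countStateFirst(int[] indices, State[] states) {
    Counts counts = new Counts();
    BitSet liveSeen = new BitSet();
    for (int i = 0; i < indices.length; i++) {
      if (states[i] == State.DECOMMISSIONING) {
        counts.decommissioning++;
        continue;
      }
      if (liveSeen.get(indices[i])) {
        counts.redundant++;
      } else {
        liveSeen.set(indices[i]);
        counts.live++;
      }
    }
    return counts;
  }
}
```

For one index stored first on a decommissioning node and then on a live node, the index-first counter reports no live copy, so the block keeps looking under-replicated, while the state-first counter reports one live copy.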

For the test:

{code:java}
// ec policy
ECSchema rsSchema = new ECSchema("rs", 3, 2);
String policyName = "RS-3-2-128k";
int cellSize = 128 * 1024;
ErasureCodingPolicy ecPolicy =
    new ErasureCodingPolicy(policyName, rsSchema, cellSize, (byte) -1);
{code}

You may instead directly use:


{code:java}
// RS-3-2 EC policy
ErasureCodingPolicy ecPolicy =
    SystemErasureCodingPolicies.getPolicies().get(1);
{code}
 




[jira] [Commented] (HDFS-14849) Erasure Coding: replicate block infinitely when datanode being decommissioning

2019-09-14 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929889#comment-16929889
 ] 

Ayush Saxena commented on HDFS-14849:
-

Thanks [~marvelrock] for the report. Can you give a brief explanation of the 
fix and of why exactly this is happening?

