[ https://issues.apache.org/jira/browse/HDFS-16479?focusedWorklogId=752615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752615 ]
ASF GitHub Bot logged work on HDFS-16479:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Apr/22 03:04
            Start Date: 05/Apr/22 03:04
    Worklog Time Spent: 10m
      Work Description: tasanuma opened a new pull request, #4138:
URL: https://github.com/apache/hadoop/pull/4138

### Description of PR

The NameNode should not send reconstruction work when the source DataNodes are insufficient. Otherwise, the DataNodes receive the command and throw the following exception.

```
java.lang.IllegalArgumentException: No enough live striped blocks.
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.<init>(StripedReader.java:128)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReconstructor.<init>(StripedReconstructor.java:135)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.<init>(StripedBlockReconstructor.java:41)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:133)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:796)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:680)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1314)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1360)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1287)
```

### How was this patch tested?
Unit test.

### For code changes:

- [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 752615)
    Remaining Estimate: 0h
            Time Spent: 10m

> EC: NameNode should not send a reconstruction work when the source datanodes are insufficient
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16479
>                 URL: https://issues.apache.org/jira/browse/HDFS-16479
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec, erasure-coding
>            Reporter: Yuanbo Liu
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We got this exception from the DataNodes:
> java.lang.IllegalArgumentException: No enough live striped blocks.
>     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
>     at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.<init>(StripedReader.java:128)
>     at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReconstructor.<init>(StripedReconstructor.java:135)
>     at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.<init>(StripedBlockReconstructor.java:41)
>     at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:133)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:796)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:680)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1314)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1360)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1287)
>
> After going through the code of ErasureCodingWork.java, we found
> {code:java}
> targets[0].getDatanodeDescriptor().addBlockToBeErasureCoded(
>     new ExtendedBlock(blockPoolId, stripedBlk), getSrcNodes(), targets,
>     getLiveBlockIndicies(), stripedBlk.getErasureCodingPolicy());
> {code}
>
> liveBusyBlockIndicies is not counted among liveBlockIndicies, so erasure
> coding reconstruction sometimes fails with 'No enough live striped blocks'.
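
To illustrate the condition described in this issue, here is a minimal sketch of a NameNode-side guard that declines to schedule reconstruction when the sources cannot cover the data blocks. `ReconstructionGuard` and `hasEnoughSources` are hypothetical names, not Hadoop APIs, and this is not the code from the pull request. The sketch assumes the DataNode needs at least as many distinct live block indices as the erasure coding policy has data blocks (6 for RS-6-3), and that indices are non-negative.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of Hadoop: sketches the source-count check. */
final class ReconstructionGuard {
  private ReconstructionGuard() {}

  /**
   * Returns true when the block indices offered as reconstruction sources
   * cover at least dataBlockNum distinct internal blocks.
   * Assumption: every index is non-negative.
   */
  static boolean hasEnoughSources(int dataBlockNum, byte[] liveBlockIndices) {
    BitSet seen = new BitSet();
    for (byte index : liveBlockIndices) {
      seen.set(index); // duplicates collapse into a single bit
    }
    return seen.cardinality() >= dataBlockNum;
  }

  public static void main(String[] args) {
    // RS-6-3: 6 data blocks. Index 5 sits on a busy node in this scenario.
    byte[] liveOnly = {0, 1, 2, 3, 4};
    byte[] liveAndBusy = {0, 1, 2, 3, 4, 5};
    System.out.println(hasEnoughSources(6, liveOnly));    // false
    System.out.println(hasEnoughSources(6, liveAndBusy)); // true
  }
}
```

With only the five non-busy indices the check fails, which corresponds to the case where the DataNode would throw 'No enough live striped blocks'. Under this sketch the NameNode would skip the command in that case rather than send it.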