[ https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370995#comment-17370995 ]

Daniel Ma edited comment on HDFS-15796 at 6/29/21, 2:35 AM:
------------------------------------------------------------

[~weichiu]  No idea what kind of condition can reproduce this problem. It seems 
the targets object is modified elsewhere while 
computeReconstructionWorkForBlocks is in progress.
{code:java}
// Step 2: choose target nodes for each reconstruction task
for (BlockReconstructionWork rw : reconWork) {
  // Exclude all of the containing nodes from being targets.
  // This list includes decommissioning or corrupt nodes.
  final Set<Node> excludedNodes = new HashSet<>(rw.getContainingNodes());
  List<DatanodeStorageInfo> targets = pendingReconstruction
      .getTargets(rw.getBlock());
  if (targets != null) {
    for (DatanodeStorageInfo dn : targets) {
      if (!excludedNodes.contains(dn.getDatanodeDescriptor())) {
        excludedNodes.add(dn.getDatanodeDescriptor());
      }
    }
  }

  // choose replication targets: NOT HOLDING THE GLOBAL LOCK
  final BlockPlacementPolicy placementPolicy =
      placementPolicies.getPolicy(rw.getBlock().getBlockType());
  rw.chooseTargets(placementPolicy, storagePolicySuite, excludedNodes);
}

{code}
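For reference, ArrayList's iterator is fail-fast, so any mutation of the targets list while this loop iterates over it would produce exactly the stack trace below. A minimal single-threaded sketch of that failure mode (class name and list contents are made up for illustration, not taken from the HDFS code):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {

    // Returns true if mutating the list mid-iteration raises CME,
    // mimicking the suspected race on the targets list.
    static boolean triggersCme() {
        List<String> targets = new ArrayList<>();
        targets.add("dn1");
        targets.add("dn2");
        try {
            for (String dn : targets) {
                // Stand-in for another thread mutating targets while
                // computeReconstructionWorkForBlocks iterates over it.
                targets.add("dn-extra");
            }
        } catch (ConcurrentModificationException e) {
            // ArrayList's fail-fast iterator detected the modCount change.
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggersCme());
    }
}
```

If the targets list really is shared across threads without the global lock held, guarding the iteration (or copying the list before iterating) would avoid the fail-fast check firing.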
 



> ConcurrentModificationException error happens on NameNode occasionally
> ----------------------------------------------------------------------
>
>                 Key: HDFS-15796
>                 URL: https://issues.apache.org/jira/browse/HDFS-15796
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.1
>            Reporter: Daniel Ma
>            Priority: Critical
>
> ConcurrentModificationException error happens on NameNode occasionally.
>  
> {code:java}
> 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor 
> thread received Runtime exception.  | BlockManager.java:4746
> java.util.ConcurrentModificationException
>       at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>       at java.util.ArrayList$Itr.next(ArrayList.java:859)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
