[
https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375421#comment-17375421
]
Daniel Ma edited comment on HDFS-15796 at 7/6/21, 9:57 AM:
-----------------------------------------------------------
[~sodonnell]
Thanks for reviewing. Actually, you missed the for loop here:
{code:java}
// Exclude DataNodes that are already targets of a pending reconstruction
// of this block.
synchronized (pendingReconstruction) {
  List<DatanodeStorageInfo> targets = pendingReconstruction
      .getTargets(rw.getBlock());
  if (targets != null) {
    for (DatanodeStorageInfo dn : targets) {
      if (!excludedNodes.contains(dn.getDatanodeDescriptor())) {
        excludedNodes.add(dn.getDatanodeDescriptor());
      }
    }
  }
}
{code}
The problem happens when the code above iterates over the DataNodes stored in
the pendingReconstruction object while the same DataNode list is being modified
elsewhere.
In other words, if you modify a List (add or delete an element) while iterating
over it, a ConcurrentModificationException will be thrown.
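To illustrate the fail-fast behaviour described above, here is a minimal, self-contained sketch. It does not use any Hadoop classes: the class name CmeSketch, both method names, and the String stand-ins for DataNodes are hypothetical and chosen only for illustration. It reproduces the exception in a single thread and shows one common way to avoid it (iterating over a copy), which is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; Strings stand in for DatanodeStorageInfo objects.
class CmeSketch {

  // Structurally modifies the list while a for-each loop is iterating over it.
  // Returns true if the iterator's fail-fast check threw.
  static boolean iterateWhileModifying(List<String> targets) {
    try {
      for (String dn : targets) {
        // Stand-in for some other code path changing the same list.
        targets.add("dn-new");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Iterates over a private copy, so changes to the original list
  // cannot invalidate the iterator. Returns the number of elements visited.
  static int iterateOverSnapshot(List<String> targets) {
    int visited = 0;
    for (String dn : new ArrayList<>(targets)) {
      targets.remove(dn);
      visited++;
    }
    return visited;
  }

  public static void main(String[] args) {
    List<String> targets =
        new ArrayList<>(Arrays.asList("dn1", "dn2", "dn3"));
    System.out.println(iterateWhileModifying(targets));
    System.out.println(iterateOverSnapshot(targets));
  }
}
```

With more than one thread, the copy itself is only safe if it is made while holding the same lock that guards every modification of the list. Otherwise the copy constructor can read the list while it is being changed.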
> ConcurrentModificationException error happens on NameNode occasionally
> ----------------------------------------------------------------------
>
> Key: HDFS-15796
> URL: https://issues.apache.org/jira/browse/HDFS-15796
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.1.1
> Reporter: Daniel Ma
> Priority: Critical
> Attachments: 0001-HDFS-15796.patch
>
>
> ConcurrentModificationException error happens on NameNode occasionally.
>
> {code:java}
> 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor thread received Runtime exception. | BlockManager.java:4746
> java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>     at java.util.ArrayList$Itr.next(ArrayList.java:859)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)