[
https://issues.apache.org/jira/browse/HBASE-27387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711818#comment-17711818
]
guoxiaojiao commented on HBASE-27387:
-------------------------------------
This issue may occur when a RegionServer starts or when enable_peer is run.
Multiple ReplicationSourceShipper threads (hbase.wal.provider=multiwal,
hbase.wal.regiongrouping.numgroups=3) start replicating data to the peer
cluster, and the per-table replication metrics (Map<String,
MetricsReplicationTableSource> singleSourceSourceByTable) must be updated after
each entry batch is replicated. singleSourceSourceByTable is a HashMap, which
is not thread-safe, so concurrent access causes a
ConcurrentModificationException.
In extreme cases, a WAL has already been replicated when the
ConcurrentModificationException occurs, so the shipper retries, but the WAL
information in ZooKeeper cannot be updated again, and we may encounter a
NoNode exception.
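
To make the failure mode concrete, here is a minimal, self-contained sketch. It
is not HBase code: the class TableMetricsSketch, its methods, and the table
names are hypothetical stand-ins, and a single thread writes during its own
iteration so that the result is deterministic. Swapping in ConcurrentHashMap is
shown as one possible remedy, not necessarily the patch applied for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the per-table metrics map; not the HBase classes.
final class TableMetricsSketch {

    // Reads the largest timestamp while a write lands mid-iteration,
    // imitating a shipper registering a new table during a load report.
    static long maxWhileWriting(Map<String, Long> lastShippedByTable) {
        long max = 0L;
        for (long ts : lastShippedByTable.values()) {
            max = Math.max(max, ts);
            // New key on the first pass (structural change), overwrite afterwards.
            lastShippedByTable.put("late_table", 0L);
        }
        return max;
    }

    // Fills the given map with two made-up tables and returns it.
    static Map<String, Long> seed(Map<String, Long> m) {
        m.put("table_a", 100L);
        m.put("table_b", 200L);
        return m;
    }

    // True if the scan fails with ConcurrentModificationException.
    static boolean failsWithCme(Map<String, Long> m) {
        try {
            maxWhileWriting(seed(m));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: " + failsWithCme(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails: "
            + failsWithCme(new ConcurrentHashMap<>()));
    }
}
```

HashMap iterators are fail-fast, so the structural change is detected on the
next call to next(). ConcurrentHashMap iterators are weakly consistent and
never throw ConcurrentModificationException, at the cost of possibly not
reflecting writes made after the iterator was created.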
> MetricsSource lastShippedTimeStamps ConcurrentModificationException cause
> RegionServer crash
> ---------------------------------------------------------------------------------------------
>
> Key: HBASE-27387
> URL: https://issues.apache.org/jira/browse/HBASE-27387
> Project: HBase
> Issue Type: Bug
> Reporter: zhengsicheng
> Priority: Minor
>
> 2022-09-20 14:14:40,332 ERROR [regionserver/hostname1:16020]
> regionserver.HRegionServer: ***** ABORTING region server
> hostname1,16020,1663147531495: Unhandled: null *****
> java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
>     at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
>     at org.apache.hadoop.hbase.replication.regionserver.MetricsSource.getTimestampOfLastShippedOp(MetricsSource.java:321)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationLoad.buildReplicationLoad(ReplicationLoad.java:80)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.buildReplicationLoad(Replication.java:264)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.refreshAndGetReplicationLoad(Replication.java:253)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:1436)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1243)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1065)