[ 
https://issues.apache.org/jira/browse/HBASE-27387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711818#comment-17711818
 ] 

guoxiaojiao commented on HBASE-27387:
-------------------------------------

This issue may occur when a RegionServer starts or when enable_peer is run. 
Multiple ReplicationSourceShipper threads (hbase.wal.provider=multiwal, 
hbase.wal.regiongrouping.numgroups=3) start replicating data to the peer 
cluster, and the per-table replication metrics (Map<String, 
MetricsReplicationTableSource> singleSourceSourceByTable) need to be updated 
after each entry batch is replicated. singleSourceSourceByTable is a HashMap, 
which is not thread-safe, so this causes the ConcurrentModificationException. 
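
The failure mode described above can be sketched in isolation. The example 
below is not HBase code: the class name, method names, keys, and timestamps 
are hypothetical, and the mid-iteration put is done on the same thread so the 
result is deterministic. It stands in for a shipper thread updating the map 
while the reporting thread iterates over its values. It also shows that a 
ConcurrentHashMap, one possible replacement, tolerates the same access pattern.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-alone sketch; none of these names come from HBase.
public class LastShippedDemo {

    // Scans all timestamps for the maximum, as a metrics reader might.
    // One new key is inserted during the scan to simulate a concurrent writer.
    static long maxWhileUpdating(Map<String, Long> lastShipped) {
        long max = 0L;
        boolean inserted = false;
        for (long ts : lastShipped.values()) {
            max = Math.max(max, ts);
            if (!inserted) {
                // Structural modification while an iterator is open.
                lastShipped.put("walGroup-new", 300L);
                inserted = true;
            }
        }
        return max;
    }

    // Returns true if scanning the given map fails with
    // ConcurrentModificationException.
    static boolean failsWithCme(Map<String, Long> map) {
        map.put("walGroup-0", 100L);
        map.put("walGroup-1", 200L);
        try {
            maxWhileUpdating(map);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // HashMap iterators are fail-fast and detect the modification.
        System.out.println("HashMap fails: "
            + failsWithCme(new HashMap<>()));
        // ConcurrentHashMap iterators are weakly consistent and do not throw.
        System.out.println("ConcurrentHashMap fails: "
            + failsWithCme(new ConcurrentHashMap<>()));
    }
}
```

With real threads the HashMap failure is intermittent rather than guaranteed, 
which matches the report that the crash only shows up occasionally.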

In extreme cases, a WAL has already been replicated when the 
ConcurrentModificationException is hit, so the shipper retries, but the WAL 
information in ZooKeeper cannot be updated a second time, and we may then 
encounter a NoNode exception.

>  MetricsSource lastShippedTimeStamps ConcurrentModificationException cause 
> RegionServer crash
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27387
>                 URL: https://issues.apache.org/jira/browse/HBASE-27387
>             Project: HBase
>          Issue Type: Bug
>            Reporter: zhengsicheng
>            Priority: Minor
>
> 022-09-20 14:14:40,332 ERROR [regionserver/hostname1:16020] 
> regionserver.HRegionServer: ***** ABORTING region server 
> hostname1,16020,1663147531495: Unhandled: null *****
> java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
>     at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
>     at org.apache.hadoop.hbase.replication.regionserver.MetricsSource.getTimestampOfLastShippedOp(MetricsSource.java:321)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationLoad.buildReplicationLoad(ReplicationLoad.java:80)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.buildReplicationLoad(Replication.java:264)
>     at org.apache.hadoop.hbase.replication.regionserver.Replication.refreshAndGetReplicationLoad(Replication.java:253)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:1436)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1243)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1065)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
