[ 
https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340510#comment-17340510
 ] 

Duo Zhang commented on HBASE-25774:
-----------------------------------

So it does not only effect peer modification related procedures. For all the 
procedures which needs to refresh state on region server, we need to get all 
the region servers which have called regionServerStartup.

And since it has not been released in 2.4.3 yet, I plan to change the priority 
to blocker and set fix versions to 2.4.3 and 2.3.6.

Shout if you have other opinions [~apurtell] [~ndimiduk].

Thanks.

> ServerManager.getOnlineServer may miss some region servers when refreshing 
> state in some procedure implementations
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25774
>                 URL: https://issues.apache.org/jira/browse/HBASE-25774
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Xiaolin Ha
>            Assignee: Duo Zhang
>            Priority: Critical
>
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3025/9/testReport/org.apache.hadoop.hbase.replication/TestSyncReplicationStandbyKillRS/precommit_checks___yetus_jdk8_Hadoop3_checks______/]
> {code:java}
> ...[truncated 391170 chars]...
> 76d634:45149.replicationSource,1] regionserver.HRegionServer(2351): STOPPED: 
> Unexpected exception in RS:2;ece3af76d634:45149.replicationSource,1
> 2021-04-11T11:14:40,268 INFO  [RS:2;ece3af76d634:45149] 
> regionserver.HeapMemoryManager(218): Stopping
> 2021-04-11T11:14:40,268 INFO  [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher$FlushHandler(384): MemStoreFlusher.0 exiting
> 2021-04-11T11:14:40,268 INFO  [RS:2;ece3af76d634:45149] 
> flush.RegionServerFlushTableProcedureManager(118): Stopping region server 
> flush procedure manager abruptly.
> 2021-04-11T11:14:40,270 INFO  [RS:2;ece3af76d634:45149] 
> snapshot.RegionServerSnapshotManager(136): Stopping 
> RegionServerSnapshotManager abruptly.
> 2021-04-11T11:14:40,270 INFO  [RS:2;ece3af76d634:45149] 
> regionserver.HRegionServer(1146): aborting server 
> ece3af76d634,45149,1618139661734
> 2021-04-11T11:14:40,272 ERROR 
> [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] 
> regionserver.ReplicationSource(428): Unexpected exception in 
> ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245 
> currentPath=null
> java.lang.IllegalStateException: Source should be active.
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:547)
>  ~[classes/:?]
>       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 2021-04-11T11:14:40,272 DEBUG 
> [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] 
> regionserver.HRegionServer(2576): Abort already in progress. Ignoring the 
> current request with reason: Unexpected exception in 
> ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245
> {code}
> Maybe it should use HBASE-24877 to avoid failure of the initialize of 
> ReplicationSource.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to