[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342127#comment-17342127 ]

Michael Stack commented on HBASE-25774:
---

Thanks for figuring the race [~zhangduo] (My bad too for not seeing it on review...)

> ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
> --
>
> Key: HBASE-25774
> URL: https://issues.apache.org/jira/browse/HBASE-25774
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Xiaolin Ha
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.3, 2.3.5.1
>
>
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3025/9/testReport/org.apache.hadoop.hbase.replication/TestSyncReplicationStandbyKillRS/precommit_checks___yetus_jdk8_Hadoop3_checks__/]
> {code:java}
> ...[truncated 391170 chars]...
> 76d634:45149.replicationSource,1] regionserver.HRegionServer(2351): STOPPED: Unexpected exception in RS:2;ece3af76d634:45149.replicationSource,1
> 2021-04-11T11:14:40,268 INFO [RS:2;ece3af76d634:45149] regionserver.HeapMemoryManager(218): Stopping
> 2021-04-11T11:14:40,268 INFO [MemStoreFlusher.0] regionserver.MemStoreFlusher$FlushHandler(384): MemStoreFlusher.0 exiting
> 2021-04-11T11:14:40,268 INFO [RS:2;ece3af76d634:45149] flush.RegionServerFlushTableProcedureManager(118): Stopping region server flush procedure manager abruptly.
> 2021-04-11T11:14:40,270 INFO [RS:2;ece3af76d634:45149] snapshot.RegionServerSnapshotManager(136): Stopping RegionServerSnapshotManager abruptly.
> 2021-04-11T11:14:40,270 INFO [RS:2;ece3af76d634:45149] regionserver.HRegionServer(1146): aborting server ece3af76d634,45149,1618139661734
> 2021-04-11T11:14:40,272 ERROR [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] regionserver.ReplicationSource(428): Unexpected exception in ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245 currentPath=null
> java.lang.IllegalStateException: Source should be active.
> at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:547) ~[classes/:?]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 2021-04-11T11:14:40,272 DEBUG [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] regionserver.HRegionServer(2576): Abort already in progress. Ignoring the current request with reason: Unexpected exception in ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245
> {code}
> Maybe it should use HBASE-24877 to avoid failure of the initialize of ReplicationSource.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342077#comment-17342077 ]

Nick Dimiduk commented on HBASE-25774:
--

Looks like the discussion of how to handle 2.3.5 was split between here and HBASE-25032. I read to the end of the comments on that jira before I read this one, so I left my opinion over there.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341509#comment-17341509 ]

Hudson commented on HBASE-25774:

Results for branch master [build #287 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/287/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/287/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/287/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/287/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341450#comment-17341450 ]

Duo Zhang commented on HBASE-25774:
---

I think the main point of HBASE-25032 is about region assignment. The entry point for the final candidates is ServerManager.createDestinationServersList, so I think we could just filter out the region servers which haven't done their first regionServerReport yet.

On adding states: it may confuse developers, as we already have a ServerState enum in the AssignmentManager-related code. Maybe just add a boolean flag in ServerMetrics, something like

boolean isInitialized();

In regionServerStartup we set this flag to false, and in regionServerReport we set it to true.

Thanks.
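The flag-based idea in the comment above can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: `OnlineServerTracker` is a hypothetical stand-in, not the real `ServerManager` or `ServerMetrics`, servers are plain strings, and only the method names `regionServerStartup`, `regionServerReport` and `createDestinationServersList` are borrowed from the discussion. It is not the patch that was committed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the master-side bookkeeping; NOT the real ServerManager. */
class OnlineServerTracker {
  // Server name -> the proposed "isInitialized" flag (assumption: one flag per server).
  private final Map<String, Boolean> initialized = new ConcurrentHashMap<>();

  /** On startup the server becomes known, but it has not reported yet. */
  void regionServerStartup(String serverName) {
    initialized.put(serverName, Boolean.FALSE);
  }

  /** The first (and every later) report marks the server as initialized. */
  void regionServerReport(String serverName) {
    initialized.put(serverName, Boolean.TRUE);
  }

  /** Every known server, for procedures that must reach all region servers. */
  List<String> getOnlineServers() {
    List<String> all = new ArrayList<>(initialized.keySet());
    Collections.sort(all); // sorted only to make the result deterministic
    return all;
  }

  /** Assignment candidates: skip servers that have not sent a report yet. */
  List<String> createDestinationServersList() {
    List<String> candidates = new ArrayList<>();
    for (Map.Entry<String, Boolean> entry : initialized.entrySet()) {
      if (entry.getValue()) {
        candidates.add(entry.getKey());
      }
    }
    Collections.sort(candidates);
    return candidates;
  }
}
```

In this sketch a server that has started but not yet reported still shows up in `getOnlineServers()`, so a refresh-style procedure would not miss it, while `createDestinationServersList()` leaves it out until its first report arrives.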
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341447#comment-17341447 ]

Bharath Vissapragada commented on HBASE-25774:
--

Nice find. I couldn't think of this race when reviewing HBASE-25032, my bad. Agree that reverting it is the best short-term solution until we fix it cleanly.

Coming to the fix, it seems like the issue here is the definition of what "online" means. I think we should split it into two states, something like INITIALIZED and REGISTERED. The first state means that the RS has initialized (set during regionServerStartup()) but is waiting to be marked ready by the master, and the second one means that it is ready to receive requests (set in the first report). Certain procedures (like refresh peer, etc.) are interested in all the servers, while code paths like AM are interested in the REGISTERED ones. We should audit the code for usages of ServerManager carefully to make sure all code paths are addressed, WDYT?

FYI [~caroliney14]
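The two-state split suggested in the comment above could look roughly like the following sketch. All names are hypothetical: `TwoStateServerRegistry` and the `RegistrationState` enum are invented for illustration (the enum is deliberately not called `ServerState`, since an enum of that name is said to exist already in the AssignmentManager-related code), servers are plain strings, and none of this is taken from the HBase code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry illustrating the INITIALIZED / REGISTERED split. */
class TwoStateServerRegistry {
  /** Assumed state names, taken from the proposal in the comment. */
  enum RegistrationState { INITIALIZED, REGISTERED }

  private final Map<String, RegistrationState> servers = new ConcurrentHashMap<>();

  /** Set during startup: the server exists but the master has not marked it ready. */
  void regionServerStartup(String serverName) {
    servers.put(serverName, RegistrationState.INITIALIZED);
  }

  /** Set on the first report: the server is ready to receive requests. */
  void regionServerReport(String serverName) {
    servers.put(serverName, RegistrationState.REGISTERED);
  }

  /** For procedures such as refresh peer, which care about every server. */
  List<String> getAllServers() {
    List<String> all = new ArrayList<>(servers.keySet());
    Collections.sort(all); // sorted only to make the result deterministic
    return all;
  }

  /** For assignment-style code paths, which only want REGISTERED servers. */
  List<String> getRegisteredServers() {
    List<String> registered = new ArrayList<>();
    for (Map.Entry<String, RegistrationState> entry : servers.entrySet()) {
      if (entry.getValue() == RegistrationState.REGISTERED) {
        registered.add(entry.getKey());
      }
    }
    Collections.sort(registered);
    return registered;
  }
}
```

The design point this sketch tries to show is that callers choose the view they need explicitly, which is why the comment asks for an audit of existing ServerManager usages: each call site would have to pick one of the two accessors.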
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341440#comment-17341440 ]

Hudson commented on HBASE-25774:

Results for branch branch-2.4 [build #112 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/112/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/112/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/112/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/112/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/112/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341394#comment-17341394 ]

Andrew Kyle Purtell commented on HBASE-25774:
-

Thanks for the addendum [~zhangduo], I noticed the issue this morning
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341384#comment-17341384 ]

Hudson commented on HBASE-25774:

Results for branch branch-2 [build #245 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/245/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/245/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/245/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/245/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/245/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341360#comment-17341360 ]

Hudson commented on HBASE-25774:

Results for branch branch-2.3 [build #216 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/216/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/216/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/216/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/216/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/216/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341351#comment-17341351 ]

Hudson commented on HBASE-25774:

Results for branch branch-1 [build #124 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/124/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/124//General_Nightly_Build_Report/]
(/) {color:green}+1 jdk7 checks{color} -- For more information [see jdk7 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/124//JDK7_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/124//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341176#comment-17341176 ] Hudson commented on HBASE-25774:
Results for branch master [build #286 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/286/]: (x) *{color:red}-1 overall{color}*
details (if available):
(x) {color:red}-1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/286/General_20Nightly_20Build_20Report/]
(x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/286/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/286/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(x) {color:red}-1 source release artifact{color} -- See build output for details.
(x) {color:red}-1 client integration test{color} -- Something went wrong with this stage, [check relevant console output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/286//console].
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341095#comment-17341095 ] Andrew Kyle Purtell commented on HBASE-25774:
Yes, I marked the revert with HBASE-25774. I think we need to set the fix versions for it. I also updated fix versions on HBASE-25032. We are pulling the 2.3.5 release from the distribution mirrors once 2.3.5.1 is out, and 2.3.5.1 will have a correct change log, so I think we will be good. Please let me know if you'd like to see something done differently.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341093#comment-17341093 ] Duo Zhang commented on HBASE-25774:
Oh, good, just noticed that you committed the revert patch as HBASE-25774.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341091#comment-17341091 ] Duo Zhang commented on HBASE-25774:
We didn't commit anything to any branch other than master, so should we resolve this issue as fixed and set fix versions for all active branches? Not sure, just asking...
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341086#comment-17341086 ] Andrew Kyle Purtell commented on HBASE-25774:
I am reverting HBASE-25032 from master, branch-2, branch-2.3, and branch-2.4 now and will make the 2.3.5.1 and 2.4.3 releases. Voting starts Monday.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340551#comment-17340551 ] Duo Zhang commented on HBASE-25774:
{quote} We can revert it in 2.3.5 and make a 2.3.5.1 with just that one change, and revert it everywhere else, and not hold up 2.4.3 further. If that is acceptable I will do 2.3.5.1 and 2.4.3 at the same time. {quote}
I'm OK with this approach. [~ndimiduk] WDYT? Thanks.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340539#comment-17340539 ] Andrew Kyle Purtell commented on HBASE-25774:
We can revert it in 2.3.5 and make a 2.3.5.1 with just that one change, and revert it everywhere else, and not hold up 2.4.3 further. If that is acceptable I will do 2.3.5.1 and 2.4.3 at the same time.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340536#comment-17340536 ] Duo Zhang commented on HBASE-25774:
It has already been released in 2.3.5, so I do not think we could just revert it from the whole code base. What I mean is to provide a new patch here, which reverts the modification in HBASE-25032 and uses another approach to achieve the same goal.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340534#comment-17340534 ] Andrew Kyle Purtell commented on HBASE-25774:
Better to be sure. Let’s revert and reopen that issue.
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340521#comment-17340521 ] Duo Zhang commented on HBASE-25774:
I skimmed the code and do not think it is easy to fix, as we call isServerOnline in many places. In RSProcedureDispatcher especially, we give up if isServerOnline returns false, which means we assume that we will only send procedures to online servers. So now I prefer that we just revert HBASE-25032 and find another way to avoid assigning regions to region servers which are not fully initialized yet. Thanks.
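The concern in the comment above (a state-refresh procedure that only targets "online" servers can skip a region server that has called regionServerStartup but is not yet fully initialized) can be sketched with a toy model. This is an illustration only, not HBase code: `ToyServerManager`, `reportInitialized`, `getOnlineServersNarrow`, `getOnlineServersWide`, and `missedBy` are hypothetical names invented here, and the real ServerManager and RSProcedureDispatcher logic is considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these types are HBase classes. */
public class OnlineServerRaceSketch {

  /** Hypothetical stand-in for a master-side server tracker. */
  static class ToyServerManager {
    // Servers that have called regionServerStartup (registered with the master).
    private final Set<String> started = new LinkedHashSet<>();
    // Subset of 'started' that has also finished initialization.
    private final Set<String> initialized = new LinkedHashSet<>();

    void regionServerStartup(String server) {
      started.add(server);
    }

    void reportInitialized(String server) {
      if (started.contains(server)) {
        initialized.add(server);
      }
    }

    /** Narrow view (assumed): only fully initialized servers count as online. */
    List<String> getOnlineServersNarrow() {
      return new ArrayList<>(initialized);
    }

    /** Wide view (assumed): every server that has called regionServerStartup. */
    List<String> getOnlineServersWide() {
      return new ArrayList<>(started);
    }
  }

  /** Servers a refresh-state procedure fanning out over 'targets' would skip. */
  static List<String> missedBy(List<String> targets, List<String> allStarted) {
    List<String> missed = new ArrayList<>(allStarted);
    missed.removeAll(targets);
    return missed;
  }

  public static void main(String[] args) {
    ToyServerManager sm = new ToyServerManager();
    sm.regionServerStartup("rs1");
    sm.reportInitialized("rs1");
    sm.regionServerStartup("rs2"); // started, but still initializing

    // A procedure that refreshes state using the narrow view never reaches rs2.
    System.out.println("missed (narrow): "
        + missedBy(sm.getOnlineServersNarrow(), sm.getOnlineServersWide()));
    // Using the wide view, no started server is skipped.
    System.out.println("missed (wide): "
        + missedBy(sm.getOnlineServersWide(), sm.getOnlineServersWide()));
  }
}
```

In this model the narrow view is what keeps regions off half-started servers, and the same narrowing is what makes the refresh miss `rs2`. That tension is why the thread leans toward reverting HBASE-25032 and solving the assignment problem some other way.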
[jira] [Commented] (HBASE-25774) ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340510#comment-17340510 ]
Duo Zhang commented on HBASE-25774:
---
So it does not only affect peer modification related procedures. For all the procedures which need to refresh state on region servers, we need to get all the region servers which have called regionServerStartup. And since it has not been released in 2.4.3 yet, I plan to change the priority to blocker and set the fix versions to 2.4.3 and 2.3.6. Shout if you have other opinions [~apurtell] [~ndimiduk]. Thanks.
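The gap described in this comment can be sketched with two sets: servers that have called regionServerStartup, and the subset the master already counts as online. This is a simplified illustration under stated assumptions: `RefreshTargetsSketch`, `markFullyInitialized`, and `missedByOnlineList` are hypothetical names, and the real master bookkeeping is more involved.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: hypothetical names, not HBase's real ServerManager.
final class RefreshTargetsSketch {
  // Servers that have called regionServerStartup.
  private final Set<String> started = new LinkedHashSet<>();
  // Servers that are also considered fully initialized, hence "online".
  private final Set<String> online = new LinkedHashSet<>();

  void regionServerStartup(String server) {
    started.add(server);
  }

  void markFullyInitialized(String server) {
    // Only a server that has started can become online in this model.
    if (started.contains(server)) {
      online.add(server);
    }
  }

  Set<String> onlineServers() {
    return new LinkedHashSet<>(online);
  }

  Set<String> startedServers() {
    return new LinkedHashSet<>(started);
  }

  // Servers a refresh would skip if it iterated only over the online list.
  Set<String> missedByOnlineList() {
    Set<String> missed = startedServers();
    missed.removeAll(online);
    return missed;
  }

  public static void main(String[] args) {
    RefreshTargetsSketch master = new RefreshTargetsSketch();
    master.regionServerStartup("rs1");
    master.regionServerStartup("rs2");
    master.markFullyInitialized("rs1");
    // rs2 has started but is not online yet, so an online-only refresh skips it.
    System.out.println("missed: " + master.missedByOnlineList());
  }
}
```

A procedure that refreshes state would need to iterate over the started set rather than the online set to cover every server in this model.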