[
https://issues.apache.org/jira/browse/HBASE-27761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705286#comment-17705286
]
Duo Zhang commented on HBASE-27761:
-----------------------------------
In HRegionServer.stopServiceThreads
{code}
if (sameReplicationSourceAndSink && this.replicationSourceHandler != null) {
  this.replicationSourceHandler.stopReplicationService();
} else {
  if (this.replicationSourceHandler != null) {
    this.replicationSourceHandler.stopReplicationService();
  }
  if (this.replicationSinkHandler != null) {
    this.replicationSinkHandler.stopReplicationService();
  }
}
{code}
In Replication.stopReplicationService
{code}
this.replicationManager.join();
{code}
In ReplicationSourceManager.join
{code}
this.executor.shutdown();
for (ReplicationSourceInterface source : this.sources.values()) {
  source.terminate("Region server is closing");
}
synchronized (oldsources) {
  for (ReplicationSourceInterface source : this.oldsources) {
    source.terminate("Region server is closing");
  }
}
{code}
So we will try to close all the replication sources when shutting down a region
server.
Could you please check the related logs in these methods to see if they are
properly called when shutting down?
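Alongside the logs, a small helper that lists live threads by name can confirm whether any source threads survive the shutdown. The following is only a minimal sketch using the plain JDK API; the class name ThreadLeakCheck and the thread-name fragment "replicationSource" are assumptions made for illustration, so the fragment should be replaced with whatever names actually appear in the thread dumps.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ThreadLeakCheck {
    /** Returns the sorted names of all live threads whose name contains the given fragment. */
    static List<String> liveThreadNames(String nameFragment) {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(Thread::isAlive)
            .map(Thread::getName)
            .filter(name -> name.contains(nameFragment))
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical thread standing in for a replication source thread.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread exit.
            }
        }, "demo-replicationSource-worker");
        worker.setDaemon(true);
        worker.start();

        // The stand-in thread is listed while it is running.
        System.out.println("before: " + liveThreadNames("replicationSource"));

        // Stand-in for the region server shutdown terminating its sources.
        worker.interrupt();
        worker.join();

        // Once the thread has exited, it no longer appears in the list.
        System.out.println("after: " + liveThreadNames("replicationSource"));
    }
}
```

In a minicluster test, the same helper could be called before and after killing the region server and the two lists compared; any name still present after the kill would point at a source that was not terminated.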
Thanks.
> Replication threads not attached to their parent process
> --------------------------------------------------------
>
> Key: HBASE-27761
> URL: https://issues.apache.org/jira/browse/HBASE-27761
> Project: HBase
> Issue Type: Task
> Components: read replicas, regionserver, Replication
> Affects Versions: 2.5.4
> Reporter: Nick Dimiduk
> Priority: Major
>
> While debugging HBASE-27707 in a unit test, I see behaviour that I cannot
> explain. My test uses a minicluster, enables read replica replication, writes
> some data, concurrently kills a region server thread hosting a primary
> region, and then verifies that all replicas eventually show all data.
> Inspecting logs, I noticed that replication source threads seem to continue
> working even after their associated region server is killed. Interspersing
> some thread dumps and sleeps, I can see that replication threads associated
> with the condemned region server are not being removed after it is killed. I
> think that this behaviour will render unreliable any replication test that
> relies on killing a source or sink region server. It also implies to me that
> the minicluster leaks replication threads and cannot be reliably recycled
> within a single jvm process.