[jira] [Commented] (HBASE-27761) Replication threads not attached to their parent process

Duo Zhang (Jira) Mon, 27 Mar 2023 03:38:07 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-27761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705286#comment-17705286
 ]


Duo Zhang commented on HBASE-27761:
-----------------------------------

In HRegionServer.stopServicesThreads

{code}
    if (sameReplicationSourceAndSink && this.replicationSourceHandler != null) {
      this.replicationSourceHandler.stopReplicationService();
    } else {
      if (this.replicationSourceHandler != null) {
        this.replicationSourceHandler.stopReplicationService();
      }
      if (this.replicationSinkHandler != null) {
        this.replicationSinkHandler.stopReplicationService();
      }
    }
{code}

In Replication.stopReplicationService
{code}
this.replicationManager.join();
{code}

In ReplicationSourceManager.join
{code}
    this.executor.shutdown();
    for (ReplicationSourceInterface source : this.sources.values()) {
      source.terminate("Region server is closing");
    }
    synchronized (oldsources) {
      for (ReplicationSourceInterface source : this.oldsources) {
        source.terminate("Region server is closing");
      }
    }
{code}

So we will try to close all the replication sources when shutting down a region 
server.
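
The shape of that shutdown path can be modelled in isolation. The following is only a toy sketch, not HBase code: ToySource, ToyManager and the thread name are hypothetical stand-ins, and it assumes each source owns a single worker thread that exits when interrupted. It mirrors the order quoted above: shut down the executor first, then terminate every registered source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ShutdownSketch {
  /** Hypothetical stand-in for a replication source that owns one worker thread. */
  static final class ToySource {
    private final Thread worker;

    ToySource(String name) {
      worker = new Thread(() -> {
        try {
          while (true) {
            Thread.sleep(50); // pretend to ship edits
          }
        } catch (InterruptedException e) {
          // terminate() interrupted us: fall through and let the thread exit
        }
      }, name);
      worker.setDaemon(true);
      worker.start();
    }

    /** The reason is unused here; it only mirrors the call shape in the quoted code. */
    void terminate(String reason) {
      worker.interrupt();
      try {
        worker.join(); // wait until the worker has really exited
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    boolean isAlive() {
      return worker.isAlive();
    }
  }

  /** Hypothetical stand-in for the manager whose join() is quoted above. */
  static final class ToyManager {
    final ExecutorService executor = Executors.newSingleThreadExecutor();
    final List<ToySource> sources = new CopyOnWriteArrayList<>();

    void join() {
      // shutdown() only stops new submissions; it does not touch the sources' own threads
      executor.shutdown();
      for (ToySource source : sources) {
        source.terminate("Region server is closing");
      }
    }
  }

  public static void main(String[] args) {
    ToyManager manager = new ToyManager();
    ToySource source = new ToySource("toy-source");
    manager.sources.add(source);
    manager.join();
    System.out.println("source alive after join: " + source.isAlive());
  }
}
```

In this model a source thread can only outlive the manager if join() is never called or if the source was never registered in the list, which is why the logs of these methods are the first thing to check.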

Could you please check the related logs in these methods to see if they are 
properly called when shutting down?
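
Besides the logs, the test itself could list the threads that are still alive after the region server has been stopped. A minimal sketch, assuming the replication threads can be recognised by some fragment of their name; the LingeringThreads class and the fragment passed to it are hypothetical and not part of HBase:

```java
import java.util.List;
import java.util.stream.Collectors;

final class LingeringThreads {
  /** Returns the sorted names of live threads whose name contains the given fragment. */
  static List<String> find(String nameFragment) {
    return Thread.getAllStackTraces().keySet().stream()
      .filter(Thread::isAlive)
      .map(Thread::getName)
      .filter(name -> name.contains(nameFragment))
      .sorted()
      .collect(Collectors.toList());
  }

  public static void main(String[] args) throws InterruptedException {
    // Demo thread standing in for a replication worker; the name is made up.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // interrupted: exit
      }
    }, "demo-replication-source");
    worker.setDaemon(true);
    worker.start();

    System.out.println("before stop: " + find("demo-replication"));
    worker.interrupt();
    worker.join();
    System.out.println("after stop:  " + find("demo-replication"));
  }
}
```

Calling find(...) once right after the region server is killed and again after a short wait would show whether the threads are merely slow to exit or are never terminated at all.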

Thanks.

> Replication threads not attached to their parent process
> --------------------------------------------------------
>
>                 Key: HBASE-27761
>                 URL: https://issues.apache.org/jira/browse/HBASE-27761
>             Project: HBase
>          Issue Type: Task
>          Components: read replicas, regionserver, Replication
>    Affects Versions: 2.5.4
>            Reporter: Nick Dimiduk
>            Priority: Major
>
> While debugging HBASE-27707 in a unit test, I see behaviour that I cannot 
> explain. My test uses a minicluster, enables read replica replication, writes 
> some data, concurrently kills a region server thread hosting a primary 
> region, and then verifies that all replicas eventually show all data. 
> Inspecting logs, I noticed that replication source threads seem to continue 
> working even after their associated region server is killed. Interspersing 
> some thread dumps and sleeps, I can see that replication threads associated 
> with the condemned region server are not being removed after it is killed. I 
> think that this behaviour will render unreliable any replication test that 
> relies on killing a source or sink region server. It also implies to me that 
> the minicluster leaks replication threads and cannot be reliably recycled 
> within a single jvm process.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
