Duo Zhang commented on HBASE-19929:

[~stack] When implementing the test for this issue, I found that if I kill the 
RS which holds the meta region, the test is likely to time out.

In the test, I set up a cluster with 2 RSes, kill one RS, and then wait up to 
30 seconds for all the regions to come online on the other RS. If I kill the 
one with the meta region, I usually get a timeout while waiting for the 
regions to come online... I guess there may be some problems...
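
For reference, the wait step in the test is roughly the polling pattern below. This is only a minimal sketch: `WaitUtil`, `waitFor`, and the simulated region counter are hypothetical names for illustration, not HBase test APIs. The real test would check the regions hosted on the surviving RS instead of a counter.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: polls a condition until a deadline.
final class WaitUtil {

    /**
     * Returns true as soon as the condition holds, or false if the timeout
     * elapses (or the thread is interrupted) before that happens.
     */
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(intervalMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: a counter stands in for "regions online on the other RS".
        final int expectedRegions = 3;
        AtomicInteger online = new AtomicInteger();

        // Simulates regions being reassigned one by one after the RS is killed.
        Thread assigner = new Thread(() -> {
            for (int i = 0; i < expectedRegions; i++) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
                online.incrementAndGet();
            }
        });
        assigner.setDaemon(true);
        assigner.start();

        // Same 30 second budget as the test described above.
        boolean ok = waitFor(30_000, 20, () -> online.get() == expectedRegions);
        System.out.println(ok ? "all regions online" : "timed out waiting for regions");
    }
}
```

With the meta-holding RS killed, the condition in the real test usually stays false for the whole 30 seconds, so the equivalent of `waitFor` returns false.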


> Call RS.stop on a session expired RS may hang
> ---------------------------------------------
>                 Key: HBASE-19929
>                 URL: https://issues.apache.org/jira/browse/HBASE-19929
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>         Attachments: HBASE-19929-v1.patch, HBASE-19929.patch
> See the discussion in HBASE-19927. The problem is that, for a normal stop, we 
> will try to close all the regions and wait until they are all closed. But if 
> the RS's session has already expired, the master will start the failover work, 
> which will move the WAL directory, and then we will be stuck writing the 
> flush marker.

This message was sent by Atlassian JIRA
