[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guanghao Zhang updated HBASE-21325:
-----------------------------------
Affects Version/s: 2.1.1, 2.2.0, 3.0.0, 2.0.2
> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch,
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch,
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch,
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster
> to DA while the local cluster is still in A, the region server hangs on
> shutdown. Because the fsOk flag only tests the local cluster (which is
> reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken
> (the remote WAL directory is gone), closing the regions never succeeds. This
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the
> waitOnAllRegionsToClose method.
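
The upper bound proposed above can be illustrated with a minimal sketch of a
deadline-based polling loop. This is not the HBase implementation or the
attached patch: the class BoundedWait, the method waitWithUpperBound, and its
parameters are hypothetical names chosen for illustration, and the
BooleanSupplier stands in for whatever check reports that all regions have
closed.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class BoundedWait {
    /**
     * Hypothetical helper: polls allClosed until it returns true or
     * timeoutMs elapses. Returns true if the condition was met and
     * false if the upper bound was reached first.
     */
    public static boolean waitWithUpperBound(BooleanSupplier allClosed,
                                             long timeoutMs,
                                             long pollMs) throws InterruptedException {
        // Use the monotonic clock so wall-clock adjustments cannot extend the wait.
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!allClosed.getAsBoolean()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                // Upper bound reached; the caller decides how to escalate.
                return false;
            }
            // Sleep for one poll interval, but never past the deadline.
            long remainingMs = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            Thread.sleep(Math.min(pollMs, remainingMs));
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // A condition that never becomes true stands in for regions that
        // cannot close because the WAL is broken.
        boolean closed = waitWithUpperBound(() -> false, 200, 50);
        System.out.println(closed ? "all regions closed" : "timed out; escalate");
    }
}
```

Returning a boolean instead of throwing leaves the escalation policy (for
example, forcing the process to terminate, as the issue title suggests) to the
caller.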