[ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656470#comment-16656470
 ] 

Guanghao Zhang commented on HBASE-21325:
----------------------------------------

The local cluster is still in the active state, but the remote cluster does
not have the remoteWALs directory. So the log roller takes rollWriterLock and
tries to roll the log, but never succeeds. Shutting down the WAL needs to
acquire this lock first too, so it blocks as well.
{code:java}
    rollWriterLock.lock();
    try {
      doShutdown();
    } finally {
      rollWriterLock.unlock();
    }
{code}
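
One way to keep shutdown from blocking forever on this lock is to acquire it
with a timeout. The following is only a minimal sketch, not the actual HBase
change: the class BoundedShutdownSketch, the methods shutdownWithTimeout and
demoStuckRoller, and the 100 ms timeout are hypothetical names and values
chosen for illustration. Only ReentrantLock.tryLock(long, TimeUnit) is a real
JDK call.
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedShutdownSketch {
  // Stand-in for the WAL's rollWriterLock; everything else is hypothetical.
  static final ReentrantLock rollWriterLock = new ReentrantLock();

  /** Waits at most timeoutMs for the lock; returns true if the action ran. */
  static boolean shutdownWithTimeout(Runnable shutdownAction, long timeoutMs)
      throws InterruptedException {
    if (!rollWriterLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      return false; // lock still held elsewhere; the caller decides what to do
    }
    try {
      shutdownAction.run();
      return true;
    } finally {
      rollWriterLock.unlock();
    }
  }

  /** Simulates a stuck log roller that holds the lock while shutdown is tried. */
  static boolean demoStuckRoller() throws InterruptedException {
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread roller = new Thread(() -> {
      rollWriterLock.lock();
      try {
        held.countDown();
        release.await(); // pretend the roll never completes
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        rollWriterLock.unlock();
      }
    });
    roller.start();
    held.await(); // make sure the roller owns the lock before we try
    boolean ran = shutdownWithTimeout(() -> {}, 100);
    release.countDown(); // let the simulated roller exit
    roller.join();
    return ran;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("ran while roller stuck: " + demoStuckRoller());
    System.out.println("ran after roller exit: "
        + shutdownWithTimeout(() -> {}, 100));
  }
}
{code}
With the lock held by the simulated roller, shutdownWithTimeout gives up after
the timeout and returns false instead of hanging. Once the roller has
released the lock, the same call runs the shutdown action and returns true.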


> Add a max wait time for waitOnAllRegionsToClose
> -----------------------------------------------
>
>                 Key: HBASE-21325
>                 URL: https://issues.apache.org/jira/browse/HBASE-21325
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>
> When testing sync replication, I found that if I transit the remote cluster
> to DA while the local cluster is still in A, the region server hangs on
> shutdown. As the fsOk flag only tests the local cluster (which is
> reasonable), we enter waitOnAllRegionsToClose, and since the WAL is
> broken (the remote WAL directory is gone), we never succeed. This
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the
> waitOnAllRegionsToClose method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
