[ 
https://issues.apache.org/jira/browse/HBASE-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764818#comment-13764818
 ] 

Jimmy Xiang commented on HBASE-9480:
------------------------------------

[~jeffreyz], thanks for reviewing it. I just attached v1.1 (not many changes; 
I just added two test cases).
bq. so you don't want remove the znode deletion part from unassign?
If the region is not there, we should delete the znode and re-assign the 
region.  This is used by hbck/admin to fix regions stuck in transition. 
Now, since we throw a region-already-in-transition exception, we don't need 
to delete the znode any more, right?
bq. new retry could trigger NotServingRegionException if the region is closed 
at the RS or the RS just dies between retries
If the RS dies in the middle, that's OK; the region is offline anyway. If the 
region is closed while we are unassigning it, the znode should already be deleted.
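
To make the retry cases above concrete, here is a toy sketch. It is not the 
attached patch and uses none of the real HBase classes; every name in it 
(UnassignSketch, ToyRegionServer, CloseReply, MasterAction) is hypothetical. 
It assumes the RS answers a repeated close with "already in transition" 
while the first close is still running, and that the master deletes the 
znode only on a true "not serving" reply.

```java
import java.util.HashSet;
import java.util.Set;

public class UnassignSketch {
    /** Hypothetical replies to a close-region request, as seen by the master. */
    enum CloseReply { CLOSING, ALREADY_IN_TRANSITION, NOT_SERVING }

    /** Hypothetical follow-up actions on the master side. */
    enum MasterAction { WAIT_FOR_CLOSE, RETRY_LATER, DELETE_ZNODE_AND_OFFLINE }

    /** Toy region server that tracks online and closing regions by name. */
    static class ToyRegionServer {
        private final Set<String> online = new HashSet<>();
        private final Set<String> closing = new HashSet<>();

        void open(String region) {
            online.add(region);
        }

        CloseReply close(String region) {
            if (closing.contains(region)) {
                // A slow close (e.g. behind a compaction) is still in progress.
                return CloseReply.ALREADY_IN_TRANSITION;
            }
            if (!online.contains(region)) {
                // The region really is not hosted here.
                return CloseReply.NOT_SERVING;
            }
            closing.add(region);
            return CloseReply.CLOSING;
        }

        void finishClose(String region) {
            closing.remove(region);
            online.remove(region);
        }
    }

    /** Master decision: only a true "not serving" leads to znode deletion. */
    static MasterAction onCloseReply(CloseReply reply) {
        switch (reply) {
            case CLOSING:
                return MasterAction.WAIT_FOR_CLOSE;
            case ALREADY_IN_TRANSITION:
                return MasterAction.RETRY_LATER;
            case NOT_SERVING:
                return MasterAction.DELETE_ZNODE_AND_OFFLINE;
            default:
                throw new IllegalArgumentException("unknown reply: " + reply);
        }
    }

    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer();
        rs.open("region-1");
        System.out.println(onCloseReply(rs.close("region-1"))); // first close request
        System.out.println(onCloseReply(rs.close("region-1"))); // retry while still closing
        rs.finishClose("region-1");
        System.out.println(onCloseReply(rs.close("region-1"))); // retry after the close finished
    }
}
```

In this sketch the second close request no longer reaches the 
znode-deletion branch, so the slow close on the RS can still complete its 
ZK transition.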
                
> Regions are unexpectedly made offline in certain failure conditions
> -------------------------------------------------------------------
>
>                 Key: HBASE-9480
>                 URL: https://issues.apache.org/jira/browse/HBASE-9480
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Devaraj Das
>            Assignee: Jimmy Xiang
>            Priority: Blocker
>             Fix For: 0.96.0
>
>         Attachments: 9480-1.txt, trunk-9480.patch, trunk-9480_v1.1.patch
>
>
> Came across this issue (HBASE-9338 test):
> 1. Client issues a request to move a region from ServerA to ServerB
> 2. ServerA is compacting that region and doesn't close region immediately. In 
> fact, it takes a while to complete the request.
> 3. In the meantime, the master sends another close request.
> 4. ServerA sends it a NotServingRegionException
> 5. Master handles the exception, deletes the znode, and invokes regionOffline 
> for the said region.
> 6. ServerA fails to operate on ZK in the CloseRegionHandler since the node is 
> deleted.
> The region is permanently offline.
> There are potentially other situations where, when a RegionServer is offline 
> and the client asks to move a region off that server, the master makes 
> the region offline.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
