[ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-12440.
------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 0.99.1)
                   0.99.2
     Hadoop Flags: Reviewed

> Region may remain offline on clean startup under certain race condition
> -----------------------------------------------------------------------
>
>                 Key: HBASE-12440
>                 URL: https://issues.apache.org/jira/browse/HBASE-12440
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>             Fix For: 0.98.8, 0.99.2
>
>         Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
> HBASE-12440-branch-1.patch
>
>
> Saw this in prod some time back with zk assignment.
> On clean startup, one of the region servers died while the master was doing 
> a bulk assign. The bulk assigner then tried to assign the region individually 
> using AssignCallable. The AssignCallable does a forceStateToOffline() and 
> skips assigning, as it wants the SSH (ServerShutdownHandler) to do the 
> assignment:
> {code}
> 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
> Offline 
> sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
>  no need to unassign since it's on a dead server: 
> gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
> 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
> {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
> server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
> {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
> server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
> 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
> assigning 
> sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
>  it is on a dead but not processed yet server: 
> gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
> {code}
> But the SSH won't assign, as the region is offline but not in transition:
> {code}
> 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
> [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
> gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
> regions(s) that were opening on this server)
> 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
> [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
> gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
> {code}
> In zk-less assignment, both the bulk assigner invoking AssignCallable and the 
> SSH may try to assign the region. But as they go through a lock, only one 
> will succeed, so this doesn't seem to be an issue there.
>  
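
The interleaving described above can be sketched as a toy model. This is
only an illustration under stated assumptions, not HBase's actual code: the
class, the Region fields, and the methods retryAssign, regionsToReassign and
simulate are all hypothetical names, and the inTransition flag is a
simplification of the master's regions-in-transition bookkeeping. It shows
that once the per-region retry has forced the region offline and skipped it,
a shutdown handler that only looks at carried or in-transition regions finds
nothing to reassign.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names are hypothetical and do not mirror HBase's real classes.
class OfflineRegionRace {
    enum State { PENDING_OPEN, OPEN, OFFLINE }

    static final class Region {
        final String name;
        final String server;
        State state;
        boolean inTransition;

        Region(String name, String server, State state, boolean inTransition) {
            this.name = name;
            this.server = server;
            this.state = state;
            this.inTransition = inTransition;
        }
    }

    // Stand-in for the per-region retry: force the region offline, then skip
    // it when its server is dead but not yet processed.
    static boolean retryAssign(Region region, String deadUnprocessedServer) {
        region.state = State.OFFLINE;
        region.inTransition = false; // assumption: forced offline leaves it out of transition
        if (region.server.equals(deadUnprocessedServer)) {
            return false; // skipped, left for the shutdown handler
        }
        region.state = State.OPEN;
        return true;
    }

    // Stand-in for the shutdown handler: it only picks up regions the dead
    // server was carrying (OPEN) or that were in transition on it.
    static List<String> regionsToReassign(List<Region> regions, String deadServer) {
        List<String> picked = new ArrayList<>();
        for (Region r : regions) {
            boolean carried = r.state == State.OPEN;
            if (r.server.equals(deadServer) && (carried || r.inTransition)) {
                picked.add(r.name);
            }
        }
        return picked;
    }

    // Returns how many regions the handler picks up for one dead server,
    // with or without the per-region retry running first.
    static int simulate(boolean retryRunsFirst) {
        String dead = "rs1";
        Region region = new Region("r1", dead, State.PENDING_OPEN, true);
        List<Region> regions = new ArrayList<>();
        regions.add(region);
        if (retryRunsFirst) {
            retryAssign(region, dead);
        }
        return regionsToReassign(regions, dead).size();
    }

    public static void main(String[] args) {
        System.out.println("retry first:   " + simulate(true));
        System.out.println("handler first: " + simulate(false));
    }
}
```

In this model, when the retry runs first the handler picks up no regions,
which matches the "Reassigning 0 region(s)" line in the log; when the handler
runs first it still sees the region as opening on the dead server.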



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
