[ https://issues.apache.org/jira/browse/HBASE-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-21156:
--------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I pushed this to branch-2.1+

> [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
> --------------------------------------------------------------
>
>                 Key: HBASE-21156
>                 URL: https://issues.apache.org/jira/browse/HBASE-21156
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck2
>    Affects Versions: 2.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.1.1
>
>         Attachments: HBASE-21156.branch-2.1.001.patch, 
> HBASE-21156.branch-2.1.002.patch, HBASE-21156.branch-2.1.003.patch, 
> HBASE-21156.branch-2.1.004.patch, HBASE-21156.branch-2.1.005.patch
>
>
> We need this to effect repair when there is damage.
> If the procedure WALs AND a server WAL dir are lost or cleaned, or if we 
> crashed during a partial split (unlikely scenarios but nonetheless 
> possible), a Master can get stuck, unable to become active, because there 
> is no assign procedure for hbase:meta in the system.
> The reasonable argument over in HBASE-21035 has it that attempts at 
> auto-repair under these extremes could cause other issues, so at least 
> until we learn more, we punt to the operator for fix-up.
> To reproduce the catastrophe, see notes in HBASE-21035 (and [~allan163]'s 
> test).
> UPDATE: HBASE-21191 has the Master assume a "holding pattern" if on startup 
> it does not have an assign for meta (possible if we lose all Master WAL 
> Procs). The holding pattern is needed because we were exiting after one 
> minute of RPC'ing to the old meta location. To inject an assign, 
> Admin#assign won't work: it gets rejected because the "Master is 
> initializing". So we need to be able to assign hbase:meta even if the 
> "Master is initializing". Also, while in here, add the ability to bulk 
> assign, because assigning a Region at a time from the shell only works if 
> the offlined region count is in the low 10s; it fails when thousands are 
> offline.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
