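[jira] [Updated] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta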

Ted Yu (JIRA) Tue, 26 Jul 2011 15:35:34 -0700

     [ https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ted Yu updated HBASE-3809:
--------------------------

    Fix Version/s:     (was: 0.92.0)
                   0.94.0

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3809
>                 URL: https://issues.apache.org/jira/browse/HBASE-3809
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>             Fix For: 0.94.0
>
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (but IIRC, there may be holes ... TBD).
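
The starvation described above can be reproduced in miniature with a plain
java.util.concurrent thread pool. This is only a sketch under stated
assumptions, not HBase code: the class name MetaStarvationSketch, the method
recovers, and the two latches are hypothetical stand-ins for the master's
ServerShutdown handlers and for ".META. is online". The handler for the server
that carried .META. is deliberately queued last, behind more blocking handlers
than the pool has threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the problem; none of these names exist in HBase.
public class MetaStarvationSketch {

    // Returns true if every simulated shutdown handler finishes within timeoutMs.
    public static boolean recovers(ExecutorService pool, int crashedServers, long timeoutMs)
            throws InterruptedException {
        CountDownLatch metaOnline = new CountDownLatch(1);           // stands in for ".META. is online"
        CountDownLatch done = new CountDownLatch(crashedServers);    // one count per handler

        // Handlers for ordinary crashed servers: each blocks until .META. is back.
        for (int i = 0; i < crashedServers - 1; i++) {
            pool.execute(() -> {
                try {
                    metaOnline.await();
                    done.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Handler for the server that carried .META., submitted last.
        // In this model it is the only task that can bring .META. back.
        pool.execute(() -> {
            metaOnline.countDown();
            done.countDown();
        });

        boolean finished = done.await(timeoutMs, TimeUnit.MILLISECONDS);
        pool.shutdownNow(); // interrupt any handlers still blocked
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        // Bounded pool: 3 threads, 5 crashed servers, .META. handler stuck in the queue.
        System.out.println("bounded:   " + recovers(Executors.newFixedThreadPool(3), 5, 500));
        // Unbounded pool: a thread is created for every handler, so the .META. handler runs.
        System.out.println("unbounded: " + recovers(Executors.newCachedThreadPool(), 5, 500));
    }
}
```

With these numbers the bounded call is expected to time out, because all three
threads are parked waiting on the latch, while the unbounded call completes.
That corresponds to the first option in the description (no bound on the
master handlers). The second option, removing the dependency on .META.
altogether, would correspond to the blocking handlers not calling
metaOnline.await() at all.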

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
