[jira] [Updated] (HBASE-3446) ProcessServerShutdown fails if META moves, orphaning lots of regions

stack (Updated) (JIRA) Thu, 13 Oct 2011 16:38:35 -0700

     [ https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


stack updated HBASE-3446:
-------------------------

      Resolution: Fixed
    Release Note: Makes the catalog/* classes (e.g. MetaEditor, MetaReader 
and CatalogTracker) retry.  Previously they would try once and fail unless 
that single attempt succeeded.  Retrying is courtesy of HTable instances.
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)
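
To illustrate the try-once versus retry behaviour the release note describes, 
here is a minimal generic sketch.  It is not the patch itself: per the note, 
the real retrying comes from HTable instances.  RetrySketch, callWithRetries, 
maxAttempts and pauseMs are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /**
     * Runs op up to maxAttempts times, pausing pauseMs between failed attempts.
     * Hypothetical helper; it only illustrates "retry instead of try once".
     */
    public static <T> T callWithRetries(Callable<T> op, int maxAttempts, long pauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                // Do not retry through an interrupt; hand it straight back.
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                }
            }
        }
        // Every attempt failed: surface the most recent failure.
        throw last;
    }
}
```

With maxAttempts set to 1 this degenerates to the old behaviour: one try, and 
fail unless it succeeds.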

Got all tests to pass, eventually.

A bunch of tests were failing because waitForMeta just hung on the 
meta-is-available boolean during master startup, waiting for some background 
thread to set it to true once meta had been set.  This was fine in the old 
days, when we'd go get an HRegionInterface to .META. and verify it was at its 
recorded location directly over the HRegionInterface instance (with no 
retries).  Now we don't use such primitives: we've gone up the stack and have 
HTables/HConnections do the search and 'verify' of meta for us.  So we need to 
run a get through the connection to know if meta is available (if it is 
available, the magic AtomicBoolean gets set).
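
A rough sketch of the difference, assuming nothing about the real 
CatalogTracker code: rather than parking on the flag and hoping another 
thread flips it, the waiter runs the probe itself and sets the flag when the 
probe succeeds.  MetaAvailabilitySketch and its probe parameter are 
hypothetical; the probe stands in for the connection-level get.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class MetaAvailabilitySketch {
    private final AtomicBoolean metaAvailable = new AtomicBoolean(false);
    // Hypothetical stand-in for a get against .META. through the connection.
    private final BooleanSupplier probe;

    public MetaAvailabilitySketch(BooleanSupplier probe) {
        this.probe = probe;
    }

    /**
     * Returns true once meta is known to be available, false on timeout.
     * The waiter actively probes instead of only watching the flag.
     */
    public boolean waitForMeta(long timeoutMs, long pauseMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!metaAvailable.get()) {
            if (probe.getAsBoolean()) {
                // A successful probe is what sets the flag.
                metaAvailable.set(true);
                break;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pauseMs);
        }
        return true;
    }
}
```

If the probe never runs, nothing ever sets the flag, which is the hang the 
tests were hitting.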

Other miscellaneous stuff, like testshell, was failing for me because it 
couldn't find the cluster -- it needs to be set up with the cluster's 
configuration.

Moved more of the meta migration code into the MetaMigrationRemoveHTD class 
rather than have it spread all about.

Changed the LocalHBaseCluster#join method so it uses the old thread-dumping 
join, which prints a thread dump if we have been waiting more than 60 seconds 
for something to finish.  Helped me debug a few tests here.
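
For readers unfamiliar with the idea, a thread-dumping join can be sketched 
with plain JDK calls as below.  This is an illustration only, not the HBase 
utility that LocalHBaseCluster#join uses; ThreadDumpingJoinSketch and 
joinWithThreadDumps are hypothetical names.

```java
import java.util.Map;

public class ThreadDumpingJoinSketch {
    /**
     * Joins t, printing a dump of all live threads to stderr each time
     * dumpIntervalMs elapses while t is still running.
     * Returns the number of dumps printed.
     */
    public static int joinWithThreadDumps(Thread t, long dumpIntervalMs)
            throws InterruptedException {
        if (dumpIntervalMs <= 0) {
            // Thread.join(0) would wait forever and never dump.
            throw new IllegalArgumentException("dumpIntervalMs must be > 0");
        }
        int dumps = 0;
        while (t.isAlive()) {
            t.join(dumpIntervalMs);
            if (t.isAlive()) {
                dumps++;
                System.err.println("Still waiting on " + t.getName() + "; thread dump:");
                for (Map.Entry<Thread, StackTraceElement[]> e
                        : Thread.getAllStackTraces().entrySet()) {
                    Thread other = e.getKey();
                    System.err.println("\"" + other.getName() + "\" state=" + other.getState());
                    for (StackTraceElement frame : e.getValue()) {
                        System.err.println("    at " + frame);
                    }
                }
            }
        }
        return dumps;
    }
}
```

Calling joinWithThreadDumps(thread, 60000L) matches the 60-second threshold 
mentioned above: a test that hangs shows where every thread is stuck instead 
of blocking silently.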

Otherwise, this is what was up on rb (Review Board).
                
> ProcessServerShutdown fails if META moves, orphaning lots of regions
> --------------------------------------------------------------------
>
>                 Key: HBASE-3446
>                 URL: https://issues.apache.org/jira/browse/HBASE-3446
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: 3446-v11.txt, 3446-v12.txt, 3446-v13.txt, 3446-v14.txt, 
> 3446-v2.txt, 3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 3446-v9.txt, 3446.txt, 
> 3446v15.txt, 3446v23.txt
>
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and 
> afterwards had LOTS of regions left orphaned. The issue appears to be that 
> ProcessServerShutdown failed because the server hosting META was restarted 
> around the same time as another server was being processed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
