[
https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-3446:
-------------------------
Resolution: Fixed
Release Note: Makes the catalog/* classes (e.g. MetaEditor, MetaReader
and CatalogTracker) retry. Previously they would try once and fail unless
that attempt succeeded. The retrying is courtesy of HTable instances.
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
Got all tests to pass, eventually.
A bunch of tests were failing because waitForMeta just hung on the
meta-is-available boolean during master startup, waiting for some background
thread to set it to true once meta had been set. This was fine in the old days,
when we'd go get an HRegionInterface to .META. and try to ensure it was at its
location, wherever that was, by running verifies over the HRegionInterface
instances (with no retries). Now we don't use such primitives; we've gone up
the stack and have HTables/HConnections do the search and 'verify' of meta for
us. We need to run a connection get to know if meta is available (if it is
available, the magic AtomicBoolean gets set).
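To illustrate the waitForMeta change described above, here is a minimal sketch in plain Java, not the code from the patch. The class name MetaWaitSketch, the probe callback (standing in for the connection-level get) and the timeout parameters are all hypothetical; only the idea of actively probing and then setting the AtomicBoolean comes from the comment.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names come from the HBase code base.
final class MetaWaitSketch {
  // Stands in for the "meta is available" flag.
  private final AtomicBoolean metaAvailable = new AtomicBoolean(false);
  // Stands in for the connection-level get; returns true if meta was found.
  private final BooleanSupplier probe;

  MetaWaitSketch(BooleanSupplier probe) {
    this.probe = probe;
  }

  /**
   * Actively probes for meta instead of waiting for another thread to flip
   * the flag. Returns true if meta became available before the timeout.
   */
  boolean waitForMeta(long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!metaAvailable.get()) {
      if (probe.getAsBoolean()) {
        metaAvailable.set(true); // the probe itself sets the flag
        break;
      }
      if (System.nanoTime() >= deadline) {
        break; // give up rather than hang forever
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        break;
      }
    }
    return metaAvailable.get();
  }
}
```

The point of the sketch is that the waiter drives the lookup itself, so it cannot hang on a flag that nobody is going to set.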
Other miscellaneous stuff: testshell, for example, was failing for me because
it couldn't find the cluster; it needs to be set up with the cluster's
configuration.
Moved more of the meta migration code into the MetaMigrationRemoveHTD class
rather than have it spread all about.
Changed the LocalHBaseCluster#join method so it uses the old thread-dumping
join, which prints a thread dump if we have been waiting more than 60 seconds
for something to finish. This helped me debug a few tests here.
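For readers unfamiliar with the idea, a thread-dumping join can be sketched in plain Java roughly as follows. This is an illustration only, not the HBase implementation: the class name ThreadDumpingJoinSketch, the method joinWithDumps and its return value are hypothetical, and the interval is a parameter rather than the fixed 60 seconds mentioned above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

// Hypothetical sketch; not the HBase implementation.
final class ThreadDumpingJoinSketch {
  /**
   * Joins on t, printing a dump of all threads to stderr each time
   * intervalMs passes without t finishing. intervalMs must be > 0, since
   * Thread.join(0) waits forever. Returns the number of dumps printed.
   */
  static int joinWithDumps(Thread t, long intervalMs) {
    int dumps = 0;
    while (t.isAlive()) {
      try {
        t.join(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        return dumps;
      }
      if (t.isAlive()) {
        dumps++;
        System.err.println("Still waiting on " + t.getName() + "; thread dump:");
        // ThreadInfo.toString() prints a truncated stack trace per thread.
        for (ThreadInfo info
            : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
          System.err.print(info);
        }
      }
    }
    return dumps;
  }
}
```

Calling joinWithDumps(thread, 60_000L) would approximate the 60-second behaviour: a join that completes in time prints nothing, while a stuck one shows where every thread is blocked.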
Otherwise, the patch is what was up on rb.
> ProcessServerShutdown fails if META moves, orphaning lots of regions
> --------------------------------------------------------------------
>
> Key: HBASE-3446
> URL: https://issues.apache.org/jira/browse/HBASE-3446
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.0
> Reporter: Todd Lipcon
> Assignee: stack
> Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 3446-v11.txt, 3446-v12.txt, 3446-v13.txt, 3446-v14.txt,
> 3446-v2.txt, 3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 3446-v9.txt, 3446.txt,
> 3446v15.txt, 3446v23.txt
>
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and
> afterwards had LOTS of regions left orphaned. The issue appears to be that
> ProcessServerShutdown failed because the server hosting META was restarted
> around the same time as another server was being processed.