[ 
https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115818#comment-13115818
 ] 

jirapos...@reviews.apache.org commented on HBASE-3446:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2065/#review2107
-----------------------------------------------------------


Thanks for reviews Ted and Jon.  Will put up new patch when you fellas finish...


src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/2065/#comment4864>

    @Jon True.  I opened another issue with suggested fix.  I should at least 
reference it in here.



src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/2065/#comment4865>

    Nah.  Thats TODO.


- Michael


On 2011-09-27 06:38:09, Michael Stack wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2065/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-09-27 06:38:09)
bq.  
bq.  
bq.  Review request for hbase and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Make the Meta* operations against meta retry.  We do it by using HTable 
instances.
bq.  (HTable calls HConnection.getRegionServerWithRetries for get, put, scan 
etc).
bq.  In 0.89, we had special RetryableMetaOperation class that was a
bq.  subclass of Callable which reproduced the guts of 
HConnection.getRegionServerWithRetries
bq.  with its retry loop.  Now we just use HTable instead (Costs some on setup 
but
bq.  otherwise, we avoid duplicating code).  Upped the retries on serverside 
too.
bq.  
bq.  Had problem with CatalogJanitor.  MetaReader and MetaEditor were relying
bq.  heavily on CT methods getting proxy connections to meta and root servers.
bq.  CT needs to be cut back.  This patch closes down access on (unused) public
bq.  methods and removes being able to get an HRegionInterface on meta and root
bq.  -- this stuff is used internally to CT only now; use MetaEditor or
bq.  MetaReader if you want to update or read catalog tables.  Opening new issue
bq.  to cutback CT use over the code base.
bq.  
bq.  A little off topic but couldn't help it since was in MetaReader and 
MetaEditor
bq.  trying to clean them up, I ended up moving meta migration code out to its
bq.  own class rather than have it in all inside in MetaEditor.
bq.  
bq.  Here is some detail to help reviews.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
bq.    Clean up.  Shutdown access on some of these unused methods.  Don't
bq.    let out HRegionInterface instances in particular since we are going
bq.    away from raw HRI use to instead use a connection with retries:
bq.    i.e. HTable.
bq.  
bq.    Comments on state of this class. Javadoc edits.
bq.    getZooKeeperWatcher on HConnection is deprecated so don't use it
bq.    in constructor.  Override MetaNodeTracker and on node delete
bq.    reset meta location (We used to do this over in MetaNodeTracker
bq.    but to do that we had to have a CatalogTracker over in zk package
bq.    which is silly -- bad package encapsulation).
bq.  
bq.    (waitForRootServer) Renamed getRootServerConnection and change it
bq.    from public to package private.
bq.    (waitForRootServerConnectionDefault, getRootServerConnection) Removed.
bq.    (getMetaServerConnection) Change from public to package private.
bq.    Use MetaReader to read the meta location in root rather than a
bq.    raw HRegionInterface so we get retrying.
bq.    (remaining, timedout) Added utility methods.
bq.    (waitForMetaServer) Changed from public to private.
bq.    (resetMetaLocation) Made it synchronized on metaAvailable.
bq.    Not all accesses were synchronized.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
bq.    Refactor to use HTable instead of raw HRegionInterface so we get
bq.    retrying.  For each operation we get an HTable, use it, then close it.
bq.    (putToMetaTable, putsToMetaTable, etc) Utility methods.
bq.    (updateRootWithMetaMigrationStatus, etc.) Moved out to own
bq.    class since these classes are for a one-time migration only.
bq.      
bq.  A 
src/main/java/org/apache/hadoop/hbase/catalog/MetaMigrationRemovingHTD.java
bq.    New class that holds all Meta* methods updating meta table used
bq.    doing the one-time migration done to meta on startup.  This class
bq.    is marked deprecated because its going to be dropped in 0.94.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
bq.    Retrofit methods in here to use fullScan methods with Visitor.
bq.    (getCatalogRegionInterface, getCatalogRegionNameForTable,
bq.      getCatalogRegionNameForRegion) Removed.
bq.    (fullScan) Cleaned up the fullScans.  Fixed up wrong javadoc.
bq.    (fullScanOfResults) Renamed as fullScan override.
bq.    (fullScanOfRoot) Added as deprecated. We should be doing
bq.    this against zk.
bq.    (metaRowToRegionPair, getServerNameFromResult) Moved to Result
bq.    (CollectAllVisitor) Added
bq.  M src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
bq.    Handle few cases where methods throw InterruptedException
bq.    (Don't let it out on the HBaseAdmin public API)
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
bq.    Populate new exception, 
RetriesExhaustedException.ThrowableWithExtraContext
bq.    on failure. Call ServerCallable connect AFTER beforeCall rather than
bq.    ServerCallable.instantiateServer BEFORE beforeCall.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
bq.    Add to DEBUG message the connection name we were using.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/client/Result.java
bq.    (getServerNameFromCatalogResult, parseCatalogResult,
bq.      parseHRegionInfoFromCatalogResult) Added
bq.  
bq.  M 
src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedException.java
bq.    Added new ThrowableWithExtraContext that takes extra context info.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
bq.    instantiateServer renamed as connect
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java
bq.    Javadoc.  Renamed instantiateServer as connect.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/master/HMaster.java
bq.    Javadoc. Use MetaReader method instead of handcoding.
bq.  
bq.  M 
src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java
bq.    Handle InterruptedException
bq.  
bq.  M 
src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
bq.    Handle InterruptedException
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
bq.    Allow hris can come back null when we ask for table regions.
bq.  
bq.  M src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java
bq.    Remove import of CatalogTracker.
bq.  
bq.  M src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java
bq.    Use utility in MetaReader instead of handcode it.
bq.  
bq.  M src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java
bq.    Use new HConnectionTestingUtility mocking tests (need to use it
bq.    because its a bit harder mocking tests now that we use HTable instead
bq.    of the more direct HRegionInterface).
bq.    Add some tests of broken out utility methods.
bq.  
bq.  M src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
bq.    Add tests
bq.  
bq.  M 
src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditorNoCluster.java
bq.    Add test of 3669 retrying.
bq.  
bq.  M 
src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
bq.    New test utility that helps with mock of HConnection making it so can 
mock
bq.    an HConnection and then have an HTable use the mocked connection.  Can do
bq.    a mock or a spied on HConnection
bq.  
bq.  M src/test/java/org/apache/hadoop/hbase/client/TestMetaMigration.java
bq.    The migration code moved.  Reference new location.
bq.  
bq.  M src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
bq.  M 
src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
bq.  M src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
bq.    Was waiting on wrong events.  Was waiting on Opens rather than Splits. 
Fix.
bq.  
bq.  
bq.  This addresses bug hbase-3446.
bq.      https://issues.apache.org/jira/browse/hbase-3446
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java 585c4a8 
bq.    src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java 5bc3bb0 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 
ac0bc38 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java ac60311 
bq.    
src/main/java/org/apache/hadoop/hbase/catalog/MetaMigrationRemovingHTD.java 
PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 2afe70c 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java a55a4b1 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 
5afaedf 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java b5cf639 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java da5b80d 
bq.    src/main/java/org/apache/hadoop/hbase/client/Result.java 8a0c1a9 
bq.    
src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedException.java 
89d2abe 
bq.    src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 
5ea38b4 
bq.    src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 816f8b7 
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java a069400 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
c53d3be 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 06bf814 
bq.    
src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java 
1e5d83c 
bq.    
src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 
6ac6408 
bq.    
src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 
c374d6f 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
5869c18 
bq.    src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java e72cfa2 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 8465724 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java 
55257b3 
bq.    src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 9023af8 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 
538e809 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 
84130e2 
bq.    
src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditorNoCluster.java
 PRE-CREATION 
bq.    
src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java 
PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java f09944e 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestMetaMigration.java 
645bca6 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 
8fcdccc 
bq.    
src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java 
a0e6450 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestMaster.java f473c80 
bq.  
bq.  Diff: https://reviews.apache.org/r/2065/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  All tests passed recently.  Rerunning again.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Michael
bq.  
bq.


                
> ProcessServerShutdown fails if META moves, orphaning lots of regions
> --------------------------------------------------------------------
>
>                 Key: HBASE-3446
>                 URL: https://issues.apache.org/jira/browse/HBASE-3446
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: 3446-v11.txt, 3446-v12.txt, 3446-v13.txt, 3446-v14.txt, 
> 3446-v2.txt, 3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 3446-v9.txt, 3446.txt, 
> 3446v15.txt
>
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and 
> afterwards had LOTS of regions left orphaned. The issue appears to be that 
> ProcessServerShutdown failed because the server hosting META was restarted 
> around the same time as another server was being processed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to