[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237241#comment-13237241
 ] 

Jonathan Hsieh commented on HBASE-5128:
---------------------------------------

Full suites of 0.92/0.94/trunk versions pass.

Looks like the 0.90 version has always had flakey tests for the same reason the 
0.92/0.94/trunk versions.  It is related to assignment and HBASE-5563, but just 
didn't  happen as often as in the 0.92/0.94/trunk version (2/10 runs vs 5/10 
runs).  HBASE-5563 would not be available on older clusters but won't cause 
permanent problems if this updated hbck was used against a version that did not 
have the improvement.

Let's say using this hbck against an older 0.90-based cluster that didn't have  
HBASE-5563 or HBASE-5589.  The side effect is that you may have to run 'hbck 
-fixAssignments' an extra time to fix region assignment/deployment problems 
after disabling and deleting a table that has been fixed, or alternately, you 
may need to bounce the HMaster or affected RegionServer to clean up this 
transient state.

I currently have a 0.90 version of HBASE-5563 (attached there), and an updated 
HBASE-5128 for 0.90 that is as close as possible to the 0.92/0.94/trunk 
versions as possible.


                
> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5128
>                 URL: https://issues.apache.org/jira/browse/HBASE-5128
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck
>    Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
> hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
> hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
> hbase-5128-v3.patch, hbase-5128-v4.patch
>
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
> consistency and table integrity invariant violations.  However with '-fix' it 
> can only automatically repair region consistency cases having to do with 
> deployment problems.  This updated version should be able to handle all cases 
> (including a new orphan regiondir case).  When complete will likely deprecate 
> the OfflineMetaRepair tool and subsume several open META-hole related issue.
> Here's the approach (from the comment of at the top of the new version of the 
> file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
> and
>  * table integrity.  
>  * 
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance. 
>  * 
>  * Table integrity checks verify that that all possible row keys can resolve 
> to
>  * exactly one region of a table.  This means there are no individual 
> degenerate
>  * or backwards regions; no holes between regions; and that there no 
> overlapping
>  * regions. 
>  * 
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  * 
>  * For table integrity repairs, the tables their region directories are 
> scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there 
>  * are any orphan regions (regions with no .regioninfo files), or holes, new 
>  * regions are fabricated.  Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping 
> regions,
>  * a new region is created and all data is merged into the new region.  
>  * 
>  * Table integrity repairs deal solely with HDFS and can be done offline -- 
> the
>  * hbase region servers or master do not need to be running.  These phase can 
> be
>  * use to completely reconstruct the META table in an offline fashion. 
>  * 
>  * Region consistency requires three conditions -- 1) valid .regioninfo file 
>  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that is was assigned 
> to.
>  * 
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to