[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236098#comment-13236098
 ] 

[email protected] commented on HBASE-5128:
------------------------------------------------------



bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1771
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94413#file94413line1771>
bq.  >
bq.  >     Is @Override missing ?

yeah, i missed all of them.


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 72
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line72>
bq.  >
bq.  >     Renaming this method is desirable as I mentioned earlier.

Suggestion?


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 92
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line92>
bq.  >
bq.  >     Typo: assume

"This assumes that info is in META."


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 99
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line99>
bq.  >
bq.  >     This method is called in two places where HBaseAdmin is available.
bq.  >     
bq.  >     Please change the method signature to avoid creating HBaseAdmin every time.

Thanks.  This was missed when porting back and forth between 0.90 and 0.92.


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 152
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line152>
bq.  >
bq.  >     Why ?

removed


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 161
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line161>
bq.  >
bq.  >     success is no longer set in this method.
bq.  >     This can be removed.

done (likely from 0.90 version)


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 185
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line185>
bq.  >
bq.  >     Shall we return directly here ?
bq.  >     The new exception would be caught at line 182

yes.  


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 215
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line215>
bq.  >
bq.  >     Please use this method in the three places of HBaseFsck I mentioned.

done


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java, line 274
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94419#file94419line274>
bq.  >
bq.  >     Can we reuse the method from HBaseFsck ?

done


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java, line 1217
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94418#file94418line1217>
bq.  >
bq.  >     This check was added because of failed test ?

This is an unhandled case.  In one of the patches I had some extra ScrubMeta 
and DumpMeta methods that would clean this up; this is follow-on work for 
another JIRA.


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java, line 30
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94417#file94417line30>
bq.  >
bq.  >     Can this class be package-private ?

not yet -- hbck needs to be moved from o.a.h.h.util to o.a.h.h.util.hbck for 
this to be possible.


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java, line 63
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94416#file94416line63>
bq.  >
bq.  >     Javadoc for parameters.

Updated in interface, added:

  /**
   * {@inheritDoc}
   */


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java, line 71
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94416#file94416line71>
bq.  >
bq.  >     Javadoc for parameters.

Updated in interface, added:

  /**
   * {@inheritDoc}
   */


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java, line 83
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94416#file94416line83>
bq.  >
bq.  >     Javadoc for parameters.

Updated in interface, added:

  /**
   * {@inheritDoc}
   */
(and for the other cases).
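
As an illustration of the two points above (documenting parameters once in the interface, and marking implementations with @Override plus {@inheritDoc}), here is a minimal sketch. The interface, class, and method names are hypothetical stand-ins and are not the actual TableIntegrityErrorHandler API from the patch.

```java
/** Hypothetical stand-in for an error-handler interface; names are illustrative only. */
interface IntegrityHandlerSketch {
  /**
   * Called when a gap is found between two adjacent regions.
   *
   * @param holeStart first row key not covered by any region
   * @param holeEnd first row key that is covered again after the gap
   * @return a short description of the action taken
   */
  String handleHole(String holeStart, String holeEnd);
}

/** Hypothetical implementation that only reports the hole. */
class LoggingHandlerSketch implements IntegrityHandlerSketch {
  /**
   * {@inheritDoc}
   */
  @Override
  public String handleHole(String holeStart, String holeEnd) {
    // The parameter docs are inherited from the interface; @Override lets the
    // compiler reject this method if the interface signature ever changes.
    return "hole [" + holeStart + ", " + holeEnd + ")";
  }
}
```

With this layout the parameter documentation lives in one place, and a signature drift between interface and implementation becomes a compile error rather than a silent overload.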


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 112
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line112>
bq.  >
bq.  >     Typo: handleHBCK

This comment is not relevant to this branch anymore; removing it.


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 122
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line122>
bq.  >
bq.  >     This is called in a loop in checkMetaRegion().
bq.  >     It would be nice for this method to take a list of regions and wait for them to come out of RIT.

This was a cause of a bunch of flakiness, or of 5-second sleeps, in the older 
hbck, so I updated this.
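
One way to read the suggestion above is a helper that takes a collection of regions and polls until none of them is still in transition, instead of sleeping a fixed 5 seconds per region. The sketch below is only that: the class name, the method name, and the use of a Predicate to ask "is this region in transition?" are assumptions for illustration, not the code in the patch, and it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper; not part of HBaseFsckRepair. */
final class RegionWaitSketch {
  private RegionWaitSketch() {}

  /**
   * Polls until no region in regionNames is reported as in transition, or
   * until timeoutMs has elapsed.
   *
   * @param regionNames regions to wait for
   * @param inTransition assumed callback answering whether a region is still in transition
   * @param timeoutMs overall time budget in milliseconds
   * @param pollIntervalMs pause between polls in milliseconds
   * @return the regions still in transition when the method gave up (empty on success)
   */
  static List<String> waitUntilOutOfTransition(
      Collection<String> regionNames,
      Predicate<String> inTransition,
      long timeoutMs,
      long pollIntervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    List<String> pending = new ArrayList<>(regionNames);
    while (true) {
      // Drop every region that has left the in-transition state.
      pending.removeIf(inTransition.negate());
      if (pending.isEmpty() || System.nanoTime() - deadline >= 0) {
        return pending;
      }
      try {
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and report what is still pending.
        Thread.currentThread().interrupt();
        return pending;
      }
    }
  }
}
```

Waiting on the whole batch bounds the total wait by one timeout rather than one timeout per region, and returning the leftover regions lets the caller decide whether to fail or retry.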


bq.  On 2012-03-22 19:00:46, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 207
bq.  > <https://reviews.apache.org/r/4280/diff/2/?file=94414#file94414line207>
bq.  >
bq.  >     It would be nice to cache meta for subsequent calls.
bq.  >     Can be done in another JIRA.

follow up jira.


- jmhsieh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4280/#review6239
-----------------------------------------------------------


On 2012-03-21 23:24:13, jmhsieh wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4280/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-21 23:24:13)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, and Lars Hofhansl.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This version is similar to the 0.90.x version posted a few months back, 
but has a few new features and some minor differences.
bq.  
bq.  1) No trackHTD method needed since we can read from the file system.
bq.  2) Added safeguards to prevent mega merges, and to isolate repairs to 
particular tables.
bq.  3) Fixed comparator in HRegionInfo
bq.  4) Fixed TestRegionObserverInterface so that it doesn't rely on a bug in the HRegionInfo comparator.
bq.  
bq.  I'll backport to 0.94/0.92 (which should be very similar) and update the 
0.90 versions after this patch has mostly cleared.
bq.  
bq.  This version is not perfect (there are definitely cases not covered) but I think it is worth trying to get this in so that future reviews are more manageable.
bq.  
bq.  
bq.  This addresses bug HBASE-5128.
bq.      https://issues.apache.org/jira/browse/HBASE-5128
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 3c635d4
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java d47ef10
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java cd1755f
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java c0aaf65
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 5916d9c
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java d57bb6b
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java PRE-CREATION
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java d9a2a02
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 937781d
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java 0599da1
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java dbb97f8
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 2b4cac8
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java ebbeead
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java b175548
bq.  
bq.  Diff: https://reviews.apache.org/r/4280/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Unit tests cover many situations and pass.  Most "live" testing has been done on 0.90.x versions, and many improvements and features were added from that experience.  There has not been much live testing on the trunk versions.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.


                
> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5128
>                 URL: https://issues.apache.org/jira/browse/HBASE-5128
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck
>    Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
> hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
> hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch
>
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region 
> consistency and table integrity invariant violations.  However, with '-fix' they 
> can only automatically repair region consistency cases having to do with 
> deployment problems.  This updated version should be able to handle all cases 
> (including a new orphan regiondir case).  When complete, it will likely deprecate 
> the OfflineMetaRepair tool and subsume several open META-hole related issues.
> Here's the approach (from the comment at the top of the new version of the 
> file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
>  * table integrity.
>  *
>  * Region consistency checks verify that META, region deployment on region
>  * servers, and the state of data in HDFS (.regioninfo files) are all in
>  * accordance.
>  *
>  * Table integrity checks verify that all possible row keys resolve to exactly
>  * one region of a table.  This means there are no individual degenerate or
>  * backwards regions, no holes between regions, and no overlapping regions.
>  *
>  * The general repair strategy works in these steps:
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  *
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there
>  * are any orphan regions (regions with no .regioninfo files) or holes, new
>  * regions are fabricated.  Backwards regions are sidelined, as are empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
>  * a new region is created and all data is merged into the new region.
>  *
>  * Table integrity repairs deal solely with HDFS and can be done offline -- the
>  * hbase region servers or master do not need to be running.  This phase can be
>  * used to completely reconstruct the META table in an offline fashion.
>  *
>  * Region consistency requires three conditions -- 1) valid .regioninfo file
>  * present in an hdfs region dir, 2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  *
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
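
To make the table integrity invariant in the comment above concrete (every row key resolves to exactly one region, so no degenerate or backwards regions, no holes, no overlaps), here is a small self-contained sketch. It is not the hbck implementation: it uses String keys instead of byte arrays, the class and method names are hypothetical, and it only reports problems rather than repairing them. It follows the HBase convention that an empty start key means the beginning of the table and an empty end key means the end of the table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical checker for the invariant described above; not HBaseFsck code. */
final class TableIntegritySketch {
  private TableIntegritySketch() {}

  /** A region's key range [start, end); "" means unbounded on that side. */
  static final class Range {
    final String start;
    final String end;

    Range(String start, String end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "[" + start + ", " + end + ")";
    }
  }

  /** Returns one message per violation; an empty list means the key space is tiled exactly once. */
  static List<String> check(List<Range> regions) {
    List<String> problems = new ArrayList<>();
    List<Range> usable = new ArrayList<>();
    for (Range r : regions) {
      if (!r.end.isEmpty() && r.start.equals(r.end)) {
        problems.add("degenerate " + r);   // endkey == startkey
      } else if (!r.end.isEmpty() && r.start.compareTo(r.end) > 0) {
        problems.add("backwards " + r);    // startkey sorts after endkey
      } else {
        usable.add(r);
      }
    }
    usable.sort(Comparator.comparing((Range r) -> r.start));

    String covered = "";          // every key below this is covered by some region
    boolean coveredToEnd = false; // true once a region with an empty end key was seen
    for (Range r : usable) {
      if (coveredToEnd || r.start.compareTo(covered) < 0) {
        problems.add("overlap at " + r);
      } else if (r.start.compareTo(covered) > 0) {
        problems.add("hole [" + covered + ", " + r.start + ")");
      }
      if (r.end.isEmpty()) {
        coveredToEnd = true;
      } else if (r.end.compareTo(covered) > 0) {
        covered = r.end;
      }
    }
    if (!coveredToEnd) {
      problems.add("hole [" + covered + ", <end of table>)");
    }
    return problems;
  }
}
```

For example, the regions `["", "b")` and `["c", "")` leave the keys from `b` up to `c` unassigned, so the check reports a single hole; the repair strategy in the comment would fabricate a region for that gap.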

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
