[
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118468#comment-13118468
]
[email protected] commented on HBASE-4377:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/#review2231
-----------------------------------------------------------
How long did it take to scan the cluster with 2700 inconsistencies ?
I see certain places in the code where more parallelism can be achieved if
practical use of this feature takes long time.
src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
<https://reviews.apache.org/r/2126/#comment5180>
Better replace root with -ROOT-
src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
<https://reviews.apache.org/r/2126/#comment5179>
b is not needed here, same with question mark.
src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
<https://reviews.apache.org/r/2126/#comment5181>
Please remove b and question mark.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5185>
It would be nice to log the path for the underlying region.
Otherwise what purpose does this catch/rethrow serve ?
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5188>
The .META. region is open upon return.
I think we should document this.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5187>
Looking at the usage below, maybe createNewRootAndMeta would be a better
name.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5189>
This log doesn't match the check above.
If we only produce Put for the first HbckInfo, we'd better declare that in
the log.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5190>
This would produce exception if his.size() == 0.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5192>
Better use boolean for return value to indicate success/failure.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5191>
Do you plan to do this in the next patch or in another JIRA ?
I haven't looked at the other JIRAs you mentioned, pardon me.
src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/2126/#comment5193>
Is there something we can do in case we get IOE from this call ?
- Ted
On 2011-09-30 00:02:16, jmhsieh wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2126/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-09-30 00:02:16)
bq.
bq.
bq. Review request for hbase, Michael Stack and Andrew Purtell.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
bq. Author: Jonathan Hsieh <[email protected]>
bq. Date: Wed Sep 28 10:18:11 2011 -0700
bq.
bq. HBASE-4377 [hbck] Offline rebuild .META. from fs data only
bq.
bq.
bq. This patch rebuilds a new .META. table by reading all the .regioninfo
files in the hbase main directory. It depends on the yet to be committed
HBASE-4515 (either my verison or Gary's version), HBASE-4509, and HBASE-4506.
bq.
bq. Some follow on work includes backporting to 0.90, auto-patching true
holes, and adding documentation.
bq.
bq.
bq. This addresses bug HBASE-4377.
bq. https://issues.apache.org/jira/browse/HBASE-4377
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
b9c850d
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 8465724
bq. src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java fae0881
bq. src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
PRE-CREATION
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuild.java
PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/2126/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. An earlier version of this code (backported to 0.90) was used to diagnose
and repair a cluster that had 2700 inconsistencies due to failed splits (the
cluster was underprovisioned memory-wise, and on restart, the some regions
would start splitting and then die due to oome's). This was not actually used
on a live cluster -- it was used to reconstruct a .META. from .regioninfo's
laid out in hbase's directory structure.
bq.
bq. Note also that this is not an automatic fix -- whenever any problems are
found, this bails out but dumps info on holes, suggests some fixes, and
displays sets of overlapping regions. It is up to the user to merge regions,
to create .regioninfo files to plug hole, and to do any potential data loosing
operations.
bq.
bq. The tests demonstrate current expected behavior -- rebuild meta if things
line up, and fail without making modifications if holes or overlaps exist.
bq.
bq.
bq. Thanks,
bq.
bq. jmhsieh
bq.
bq.
> [hbck] Offline rebuild .META. from fs data only.
> ------------------------------------------------
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
> Issue Type: New Feature
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Attachments:
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch
>
>
> In a worst case situation, it may be helpful to have an offline .META.
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta
> from scratch. Users could move bad regions out until there is a clean
> rebuild.
> It would likely fill in region split holes. Follow on work could given
> options to merge or select regions that overlap, or do online rebuilds.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira