[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151743#comment-13151743 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

Todd took a quick look and mentioned that fs.defaultFS is a Hadoop 0.21+'ism. On a 0.20.x release, setting it has no real effect. Any concerns about this on the 0.90 backport?

{code}
+  public static void main(String[] args) throws Exception {
+
+    // create a fsck object
+    Configuration conf = HBaseConfiguration.create();
+    conf.set("fs.defaultFS", conf.get(HConstants.HBASE_DIR));
+    HBaseFsck fsck = new HBaseFsck(conf);
+
{code}

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
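For the fs.defaultFS question above, one possible workaround is to set both the older key, fs.default.name (the name Hadoop 0.20.x reads), and the newer fs.defaultFS. The sketch below is only an illustration under assumptions: it uses a plain Map as a stand-in for Hadoop's Configuration so that it runs without Hadoop on the classpath, the class and method names (FsKeyCompat, pointDefaultFsAtRootDir) are made up, and it is not necessarily what the committed 0.90 backport does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a Map stands in for org.apache.hadoop.conf.Configuration.
class FsKeyCompat {
    static final String HBASE_DIR = "hbase.rootdir";      // value of HConstants.HBASE_DIR
    static final String FS_KEY_OLD = "fs.default.name";   // read by Hadoop 0.20.x
    static final String FS_KEY_NEW = "fs.defaultFS";      // read by Hadoop 0.21+

    // Point the default filesystem at the HBase root dir under both key names,
    // so the setting takes effect regardless of which Hadoop line is in use.
    static void pointDefaultFsAtRootDir(Map<String, String> conf) {
        String rootDir = conf.get(HBASE_DIR);
        if (rootDir == null) {
            throw new IllegalStateException(HBASE_DIR + " is not set");
        }
        conf.put(FS_KEY_OLD, rootDir);
        conf.put(FS_KEY_NEW, rootDir);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(HBASE_DIR, "hdfs://namenode:8020/hbase"); // example value only
        pointDefaultFsAtRootDir(conf);
        System.out.println(FS_KEY_OLD + " = " + conf.get(FS_KEY_OLD));
        System.out.println(FS_KEY_NEW + " = " + conf.get(FS_KEY_NEW));
    }
}
```

With a real Configuration object the two put calls would become conf.set calls; setting a key that a given Hadoop version does not read should be harmless.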
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145192#comment-13145192 ]

mingjian commented on HBASE-4377:
---------------------------------

@Jonathan If a region is in the middle of splitting, how do we fix it without the parent and daughters being online?
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145197#comment-13145197 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

@mingjian If there was a split that didn't complete cleanly, a parent region with daughters should look like an overlap. The tool will tell you where these overlaps are.

One way to fix the problem is to keep the parent region and then move or remove the daughter regions from hdfs. Since it is in the middle of a split, the parent should have all the data. Alternatively, you could copy the store files from the daughters into the dir of the parent and then run the offline rebuilder.

I plan on writing a blog post and hopefully adding to the book on how to fix these problems.
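To illustrate why a parent region with leftover daughters shows up as an overlap, here is a minimal sketch of overlap detection over region key ranges. It is not the HBaseFsck code: the class and method names are hypothetical, keys are plain Strings rather than byte arrays, and an empty end key is assumed to mean "to the end of the table".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not taken from HBaseFsck.
class RegionRangeCheck {
    /** Half-open key range [start, end); an empty end key means "end of table". */
    static final class Range {
        final String name, start, end;
        Range(String name, String start, String end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Reports each region that starts inside the widest range seen before it. */
    static List<String> findOverlaps(List<Range> regions) {
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));
        List<String> problems = new ArrayList<>();
        Range widest = null; // the range whose end key reaches furthest so far
        for (Range r : sorted) {
            if (widest != null && isBefore(r.start, widest.end)) {
                problems.add(widest.name + " overlaps " + r.name);
            }
            if (widest == null || reachesFurther(r.end, widest.end)) {
                widest = r;
            }
        }
        return problems;
    }

    // True if key sorts strictly before the end key; an empty end key is unbounded.
    static boolean isBefore(String key, String end) {
        return end.isEmpty() || key.compareTo(end) < 0;
    }

    // True if end key a extends beyond end key b.
    static boolean reachesFurther(String a, String b) {
        if (b.isEmpty()) {
            return false;
        }
        return a.isEmpty() || a.compareTo(b) > 0;
    }

    public static void main(String[] args) {
        List<Range> regions = new ArrayList<>();
        regions.add(new Range("parent", "a", "m"));
        regions.add(new Range("daughterA", "a", "g"));
        regions.add(new Range("daughterB", "g", "m"));
        regions.add(new Range("next", "m", ""));
        for (String p : findOverlaps(regions)) {
            System.out.println(p);
        }
    }
}
```

In this toy layout both daughters fall inside the parent's range, so each is reported; removing either the parent or the two daughters leaves a clean chain.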
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142035#comment-13142035 ]

Ted Yu commented on HBASE-4377:
-------------------------------

Integrated to 0.90, 0.92 and TRUNK. Thanks for the patch Jonathan.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142071#comment-13142071 ]

Hudson commented on HBASE-4377:
-------------------------------

Integrated in HBase-0.92 #98 (See [https://builds.apache.org/job/HBase-0.92/98/])
    HBASE-4377 [hbck] Offline rebuild .META. from fs data only (Jonathan Hsieh)
    HBASE-4377 [hbck] Offline rebuild .META. from fs data only (Jonathan Hsieh) (detail)

tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt

tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140533#comment-13140533 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

Addressed most of stack's comments:
* Removed try-catch from deleteTable.
* Fixed comment-related issues.
* Renamed splits in populateTable to values (splits is for region splits, the latter is for creating values).
* Have a separate patch for filling in holes.
* Removed setTableName and added internal check code to getTableName().
* Refactored the sidelining function to check rename returns.

I'm going to punt on these two:
* HRegion creation was done manually because the version that existed attempted to open stores and I didn't want or need that.
* MetaReader was not used because at the time I was trying to figure out the different table existence semantics in 0.90 vs trunk.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140546#comment-13140546 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/
-----------------------------------------------------------

(Updated 2011-10-31 20:55:03.327832)

Review request for hbase, Michael Stack and Andrew Purtell.

Changes
-------

Addressed Stack's comments

Summary
-------

commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh j...@cloudera.com
Date: Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only

This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet to be committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.

Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.

This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377

Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HRegionInfo.java ae068c7
  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 46ca765
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 9e9e07b
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java ca6dd4b
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION

Diff: https://reviews.apache.org/r/2126/diff

Testing
-------

An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs).

This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo's laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.

Thanks,

jmhsieh
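The "rebuild only if things line up, otherwise bail out without modifying anything" behavior described under Testing can be sketched roughly as follows. This is an illustration under assumptions, not the OfflineMetaRepair code: all names are hypothetical, keys are Strings rather than byte arrays, an empty key marks the start or end of the table, and only holes are checked (overlap detection is left out for brevity).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not taken from OfflineMetaRepair.
class OfflineRebuildSketch {
    /** Half-open key range [start, end); empty start/end mean table start/end. */
    static final class Region {
        final String start, end;
        Region(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Describes every uncovered key range; an empty list means full coverage. */
    static List<String> findHoles(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.start));
        List<String> holes = new ArrayList<>();
        String covered = "";         // every key before this one is covered
        boolean reachedEnd = false;  // true once some region runs to end of table
        for (Region r : sorted) {
            if (!reachedEnd && r.start.compareTo(covered) > 0) {
                holes.add("[" + covered + ", " + r.start + ")");
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (!reachedEnd && r.end.compareTo(covered) > 0) {
                covered = r.end;
            }
        }
        if (!reachedEnd) {
            holes.add("[" + covered + ", <end of table>)");
        }
        return holes;
    }

    /** Builds one row per region, or throws before producing anything if holes exist. */
    static List<String> rebuildMetaRows(List<Region> regions) {
        List<String> holes = findHoles(regions);
        if (!holes.isEmpty()) {
            throw new IllegalStateException("holes found, nothing written: " + holes);
        }
        List<String> rows = new ArrayList<>();
        for (Region r : regions) {
            rows.add(r.start + "," + r.end);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Region> holey = new ArrayList<>();
        holey.add(new Region("", "g"));
        holey.add(new Region("m", ""));
        System.out.println("holes: " + findHoles(holey));
    }
}
```

The point of checking before building any rows is that a failed run leaves nothing half-written, which matches the all-or-nothing behavior the tests are said to demonstrate.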
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140554#comment-13140554 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2287/
-----------------------------------------------------------

(Updated 2011-10-31 21:03:52.791775)

Review request for hbase and Ted Yu.

Changes
-------

Updated to address Stack's comments. I believe Seb's patch wasn't necessary in 0.90 since that code came in on HBASE-451, which isn't on the 0.90 branch.

Summary
-------

Backport to 0.90

commit 89862b73c6358e27220b87b0362599d86ab0fe4a
Author: Jonathan Hsieh j...@cloudera.com
Date: Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only

This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377

Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3
  src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java e0bd77e
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java a981f72
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java bd3b2f3
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION

Diff: https://reviews.apache.org/r/2287/diff

Testing
-------

Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on an empty meta; the trunk branch looks at hdfs dirs and returns 1.)

This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions.

Thanks,

jmhsieh
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140560#comment-13140560 ]

Hadoop QA commented on HBASE-4377:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12501666/hbase-4377.trunk.v6.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 13 new or modified tests.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/113//console

This message is automatically generated.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140661#comment-13140661 ]

Hadoop QA commented on HBASE-4377:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12501672/hbase-4377.trunk.v6.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 13 new or modified tests.

    -1 javadoc. The javadoc tool appears to have generated -166 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.master.TestDistributedLogSplitting
        org.apache.hadoop.hbase.master.TestMasterFailover

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/114//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/114//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/114//console

This message is automatically generated.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140671#comment-13140671 ]

Ted Yu commented on HBASE-4377:
-------------------------------

The failed tests were due to 'Too many open files'.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139318#comment-13139318 ]

Ted Yu commented on HBASE-4377:
-------------------------------

The code in HRegion which creates the region directory is scattered across createHRegion(), etc. To reuse the code, we need to separate it into its own method.

MetaReader.fullScanMetaAndPrint() doesn't return the number of rows in .META. We can enhance it by returning the count.

For fs.rename(), we end up calling DFSClient, where the return value's javadoc says:
{code}
   * @return true if successful, or false if the old name does not exist
   * or if the new name already belongs to the namespace.
{code}
We should add a check on the return value from fs.rename(). But it seems catching IOException is a useful defense against unexpected situations as well.
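The two defenses discussed for fs.rename(), checking the boolean return value and catching IOException, can be combined as in the sketch below. Hadoop's FileSystem.rename(Path, Path) returns a boolean and declares IOException; to keep the example runnable without Hadoop, a small hypothetical Renamer interface with the same shape stands in for it, and the sideline method name is likewise only illustrative.

```java
import java.io.IOException;

// Hypothetical sketch; Renamer mimics the shape of FileSystem.rename(Path, Path).
class RenameCheck {
    interface Renamer {
        boolean rename(String src, String dst) throws IOException;
    }

    /** Returns true only if the rename reported success; otherwise logs and returns false. */
    static boolean sideline(Renamer fs, String src, String dst) {
        try {
            if (!fs.rename(src, dst)) {
                // A false return can mean src is missing or dst already exists.
                System.err.println("rename returned false: " + src + " -> " + dst);
                return false;
            }
            return true;
        } catch (IOException e) {
            // Defensive: treat an unexpected I/O failure the same as a failed rename.
            System.err.println("rename threw: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        Renamer alwaysFails = (src, dst) -> false;
        System.out.println(sideline(alwaysFails, "/hbase/t1/r1", "/hbase/.sideline/r1"));
    }
}
```

Returning a single boolean lets the caller stop and report rather than continue after a silently failed move.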
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139332#comment-13139332 ]

stack commented on HBASE-4377:
--

bq. To reuse the code, we need to separate it into its own method.

That would be a good thing I'd think.

bq. MetaReader.fullScanMetaAndPrint() doesn't return the number of rows in .META.

Seems like a minor addition -- otherwise, it does what the method in here does.

Yes, catch IOE and check the rename return value I'd say. (Not too long ago, a patch was added by an hdfs-er to check all boolean returns out of fs operations... we should try to keep up the pattern.)

Good stuff.

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139381#comment-13139381 ] Hadoop QA commented on HBASE-4377: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12501458/hbase-4377.trunk.v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -166 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.master.TestMasterFailover Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/98//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/98//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/98//console This message is automatically generated. [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139386#comment-13139386 ] Jonathan Hsieh commented on HBASE-4377: --- @Stack I have a patch written that optionally handles filling in holes, but haven't polished it for review yet. I'll add it after this patch gets through. IIRC it adds this functionality to hbck and to the offline meta rebuilder. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139478#comment-13139478 ]

stack commented on HBASE-4377:
--

@Jon Sounds good. Do we want to make a v6 of this patch to address the minor comments above, or do we want to commit this and do them in a different issue? (The test failures in the patch build are not because of v5.)

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139482#comment-13139482 ] Ted Yu commented on HBASE-4377: --- The patch for 0.90 doesn't cleanly compile yet. We need to produce a clean patch for 0.90 and run test suite for it so that 0.90 can have this feature. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139500#comment-13139500 ] Jonathan Hsieh commented on HBASE-4377: --- i'll do an update tomorrow or monday to address the nits and get the 0.90 version caught up again. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139053#comment-13139053 ] Ted Yu commented on HBASE-4377: --- hbase-4377.trunk.v5.txt didn't produce regression. The following isn't new: {code} Tests in error: testRegionTransitionOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver): 9faf6fe48f36206644b7fd913cf7e229 {code} [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139109#comment-13139109 ] Hadoop QA commented on HBASE-4377: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12501403/hbase-4377.trunk.v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -166 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestMasterObserver org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.master.TestDefaultLoadBalancer org.apache.hadoop.hbase.TestRegionRebalancing org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/94//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/94//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/94//console This message is automatically generated. [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139121#comment-13139121 ]

stack commented on HBASE-4377:
--

We want this?
{code}
+} catch (Exception e) {
+  // Do nothing.
+}
{code}
Why not let the test fail if we can't delete the table? I see we do this deleteTable in a few places w/ the above ignore of IOE (the deleteTable method is the same in two places).

Nit: Below is a little wonky. No biggie.
{code}
+HTableDescriptor[] htbls = // new HBaseAdmin(conf).listTables();
+TEST_UTIL.getHBaseAdmin().listTables();
{code}

Nit: No biggie. Rewrite below if a new patch is made:
{code}
+ * This testing base class provides create a minicluster and testing table table
+ * and shutsdown the cluster afterwards. It provides methods to wipes out meta,
+ * inject errors into meta and the file system.
{code}

This define is repeated:
{code}
+  protected final static byte[][] splits = new byte[][] { Bytes.toBytes("A"),
+      Bytes.toBytes("B"), Bytes.toBytes("C") };
{code}
Then there is:
+byte[] splits = { 'A', 'B', 'C', 'D' };

Should we be looking for the missing region in the filesystem if we find a hole in meta before we go ahead and create a region to plug the hole? Do we do that?

FYI, for future, I think there is a utility in HRegion to do the below (with defines for .regioninfo) and for writing it. (Maybe there is a reason you did the below manually?)
{code}
+    Path p = new Path(rootDir + "/" + htd.getNameAsString(),
+        hri.getEncodedName());
+    fs.mkdirs(p);
+    Path riPath = new Path(p, ".regioninfo");
{code}

FYI, there is a utility in MetaReader to do this:
{code}
+  protected int scanMeta() throws IOException {
+    int count = 0;
+    HTable meta = new HTable(conf, HTableDescriptor.META_TABLEDESC.getName());
+    ResultScanner scanner = meta.getScanner(new Scan());
+    LOG.info("Table: " + Bytes.toString(meta.getTableName()));
+    for (Result res : scanner) {
+      LOG.info(Bytes.toString(res.getRow()));
+      count++;
+    }
+    return count;
+  }
{code}

Do we need setTableName? Should the below be moved into HRI?
{code}
+      if (getTableName() == null || getTableName().length == 0) {
+        byte [] newTableName = HRegionInfo.getTableName(this.getRegionName());
+        LOG.debug(Bytes.toString(newTableName)+": .regioninfo doesn't have tableName value, but we are getting it from regionName :)");
+        this.setTableName(newTableName);
+      }
{code}

This will be ok? We'll have perms to go here? If we don't, we will just fail, which should be fine.
{code}
+    Path backupDir = new Path(rootDir.getParent(), rootDir.getName() + "-"
+        + now);
{code}

Next time, you could have made a method out of this and used it for meta and root, passing in 'root' or 'meta' and backupRoot -- it's repeated code:
{code}
+    if (fs.exists(root)) {
+      fs.rename(root, backupRoot);
+    } else {
+      LOG.info("No previous -ROOT- exists. Continuing.");
+    }
{code}
Should you test the result of fs.rename? It returns a boolean: true if it succeeds and false if not.

That's enough for now.

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
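The review comment above about folding the repeated -ROOT- and .META. backup code into one method could look roughly like the sketch below. It is not taken from any attached patch: java.io.File stands in for Hadoop's FileSystem and Path, and the names BackupHelper and backupIfExists are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class BackupHelper {
  /**
   * Moves dir to backup if dir exists. The label ("-ROOT-" or ".META.")
   * is only used in messages. Returns true if a backup was made, false
   * if there was nothing to back up, and throws if the rename reports
   * failure. java.io.File is an assumption standing in for the HDFS API.
   */
  public static boolean backupIfExists(File dir, File backup, String label)
      throws IOException {
    if (!dir.exists()) {
      System.out.println("No previous " + label + " exists. Continuing.");
      return false;
    }
    if (!dir.renameTo(backup)) {
      throw new IOException(
          "Failed to move " + label + " from " + dir + " to " + backup);
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    File base = Files.createTempDirectory("hbase-rootdir-demo").toFile();
    File root = new File(base, "-ROOT-");
    File meta = new File(base, ".META.");
    root.mkdir(); // only -ROOT- exists in this demo

    long now = System.currentTimeMillis();
    // The same helper serves both directories; only the arguments differ.
    backupIfExists(root, new File(base, "-ROOT--" + now), "-ROOT-");
    backupIfExists(meta, new File(base, ".META.-" + now), ".META.");
  }
}
```

Passing the label in keeps the log message accurate for both directories, and the single rename check covers the question raised at the end of the review.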
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135758#comment-13135758 ] Hadoop QA commented on HBASE-4377: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12500832/EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/69//console This message is automatically generated. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. 
Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135222#comment-13135222 ]

Ted Yu commented on HBASE-4377:
---

@Sebastian: 0.92 is close to release. I got the following:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure:
[ERROR] /Users/zhihyu/92hbase/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[277,21] doFsck(org.apache.hadoop.conf.Configuration,boolean) in org.apache.hadoop.hbase.util.hbck.HbckTestingUtil cannot be applied to (boolean)
[ERROR]
[ERROR] /Users/zhihyu/92hbase/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[286,23] doFsck(org.apache.hadoop.conf.Configuration,boolean) in org.apache.hadoop.hbase.util.hbck.HbckTestingUtil cannot be applied to (boolean)
{code}
Do you mind refreshing the patch for 0.92? Thanks

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135324#comment-13135324 ] Jonathan Hsieh commented on HBASE-4377: --- Seb glad to hear that this basically worked for you. Would it make sense to add Seb's change as a separate jira after the original patch gets committed? IMO, it feels like it needs a test case as well. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135459#comment-13135459 ] Ted Yu commented on HBASE-4377: --- Sebastian's latest patch applies to 0.92 and hbck related tests passed. I think we should include his enhancement. Test for his scenario can be added later. @Jonathan: What's your opinion ? [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135478#comment-13135478 ]

Jonathan Hsieh commented on HBASE-4377:
---

@Ted I'm basically ok with it.

@Seb can you post some of the bad .regioninfo files? I'm curious about what you did to need to use a full rebuild!

[hbck] Offline rebuild .META. from fs data only.

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135486#comment-13135486 ]

Hadoop QA commented on HBASE-4377:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12500737/0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 18 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -167 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.coprocessor.TestMasterObserver
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/66//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/66//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/66//console

This message is automatically generated.

[hbck] Offline rebuild .META. from fs data only.
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135494#comment-13135494 ]

Ted Yu commented on HBASE-4377:
---

Some tests failed due to:
{code}
Caused by: java.io.IOException: Too many open files
	at sun.nio.ch.IOUtil.initPipe(Native Method)
	at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49)
	at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
{code}
One test failure is tracked by HBASE-4675

[hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131286#comment-13131286 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 302 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line302 bq. bq. Naming rd as rootdir would make the code more readable. done bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 446 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line446 bq. bq. I think LOG.info() should be used here. I think it is still a problem, but we are in an ok state. I've changed it to 'warn' instead. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 276 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line276 bq. bq. Minor suggestion: IOException may occur more than once. Would logging all such IOException's before bailing out make user experience better ? bq. Basically we just need to track the last such IOException in a variable and bail out at line 283 if the variable isn't null. Updated to track all IOE's and throw MultipleIOException. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 346 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line346 bq. bq. I think rebuildMeta() should check the return value from generatePuts(). bq. Otherwise we would encounter NPE at line 405 below. see below bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 407 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line407 bq. bq. false should be returned if puts is null. 
So I believe that checkHdfs, loadTableInfo, and the error checking all happen before this point, and that we bail out after suggestFixes(). But sure, it doesn't really hurt to be even more defensive here. bq. On 2011-10-07 21:01:16, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 378 bq. https://reviews.apache.org/r/2287/diff/1/?file=48780#file48780line378 bq. bq. Do you plan to add this logic in another JIRA ? I have a patch that adds this but it is having problems on the trunk side. I'd like to get this in first and then we'll deal with that next. New issue filed HBASE-4632. - jmhsieh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/#review2440 --- On 2011-10-07 19:04:44, jmhsieh wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2287/ bq. --- bq. bq. (Updated 2011-10-07 19:04:44) bq. bq. bq. Review request for hbase and Ted Yu. bq. bq. bq. Summary bq. --- bq. bq. Backport to 0.90 bq. bq. commit 89862b73c6358e27220b87b0362599d86ab0fe4a bq. Author: Jonathan Hsieh j...@cloudera.com bq. Date: Wed Sep 28 10:18:11 2011 -0700 bq. bq. HBASE-4377 [hbck] Offline rebuild .META. from fs data only bq. bq. bq. bq. This addresses bug HBASE-4377. bq. https://issues.apache.org/jira/browse/HBASE-4377 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 bq.src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 bq.src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 bq.src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION bq. 
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/2287/diff bq. bq. bq. Testing bq. --- bq. bq. Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). bq. bq. This version
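The review exchange above settles on tracking every IOException and throwing a single aggregate (the reply mentions MultipleIOException), and on returning false when no puts were generated. A minimal sketch of that collect-then-fail pattern follows. It is an illustration only, not the code from the patch: AggregateIOException is a hypothetical stand-in so the example runs without Hadoop on the classpath, and loadRegionInfo and loadAll are made-up names rather than HBaseFsck methods.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch only: collect every IOException from a scan, then fail once at the end. */
public class CollectIoErrors {

    /** Hypothetical stand-in for an aggregate exception type. */
    static class AggregateIOException extends IOException {
        private final List<IOException> causes;

        AggregateIOException(List<IOException> causes) {
            super(causes.size() + " I/O error(s), first: " + causes.get(0).getMessage());
            this.causes = Collections.unmodifiableList(new ArrayList<>(causes));
        }

        List<IOException> getCauses() {
            return causes;
        }
    }

    /** Hypothetical per-directory loader; fails for names that start with "bad". */
    static String loadRegionInfo(String regionDir) throws IOException {
        if (regionDir.startsWith("bad")) {
            throw new IOException("cannot read .regioninfo in " + regionDir);
        }
        return "regioninfo:" + regionDir;
    }

    /** Loads every directory, remembering each failure instead of stopping at the first. */
    static List<String> loadAll(List<String> regionDirs) throws IOException {
        List<String> loaded = new ArrayList<>();
        List<IOException> errors = new ArrayList<>();
        for (String dir : regionDirs) {
            try {
                loaded.add(loadRegionInfo(dir));
            } catch (IOException e) {
                errors.add(e); // keep going so the user sees every problem
            }
        }
        if (!errors.isEmpty()) {
            throw new AggregateIOException(errors); // bail out once, with all causes
        }
        return loaded;
    }

    public static void main(String[] args) {
        try {
            loadAll(Arrays.asList("r1", "bad2", "r3", "bad4"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Reporting every failure in one pass spares the user from fixing one unreadable .regioninfo, rerunning, and only then discovering the next one.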
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131301#comment-13131301 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/ --- (Updated 2011-10-20 03:21:33.922683) Review request for hbase and Ted Yu. Changes --- Addressed comments * added more logging and better error message * Handled exit properly. Summary --- Backport to 0.90 commit 89862b73c6358e27220b87b0362599d86ab0fe4a Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2287/diff Testing --- Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. 
I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131302#comment-13131302 ] Jonathan Hsieh commented on HBASE-4377: --- 0.90 version requires HBASE-4508 [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131303#comment-13131303 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/ --- (Updated 2011-10-20 03:22:39.708313) Review request for hbase, Michael Stack and Andrew Purtell. Changes --- Ported updates from comments from 0.90 branch to trunk/0.92 branch. Summary --- commit fbf82c17be6b3ecca5a981f5270cf93aac26e479 Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet-to-be-committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506. Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation. This addresses bug HBASE-4377. 
https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 7409c9c src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2126/diff Testing --- An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart some regions would start splitting and then die from OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo files laid out in HBase's directory structure. Note also that this is not an automatic fix -- whenever any problems are found, it bails out, but it dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
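The behavior described in the review request above (rebuild meta only if the regions line up, otherwise report holes and overlaps and change nothing) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the HBaseFsck implementation: Region and findProblems are hypothetical names, keys are plain Strings compared lexicographically (HBase itself compares byte arrays), and an empty string is assumed to mark the open start or end of the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Sketch only: verify that region key ranges cover the key space without holes or overlaps. */
public class RegionChainCheck {

    /** Hypothetical, simplified region covering [startKey, endKey); "" marks an open boundary. */
    static class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns one message per hole or overlap; an empty list means the regions line up. */
    static List<String> findProblems(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.startKey));

        List<String> problems = new ArrayList<>();
        String covered = "";          // keys below this point are already covered
        boolean coveredToEnd = false; // true once a region with an open end key was seen

        for (Region r : sorted) {
            if (coveredToEnd || r.startKey.compareTo(covered) < 0) {
                problems.add("overlap: region starting at '" + r.startKey + "'");
            } else if (r.startKey.compareTo(covered) > 0) {
                problems.add("hole: ['" + covered + "', '" + r.startKey + "')");
            }
            if (r.endKey.isEmpty()) {
                coveredToEnd = true;
            } else if (!coveredToEnd && r.endKey.compareTo(covered) > 0) {
                covered = r.endKey;
            }
        }
        if (!coveredToEnd) {
            problems.add("hole: ['" + covered + "', end of table)");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(new Region("", "b"), new Region("c", ""));
        List<String> problems = findProblems(regions);
        if (problems.isEmpty()) {
            System.out.println("regions line up; safe to rebuild");
        } else {
            // Mirror the described behavior: report the problems and modify nothing.
            for (String p : problems) {
                System.out.println(p);
            }
        }
    }
}
```

Sorting by start key first means a single pass is enough: each region either continues exactly where coverage stopped, starts later (a hole), or starts earlier (an overlap).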
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131330#comment-13131330 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/ --- (Updated 2011-10-20 03:46:53.527042) Review request for hbase and Ted Yu. Changes --- New version for 0.90 that does not require HBASE-3777 / HBASE-4508. Summary --- Backport to 0.90 commit 89862b73c6358e27220b87b0362599d86ab0fe4a Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2287/diff Testing --- Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. 
I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123059#comment-13123059 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/ --- (Updated 2011-10-07 18:46:54.806909) Review request for hbase, Michael Stack and Andrew Purtell. Changes --- Updates with nits and separated tests into different classes so that we can rely on new jvms to avoid OO file handle errors intermittently encountered when shutting down and restarting mini clusters. Summary --- commit fbf82c17be6b3ecca5a981f5270cf93aac26e479 Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet to be committed HBASE-4515 (either my verison or Gary's version), HBASE-4509, and HBASE-4506. Some follow on work includes backporting to 0.90, auto-patching true holes, and adding documentation. This addresses bug HBASE-4377. 
https://issues.apache.org/jira/browse/HBASE-4377 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 154ac32 src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2126/diff Testing --- An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart some regions would start splitting and then die from OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo files laid out in HBase's directory structure. Note also that this is not an automatic fix -- whenever any problems are found, it bails out, but it dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123058#comment-13123058 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/ --- (Updated 2011-10-07 18:47:01.208741) Review request for hbase, Michael Stack and Andrew Purtell. Summary --- commit fbf82c17be6b3ecca5a981f5270cf93aac26e479 Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet to be committed HBASE-4515 (either my verison or Gary's version), HBASE-4509, and HBASE-4506. Some follow on work includes backporting to 0.90, auto-patching true holes, and adding documentation. This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 154ac32 src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2126/diff Testing --- An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster 
was underprovisioned memory-wise, and on restart some regions would start splitting and then die from OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo files laid out in HBase's directory structure. Note also that this is not an automatic fix -- whenever any problems are found, it bails out, but it dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist. Thanks, jmhsieh [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123080#comment-13123080 ] jirapos...@reviews.apache.org commented on HBASE-4377: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/ --- Review request for hbase and Ted Yu. Summary --- Backport to 0.90 commit 89862b73c6358e27220b87b0362599d86ab0fe4a Author: Jonathan Hsieh j...@cloudera.com Date: Wed Sep 28 10:18:11 2011 -0700 HBASE-4377 [hbck] Offline rebuild .META. from fs data only This addresses bug HBASE-4377. https://issues.apache.org/jira/browse/HBASE-4377 Diffs - src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java PRE-CREATION Diff: https://reviews.apache.org/r/2287/diff Testing --- Note, the assertion test result is different in the failure cases due to HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty meta, trunk branch looks at hdfs dirs, and returns 1). This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is applied. I believe if that patch is not applied, I could modify the test code to force some explicit HConnection deletions. Thanks, jmhsieh [hbck] Offline rebuild .META. 
from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123197#comment-13123197 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2287/#review2440

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5546
    Minor suggestion: an IOException may occur more than once. Would logging all such IOExceptions before bailing out make the user experience better? Basically we just need to track the last such IOException in a variable and bail out at line 283 if the variable isn't null.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5545
    Naming rd as rootdir would make the code more readable.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5548
    I think rebuildMeta() should check the return value from generatePuts(). Otherwise we would encounter an NPE at line 405 below.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5549
    Do you plan to add this logic in another JIRA?

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5550
    false should be returned if puts is null.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5552
    I think LOG.info() should be used here.

- Ted

On 2011-10-07 19:04:44, jmhsieh wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2287/
bq.
bq. (Updated 2011-10-07 19:04:44)
bq.
bq. Review request for hbase and Ted Yu.
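Ted's first suggestion above (log every IOException, remember the last one, and bail out only after the loop) can be sketched roughly as follows. This is a minimal illustration, not the HBaseFsck code: the class name, `loadRegionInfo`, and the path strings are hypothetical stand-ins for whatever the patch does per region directory.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-region loading loop; not HBaseFsck code. */
final class DeferredIoFailure {

  /** Hypothetical loader: fails for any path containing "bad". */
  static String loadRegionInfo(String path) throws IOException {
    if (path.contains("bad")) {
      throw new IOException("cannot read " + path + "/.regioninfo");
    }
    return "regioninfo:" + path;
  }

  /**
   * Tries every path, records each failure in the log as it happens, and only
   * bails out after the loop by rethrowing the last IOException seen.
   */
  static List<String> loadAll(List<String> paths, List<String> log) throws IOException {
    List<String> loaded = new ArrayList<>();
    IOException last = null;
    for (String p : paths) {
      try {
        loaded.add(loadRegionInfo(p));
      } catch (IOException ioe) {
        log.add("Failed to load " + p + ": " + ioe.getMessage());
        last = ioe; // remember it, but keep scanning so every problem is reported
      }
    }
    if (last != null) {
      throw last; // bail out once, after all failures have been logged
    }
    return loaded;
  }
}
```

The user then sees every unreadable region in one run instead of fixing them one failure at a time; the trade-off is that the scan does more work before failing.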
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122344#comment-13122344 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

In the 0.90 branch, after deleting meta and restarting, the number of tables present is 0. In trunk and the 0.92 branch, after deleting meta and restarting, the number of tables present is 1.

This actually does make sense because HBASE-451 changed the behavior of HMaster. In 0.90 (pre-HBASE-451), HConnectionManager.listTables() loads table info on the client side via a meta scan. Post HBASE-451, table data from HConnectionManager.listTables() comes from the file system and is cached by the HMaster, and ignores the meta table.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122354#comment-13122354 ]

Todd Lipcon commented on HBASE-4377:
------------------------------------

bq. Post HBASE-451, table data from HConnectionManager.listTables() comes from the file system and is cached by the HMaster, and ignores the meta table

This seems like a bug - clients should never have to have direct access to HDFS! I filed HBASE-4548.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122367#comment-13122367 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

@Todd, I think there is some confusion. Clients do not directly access HDFS. Let me add more detail.

In trunk post HBASE-451, the HMaster (not the client) reads and caches data from the file system. It then serves the HTableDescriptors to clients: the client RPCs go via HConnectionManager to the HMaster, which just ships the cached HTD data.

HMaster on initialization reads the file system for HTD data.
Client calls listTables() -> HMaster (serves cached data from the file system).

Pre-HBASE-451, the client's HConnectionManager does a meta scan and builds the HTableDescriptors.

Client calls listTables(), which is actually a meta scan that builds the HTDs.
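To make the difference between the two listTables() paths concrete, here is a toy model of why an emptied meta yields 0 tables on one path and 1 on the other. Nothing in it is HBase API: the class, the method names, and the "table,startKey,regionId" row format are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the two listTables() paths described above; not HBase API. */
final class ListTablesModel {

  /** Pre-HBASE-451 style: the client scans meta rows and derives table names. */
  static Set<String> listTablesFromMetaScan(List<String> metaRegionRows) {
    Set<String> tables = new TreeSet<>();
    for (String row : metaRegionRows) {
      // Assumed row format "table,startKey,regionId"; the table name is the first field.
      int comma = row.indexOf(',');
      tables.add(comma < 0 ? row : row.substring(0, comma));
    }
    return tables;
  }

  /** Post-HBASE-451 style: the master caches descriptors read from table dirs at startup. */
  static final class MasterModel {
    private final Set<String> cachedTables = new TreeSet<>();

    MasterModel(List<String> tableDirsOnFs) {
      cachedTables.addAll(tableDirsOnFs);
    }

    /** Serves the cached view; the meta table is never consulted. */
    Set<String> listTables() {
      return new TreeSet<>(cachedTables);
    }
  }
}
```

With one table directory on disk and no meta rows, the meta-scan path reports no tables while the master-cache path reports one, which matches the 0 versus 1 counts seen in the tests.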
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13120561#comment-13120561 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

Although I've gotten this to work with live systems, there seem to be some problems with the testing on the backports. Different versions have different expected values, which does not seem to make sense. HBASE-3777 changed some of the semantics of the HBaseTestingUtility, so I'll be investigating more.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13120570#comment-13120570 ]

Ted Yu commented on HBASE-4377:
-------------------------------

HBASE-4508 would backport HBASE-3777 to 0.90.
We should get consistent behavior from HBaseTestingUtility after HBASE-4508 goes in.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119451#comment-13119451 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. How long did it take to scan the cluster with 2700 inconsistencies ?
bq. I see certain places in the code where more parallelism can be achieved if practical use of this feature takes long time.

The cluster had 12k total regions after cleanup. It took 2m to run (these were local disk accesses). I didn't feel that the runtime was something to be concerned about. And I honestly hope this code doesn't get used too often! We could use the same WorkItem trick to speed up the code, but my feeling is that straightforward and correct is the right first step.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java, line 374
bq. https://reviews.apache.org/r/2126/diff/1/?file=46564#file46564line374
bq.
bq. Better replace root with -ROOT-

done.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java, line 376
bq. https://reviews.apache.org/r/2126/diff/1/?file=46564#file46564line376
bq.
bq. b is not needed here, same with question mark.

k

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java, line 391
bq. https://reviews.apache.org/r/2126/diff/1/?file=46564#file46564line391
bq.
bq. Please remove b and question mark.

k

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 307
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line307
bq.
bq. The .META. region is open upon return.
bq. I think we should document this.

changed live to open

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 288
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line288
bq.
bq. It would be nice to log the path for the underlying region.
bq. Otherwise what purpose does this catch/rethrow serve ?

nice catch. Updated to include table name and path.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 309
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line309
bq.
bq. Looking at the usage below, maybe createNewRootAndMeta would be a better name.

done

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 352
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line352
bq.
bq. This log doesn't match the check above.
bq. If we only produce Put for the first HbckInfo, we'd better declare that in the log.

updated the error message and changed the behavior so that it bails out. In this particular case, the invariant is checked before this method is called, but I'll just make it more explicit.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 356
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line356
bq.
bq. This would produce exception if his.size() == 0.

problem avoided with update.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 378
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line378
bq.
bq. Do you plan to do this in the next patch or in another JIRA ?
bq. I haven't looked at the other JIRAs you mentioned, pardon me.

I'll file it as a follow-on jira.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 428
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line428
bq.
bq. Is there something we can do in case we get IOE from this call ?

added error logging and an attempt to revert.

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 377
bq. https://reviews.apache.org/r/2126/diff/1/?file=46565#file46565line377
bq.
bq. Better use boolean for return value to indicate success/failure.

done.

- jmhsieh

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/#review2231
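The "error logging and an attempt to revert" reply can be pictured as a two-step swap: move the existing directory aside, move the rebuilt one into place, and if the second move fails, try to put the original back. The sketch below is an illustration only. It assumes the operation is a pair of renames and swaps in `java.nio.file` for whatever filesystem API the patch really uses; the class and parameter names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative two-step swap with a best-effort revert; not the HBaseFsck code. */
final class SwapWithRevert {

  /**
   * Moves {@code current} aside to {@code backup}, then moves {@code replacement}
   * into the place of {@code current}. If the second move fails, the error is
   * logged and the backup is moved back before the exception is rethrown.
   */
  static void swap(Path current, Path replacement, Path backup) throws IOException {
    // First rename: if this throws, nothing has changed yet.
    Files.move(current, backup);
    try {
      // Second rename: a failure here would leave no 'current' at all.
      Files.move(replacement, current);
    } catch (IOException ioe) {
      System.err.println("Failed to move " + replacement + " to " + current + ": " + ioe);
      try {
        Files.move(backup, current); // best-effort revert to the original state
      } catch (IOException revertFailure) {
        ioe.addSuppressed(revertFailure); // keep both failures visible to the caller
      }
      throw ioe;
    }
  }
}
```

The revert is only an attempt: if it also fails, the caller gets the original exception with the revert failure attached, and manual cleanup from the backup location is still possible.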
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119460#comment-13119460 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/

(Updated 2011-10-03 18:16:51.000353)

Review request for hbase, Michael Stack and Andrew Purtell.

Changes
-------
Addressed review comments.

Summary
-------
commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh j...@cloudera.com
Date: Wed Sep 28 10:18:11 2011 -0700

    HBASE-4377 [hbck] Offline rebuild .META. from fs data only

This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet-to-be-committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.

Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.

This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377

Diffs (updated)
-----
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 154ac32
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuild.java PRE-CREATION

Diff: https://reviews.apache.org/r/2126/diff

Testing
-------
An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo's laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.

Thanks,
jmhsieh
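The "rebuild meta if things line up, and fail ... if holes or overlaps exist" rule can be illustrated with a small check over one table's regions, sorted by start key. This is a sketch under simplifying assumptions, not the logic in the patch: keys are Strings rather than byte arrays, an empty key marks the open start or end of the keyspace, and the `Region` type and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative hole/overlap check over one table's regions; not the HBaseFsck logic. */
final class RegionChainCheck {

  /** Hypothetical region covering [startKey, endKey); "" means open-ended. */
  static final class Region {
    final String startKey;
    final String endKey;

    Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** Returns a description of every hole or overlap; an empty list means the chain lines up. */
  static List<String> findProblems(List<Region> regions) {
    List<Region> sorted = new ArrayList<>(regions);
    sorted.sort(Comparator.comparing((Region r) -> r.startKey));

    List<String> problems = new ArrayList<>();
    String covered = "";        // keyspace is covered from the start up to this key
    boolean reachedEnd = false; // true once a region with an open end key was seen

    for (Region r : sorted) {
      if (reachedEnd || r.startKey.compareTo(covered) < 0) {
        problems.add("overlap at [" + r.startKey + ", " + r.endKey + ")");
      } else if (r.startKey.compareTo(covered) > 0) {
        problems.add("hole between '" + covered + "' and '" + r.startKey + "'");
      }
      if (r.endKey.isEmpty()) {
        reachedEnd = true;
      } else if (!reachedEnd && r.endKey.compareTo(covered) > 0) {
        covered = r.endKey;
      }
    }
    if (!reachedEnd) {
      problems.add("hole after '" + covered + "'");
    }
    return problems;
  }

  /** Mirrors the stated policy: only rebuild when nothing is wrong. */
  static boolean canRebuild(List<Region> regions) {
    return findProblems(regions).isEmpty();
  }
}
```

In this model the caller would print the problem list and make no modifications unless `canRebuild` returns true, leaving merges and hole-plugging to the operator.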
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119461#comment-13119461 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

bq. On 2011-09-30 21:27:16, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java, line 376
bq. https://reviews.apache.org/r/2126/diff/1/?file=46564#file46564line376
bq.
bq. b is not needed here, same with question mark.
bq.
bq. jmhsieh wrote:
bq.     k

The javadoc form for @param is to list the parameter name, so it should be there. I agree that there should be no question mark. I think the javadoc-y phrasing would be something like "whether to enable in-memory caching or not".

- Jonathan

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/#review2231

On 2011-09-30 00:02:16, jmhsieh wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2126/
bq.
bq. (Updated 2011-09-30 00:02:16)
bq.
bq. Review request for hbase, Michael Stack and Andrew Purtell.
bq.
bq. Summary
bq. -------
bq. commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
bq. Author: Jonathan Hsieh j...@cloudera.com
bq. Date: Wed Sep 28 10:18:11 2011 -0700
bq.
bq.     HBASE-4377 [hbck] Offline rebuild .META. from fs data only
bq.
bq. This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet-to-be-committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.
bq.
bq. Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.
bq.
bq. This addresses bug HBASE-4377.
bq. https://issues.apache.org/jira/browse/HBASE-4377
bq.
bq. Diffs
bq. -----
bq.   src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 8465724
bq.   src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
bq.   src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java fae0881
bq.   src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
bq.   src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuild.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/2126/diff
bq.
bq. Testing
bq. -------
bq. An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo's laid out in hbase's directory structure.
bq.
bq. Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.
bq.
bq. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.
bq.
bq. Thanks,
bq.
bq. jmhsieh
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119466#comment-13119466 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

Since the review for trunk had relatively minor issues, I'm going to work on re-backporting this to the 0.90 branch.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119727#comment-13119727 ]

Jonathan Hsieh commented on HBASE-4377:
---------------------------------------

When backporting to 0.90, the TestOfflineMetaRebuild test case would fail with out-of-file-handles exceptions. I dug for a while and found that the statically cached HConnections are not flushed between tests. Even after avoiding that, there are other resources (maybe pooling on HDFS client or ZK client connections?) that cause the open file handle count to increase significantly after every test case.

To avoid this problem, I'm going to split each of the rebuild tests out into its own test case so that each can be executed in a new process and avoid the out-of-file-handles problem. I'll do this for trunk and for the 0.90 backport.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118468#comment-13118468 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
------------------------------------------------------

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2126/#review2231

How long did it take to scan the cluster with 2700 inconsistencies?
I see certain places in the code where more parallelism can be achieved if practical use of this feature takes a long time.

src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
https://reviews.apache.org/r/2126/#comment5180
    Better replace root with -ROOT-

src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
https://reviews.apache.org/r/2126/#comment5179
    b is not needed here, same with question mark.

src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
https://reviews.apache.org/r/2126/#comment5181
    Please remove b and question mark.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5185
    It would be nice to log the path for the underlying region. Otherwise what purpose does this catch/rethrow serve?

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5188
    The .META. region is open upon return. I think we should document this.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5187
    Looking at the usage below, maybe createNewRootAndMeta would be a better name.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5189
    This log doesn't match the check above. If we only produce Put for the first HbckInfo, we'd better declare that in the log.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5190
    This would produce exception if his.size() == 0.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5192
    Better use boolean for return value to indicate success/failure.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5191
    Do you plan to do this in the next patch or in another JIRA? I haven't looked at the other JIRAs you mentioned, pardon me.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5193
    Is there something we can do in case we get IOE from this call?

- Ted
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118480#comment-13118480 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
---

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/#review2236
---

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2126/#comment5194
    My comment above on the first rename call was inaccurate. An IOE out of the second call would be fatal.

- Ted

On 2011-09-30 00:02:16, jmhsieh wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2126/
bq. ---
bq.
bq. (Updated 2011-09-30 00:02:16)
bq.
bq. Review request for hbase, Michael Stack and Andrew Purtell.
bq.
bq. Summary
bq. ---
bq.
bq. commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
bq. Author: Jonathan Hsieh j...@cloudera.com
bq. Date: Wed Sep 28 10:18:11 2011 -0700
bq.
bq. HBASE-4377 [hbck] Offline rebuild .META. from fs data only
bq.
bq. This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet to be committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.
bq.
bq. Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.
bq.
bq. This addresses bug HBASE-4377.
bq. https://issues.apache.org/jira/browse/HBASE-4377
bq.
bq. Diffs
bq. ---
bq.
bq.   src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 8465724
bq.   src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
bq.   src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java fae0881
bq.   src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
bq.   src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuild.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/2126/diff
bq.
bq. Testing
bq. ---
bq.
bq. An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo's laid out in hbase's directory structure.
bq.
bq. Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.
bq.
bq. The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.
bq.
bq. Thanks,
bq.
bq. jmhsieh
bq.

[hbck] Offline rebuild .META. from fs data only.
---

Key: HBASE-4377
URL: https://issues.apache.org/jira/browse/HBASE-4377
Project: HBase
Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch

In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117095#comment-13117095 ]

Jonathan Hsieh commented on HBASE-4377:
---

I'm having a hard time with tests that restart the test hbase mini cluster. I start the cluster, modify meta/hdfs regions, shut down the cluster, rebuild meta, and then get an NPE when restarting.

Specifically, the method called in the constructor below sometimes returns null, which later causes an NPE:

{code}
User.HadoopUser.<init>
  ugi = (UserGroupInformation) callStatic("getCurrentUGI");
{code}

Tests were passing at one point, but I can't seem to figure out a direct cause for why this would fail. Any hints?
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117641#comment-13117641 ] Jonathan Hsieh commented on HBASE-4377: --- HBASE-4515 is required for tests to pass consistently
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117762#comment-13117762 ]

jirapos...@reviews.apache.org commented on HBASE-4377:
---

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/
---

Review request for hbase, Michael Stack and Andrew Purtell.

Summary
---

commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh j...@cloudera.com
Date: Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only

This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet to be committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.

Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.

This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377

Diffs
---

  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 8465724
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java fae0881
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuild.java PRE-CREATION

Diff: https://reviews.apache.org/r/2126/diff

Testing
---

An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo's laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.

Thanks,

jmhsieh
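The first step the summary describes, reading every .regioninfo file under the hbase directory, can be sketched roughly as follows. This is only an illustration under stated assumptions: it walks a local directory tree with java.nio rather than the Hadoop file system a real HBase root directory would normally live on, and the names RegionInfoScan and findRegionInfos are hypothetical, not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only; hypothetical names, not the classes from the HBASE-4377 patch. */
final class RegionInfoScan {

    /** Walks the tree under root and returns every regular file named ".regioninfo". */
    static List<Path> findRegionInfos(Path root) throws IOException {
        // Files.walk holds open directory handles, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(".regioninfo"))
                .sorted() // stable order makes the result easier to inspect
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the root directory is passed as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path regionInfo : findRegionInfos(root)) {
            System.out.println(regionInfo);
        }
    }
}
```

Collecting all the paths before touching anything else fits the behavior described above, where nothing is modified unless the gathered region information turns out to be consistent.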
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116634#comment-13116634 ] Jonathan Hsieh commented on HBASE-4377: --- I think my plan is to postpone the large refactor until after this gets through.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116708#comment-13116708 ] stack commented on HBASE-4377: -- @Jon So you want me to review whats over in github and commit that?
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116712#comment-13116712 ] Jonathan Hsieh commented on HBASE-4377: --- @stack: Not yet, I'm still cleaning this up and adding tests right now.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116713#comment-13116713 ]

Jonathan Hsieh commented on HBASE-4377:
---

More detail -- I've done a large refactor of hbck, but found that making the changes on top of it would make the offline rebuild code more difficult to understand and review. So, my plan is to add the offline rebuild code, and then potentially do a refactor afterwards.

Regardless of whether the refactor happens, I feel that I need to add tests and docs for this before it is ready for review.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13113839#comment-13113839 ]

Jonathan Hsieh commented on HBASE-4377:
---

I have a very *hacky* version that I recently used successfully to rebuild a .META. table with over 10k regions. It can be found here: https://github.com/jmhsieh/hbase/tree/hbase-4377

I've also hacked the hack to backport it onto an 0.90.x branch.

To run it, build hbase and then use the following command line:

{code}
bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair -base ~/pathToHbase/hbase -details
{code}

The program will fail, telling the user about any problems it encounters. It only succeeds if all the info gathered from the .regioninfo's is clean after going through the regionsplit calculator.

This code will take some time to clean up. I would like to do some refactoring of the current hbck and create an o.a.h.hbase.util.hbck or o.a.h.hbase.hbck package. Any preferences or concerns there?
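The "succeed only if everything is clean" rule in the comment above can be illustrated with a small check over region key ranges. This is a minimal sketch under stated assumptions, not the regionsplit calculator from the patch: keys are plain strings compared lexicographically (HBase keys are byte arrays), an empty key stands for the start or end of the table, and the names RegionChainCheck, Range and findProblems are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only; hypothetical names, not the classes from the HBASE-4377 patch. */
final class RegionChainCheck {

    /** One region's key range. An empty start or end key means "unbounded". */
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one message per hole or overlap; an empty list means the chain is clean. */
    static List<String> findProblems(List<Range> regions) {
        List<String> problems = new ArrayList<>();
        if (regions.isEmpty()) {
            problems.add("no regions found");
            return problems;
        }
        // Sort by start key; "" sorts first, which matches "start of table".
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));

        if (!sorted.get(0).start.isEmpty()) {
            problems.add("hole before '" + sorted.get(0).start + "'");
        }
        String coveredTo = sorted.get(0).end; // "" means covered to the end of the table
        for (Range r : sorted.subList(1, sorted.size())) {
            if (coveredTo.isEmpty() || r.start.compareTo(coveredTo) < 0) {
                problems.add("overlap at region starting at '" + r.start + "'");
            } else if (r.start.compareTo(coveredTo) > 0) {
                problems.add("hole between '" + coveredTo + "' and '" + r.start + "'");
            }
            coveredTo = furthestEnd(coveredTo, r.end);
        }
        if (!coveredTo.isEmpty()) {
            problems.add("hole after '" + coveredTo + "'");
        }
        return problems;
    }

    /** Picks the larger end key, treating "" as the end of the table. */
    private static String furthestEnd(String a, String b) {
        if (a.isEmpty() || b.isEmpty()) {
            return "";
        }
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        List<Range> regions = new ArrayList<>();
        regions.add(new Range("", "b"));
        regions.add(new Range("m", ""));
        // A clean chain would print nothing; here the gap between 'b' and 'm' is reported.
        for (String problem : findProblems(regions)) {
            System.out.println(problem);
        }
    }
}
```

Returning a list of problems instead of stopping at the first one mirrors the described behavior of reporting every hole and overlap and leaving the decision about what to merge or omit to the user.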
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103332#comment-13103332 ] stack commented on HBASE-4377: -- +1 on punt to user if whats in fs has overlapping regions (user would rule what to omit).