[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-23 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-v4.patch
hbase-5128-0.94-v4.patch
hbase-5128-0.92-v4.patch

Updated to address ted's last concern, arcanist fixes, and a handful of findbug 
fixes. 

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-23 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-0.90-v4.patch

Cleaned up 0.90 version.  Requires HBASE-5563 to pass consistently.

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.90-v4.patch, hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, 
 hbase-5128-0.94-v2.patch, hbase-5128-0.94-v4.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, hbase-5128-v3.patch, 
 hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-22 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5128:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519382/hbase-5128-0.92-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 21 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1252//console

This message is automatically generated.)

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-22 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5128:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519405/hbase-5128-0.90-v2b.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1254//console

This message is automatically generated.)

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-22 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5128:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519401/hbase-5128-0.90-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1253//console

This message is automatically generated.)

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-22 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-v3.patch

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, hbase-5128-v3.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-21 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-trunk-v2.patch

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-21 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-0.92-v2.patch
hbase-5128-0.94-v2.patch

0.94 and 0.92 versions have minor tweaks from trunk version and in both cases 
TestHBaseFsck passes.

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-21 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-0.90-v2.patch

For the 0.90v2 version, TestHBaseFsck passes consistently.  This is very close 
to the version we've used to repair production clusters. 

0.90 version has a different HBaseFsckRepair.forceOfflineInZK() which is 
somehow responsible for that version not needed HBASE-5563.  I haven't 
investigated enough to determine why the equivalent method for the 
0.92/0.94/trunk versions fail unit tests consistently.


 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.92-v2.patch, 
 hbase-5128-0.94-v2.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-21 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Fix Version/s: 0.96.0
   0.94.0
   0.92.2
   0.90.7

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.92-v2.patch, 
 hbase-5128-0.94-v2.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-21 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-0.90-v2b.patch

Previous version accidentally included two  dev tools that are not part of this 
patch (ScrubMeta and DumpMeta).

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-09 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Attachment: hbase-5128-trunk.patch

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-09 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Affects Version/s: 0.96.0
   0.94.0
   Status: Patch Available  (was: In Progress)

This version is getting very large, and though imperfect it is quite useful.  
Would prefer to get this in and then add follow in jiras to improve this. 

Attached version for trunk.  Will backport to 0.92/0.94.  Also have version for 
0.90 but would like to get trunk version in first now.

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.92.0, 0.90.5, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5128-trunk.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-09 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

  Component/s: hbck
Affects Version/s: 0.90.5
   0.92.0

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.92.0, 0.90.5
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh

 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-04 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Description: 
The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
consistency and table integrity invariant violations.  However with '-fix' it 
can only automatically handle deployment problems with region consistency 
cases.  This updated version should be able to handle all cases (including a 
new orphan regiondir case).  When complete will likely deprecate the 
OfflineMetaRepair tool and subsume several open META-hole related issue.

Here's the approach (from the comment of at the top of the new version of the 
file).
{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.  
 * 
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance. 
 * 
 * Table integrity checks verify that that all possible row keys can resolve to
 * exactly one region of a table.  This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there no overlapping
 * regions. 
 * 
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 * 
 * For table integrity repairs, the tables their region directories are scanned
 * for .regioninfo files.  Each table's integrity is then verified.  If there 
 * are any orphan regions (regions with no .regioninfo files), or holes, new 
 * regions are fabricated.  Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.  
 * 
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running.  These phase can be
 * use to completely reconstruct the META table in an offline fashion. 
 * 
 * Region consistency requires three conditions -- 1) valid .regioninfo file 
 * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that is was assigned to.
 * 
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so the connect() must first be called successfully.  Much of the
 * region consistency information is transient and less risky to repair.
 */
{code}



  was:
The current (0.90.5, 0.92.0rc2) versions of hbck detect most of the invariant 
violations (orphans is new).  However with '-fix' it can only automatically 
handle deployment problems with region consistency cases.  This updated version 
should be able to handle all cases.  When complete will likely deprecate the 
OfflineMetaRepair tool and subsume several META hole related problems.

{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.  
 * 
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance. 
 * 
 * Table integrity checks verify that that all possible row keys can resolve to
 * exactly one region of a table.  This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there no overlapping
 * regions. 
 * 
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 * 
 * For table integrity repairs, the tables their region directories are scanned
 * for .regioninfo files.  Each table's integrity is then verified.  If there 
 * are any orphan regions (regions with no .regioninfo files), or holes, new 
 * regions are fabricated.  Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.  
 * 
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running.  These phase can be
 * use to completely reconstruct the META table in an offline fashion. 
 * 
 * Region consistency requires three conditions -- 1) valid .regioninfo file 
 * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that is was assigned to.
 * 
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so the connect() must first be called successfully.  Much of the
 * region consistency information is transient and less risky to repair.
 */
{code}




 [uber hbck] 

[jira] [Updated] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-04 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5128:
--

Description: 
The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
consistency and table integrity invariant violations.  However with '-fix' it 
can only automatically repair region consistency cases having to do with 
deployment problems.  This updated version should be able to handle all cases 
(including a new orphan regiondir case).  When complete will likely deprecate 
the OfflineMetaRepair tool and subsume several open META-hole related issue.

Here's the approach (from the comment of at the top of the new version of the 
file).
{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.  
 * 
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance. 
 * 
 * Table integrity checks verify that that all possible row keys can resolve to
 * exactly one region of a table.  This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there no overlapping
 * regions. 
 * 
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 * 
 * For table integrity repairs, the tables their region directories are scanned
 * for .regioninfo files.  Each table's integrity is then verified.  If there 
 * are any orphan regions (regions with no .regioninfo files), or holes, new 
 * regions are fabricated.  Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.  
 * 
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running.  These phase can be
 * use to completely reconstruct the META table in an offline fashion. 
 * 
 * Region consistency requires three conditions -- 1) valid .regioninfo file 
 * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that is was assigned to.
 * 
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so the connect() must first be called successfully.  Much of the
 * region consistency information is transient and less risky to repair.
 */
{code}



  was:
The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
consistency and table integrity invariant violations.  However with '-fix' it 
can only automatically handle deployment problems with region consistency 
cases.  This updated version should be able to handle all cases (including a 
new orphan regiondir case).  When complete will likely deprecate the 
OfflineMetaRepair tool and subsume several open META-hole related issue.

Here's the approach (from the comment of at the top of the new version of the 
file).
{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.  
 * 
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance. 
 * 
 * Table integrity checks verify that that all possible row keys can resolve to
 * exactly one region of a table.  This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there no overlapping
 * regions. 
 * 
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 * 
 * For table integrity repairs, the tables their region directories are scanned
 * for .regioninfo files.  Each table's integrity is then verified.  If there 
 * are any orphan regions (regions with no .regioninfo files), or holes, new 
 * regions are fabricated.  Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.  
 * 
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running.  These phase can be
 * use to completely reconstruct the META table in an offline fashion. 
 * 
 * Region consistency requires three conditions -- 1) valid .regioninfo file 
 * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that is was assigned to.
 * 
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so the