[jira] [Commented] (HBASE-6258) Backport some region splitting fixes into 0.90.7
[ https://issues.apache.org/jira/browse/HBASE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402882#comment-13402882 ] Jonathan Hsieh commented on HBASE-6258: --- I think part of HBASE-4816 seems low risk (the extra closing checks on split request), but part of it is eventually rooted in HBASE-1502 (removing heartbeats), so it sounds best to avoid. In discussion you mentioned that HBASE-4881 was pointless because the exception signature changed between 0.90 and 0.92, and suggested not including it. I'd be fine with that. HBASE-6158 is embarrassing and a simple fix, but it sounds best not to port it, to avoid the potential for compatibility problems. The good news is that this one has an easy workaround (don't use 'merges' or 'splits' as column family names). Instead, can we add a patch to warn if a user tries to create/alter tables with 'merges' or 'splits' as column family names? HBASE-5189: I think metrics are always useful and seem low risk (I don't think we have a strict contract stating that adding metrics breaks compatibility). Since this fix is in 0.94, if we backport it to 0.90 we'd need to get it into 0.92 as well. Backport some region splitting fixes into 0.90.7 Key: HBASE-6258 URL: https://issues.apache.org/jira/browse/HBASE-6258 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: David S. Wang Assignee: David S.
Wang Attachments: HBASE-4816+4881+5189+6158.patch Issue tracking backport of some relatively small region splitting fixes into 0.90.7: HBASE-4816: Regionserver wouldn't go down because split happened exactly at same time we issued bulk user region close call on our way out - fixed in 0.92 HBASE-4881: Unhealthy region is on service caused by rollback of region splitting - fixed in 0.92 HBASE-5189: Add metrics to keep track of region-splits in RS - fixed in 0.94 HBASE-6158: Data loss if the words 'merges' or 'splits' are used as Column Family name - fixed in 0.92 and 0.94 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
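The warning suggested in the comment above (flagging 'merges' or 'splits' when used as column family names, since HBASE-6158 showed they shadow internal per-region directories) could look roughly like this. This is a hedged standalone sketch; the class name, `RESERVED_NAMES` list, and `isReserved` helper are illustrative and not the actual HBase patch:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed guard: warn when a column family
// name collides with the 'merges'/'splits' directory names that region
// servers create under each region directory.
public class ColumnFamilyNameCheck {
    // Names reserved because of the data-loss scenario in HBASE-6158.
    private static final List<String> RESERVED_NAMES = Arrays.asList("merges", "splits");

    public static boolean isReserved(String family) {
        return RESERVED_NAMES.contains(family.toLowerCase());
    }

    public static void main(String[] args) {
        for (String family : new String[] {"merges", "splits", "cf1"}) {
            if (isReserved(family)) {
                System.out.println("WARN: '" + family + "' is a reserved column family name");
            } else {
                System.out.println("OK: " + family);
            }
        }
    }
}
```

A real patch would hook such a check into table create/alter validation rather than a standalone class.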
[jira] [Updated] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-4379: -- Attachment: TestcaseForDisabledTableIssue.patch Attached a test case illustrating the issue I mentioned above. [hbck] Does not complain about tables with no end region [Z,] - Key: HBASE-4379 URL: https://issues.apache.org/jira/browse/HBASE-4379 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch, HBASE-4379_94.patch, HBASE-4379_94_V2.patch, HBASE-4379_Trunk.patch, TestcaseForDisabledTableIssue.patch, hbase-4379.v2.patch hbck does not detect or have an error condition when the last region of a table is missing (end key != '').
[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402891#comment-13402891 ] Hadoop QA commented on HBASE-4379: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533784/TestcaseForDisabledTableIssue.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2279//console This message is automatically generated.
[jira] [Created] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
Maryann Xue created HBASE-6289: -- Summary: ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0, 0.90.6 Reporter: Maryann Xue Priority: Critical The ROOT RS has a network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped during re-assignment.

private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException {
  long timeout = this.server.getConfiguration().
    getLong("hbase.catalog.verification.timeout", 1000);
  if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) {
    this.services.getAssignmentManager().assignRoot();
  }
}

After a few moments, this RS encounters a DFS write problem and decides to abort. The RS then soon gets restarted from the command line, and constantly reports:

2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
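The race described above, and the fix direction later attached as HBASE-6289.patch ("Add excluded server in verifyRootRegionLocation()"), can be sketched as follows. This is a simplified illustration: the `CatalogTracker`/`AssignmentManager` interfaces and the `excludedServer` parameter are stand-ins for the real HBase types, not the actual patch:

```java
// Simplified illustration of why verifyAndAssignRoot() can skip a needed
// reassignment: the dying RS still answers the verification check, so ROOT
// looks healthy even though its ZK node has already expired. Excluding the
// dead server from verification closes that window.
public class VerifyAndAssignRootSketch {
    interface CatalogTracker {
        // Hypothetical signature: a response from 'excludedServer' (the
        // server whose ZK node expired) no longer counts as "verified".
        boolean verifyRootRegionLocation(long timeoutMs, String excludedServer);
    }

    interface AssignmentManager {
        void assignRoot();
    }

    static void verifyAndAssignRoot(CatalogTracker tracker, AssignmentManager am,
                                    String deadServer) {
        long timeout = 1000; // stands in for hbase.catalog.verification.timeout
        // Because the dead server is excluded, its answer cannot block the
        // reassignment of ROOT.
        if (!tracker.verifyRootRegionLocation(timeout, deadServer)) {
            am.assignRoot();
        }
    }

    public static void main(String[] args) {
        final boolean[] assigned = {false};
        // The expired server may still "answer", but it is excluded, so
        // verification fails and ROOT gets reassigned.
        verifyAndAssignRoot((t, excluded) -> false, () -> assigned[0] = true, "rs1");
        System.out.println("root reassigned: " + assigned[0]);
    }
}
```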
[jira] [Updated] (HBASE-6175) TestFSUtils flaky on hdfs getFileStatus method
[ https://issues.apache.org/jira/browse/HBASE-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6175: --- Attachment: 6175.v1.patch TestFSUtils flaky on hdfs getFileStatus method -- Key: HBASE-6175 URL: https://issues.apache.org/jira/browse/HBASE-6175 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Trivial Fix For: 0.96.0 Attachments: 6175.v1.patch This is a simplified version of a TestFSUtils issue: with a sleep the test works 100% of the time; with no sleep it becomes flaky. Root cause unknown. While the issue appears in the tests, the root cause could be an issue on a real production system as well.
{noformat}
@Test
public void testFSUTils() throws Exception {
  final String hosts[] = {"host1", "host2", "host3", "host4"};
  Path testFile = new Path("/test1.txt");
  HBaseTestingUtility htu = new HBaseTestingUtility();
  try {
    htu.startMiniDFSCluster(hosts).waitActive();
    FileSystem fs = htu.getDFSCluster().getFileSystem();
    for (int i = 0; i < 100; ++i) {
      FSDataOutputStream out = fs.create(testFile);
      byte[] data = new byte[1];
      out.write(data, 0, 1);
      out.close();
      // Put a sleep here to make me work
      // Thread.sleep(2000);
      FileStatus status = fs.getFileStatus(testFile);
      HDFSBlocksDistribution blocksDistribution =
          FSUtils.computeHDFSBlocksDistribution(fs, status, 0, status.getLen());
      assertEquals("Wrong number of hosts distributing blocks. at iteration " + i,
          3, blocksDistribution.getTopHosts().size());
      fs.delete(testFile, true);
    }
  } finally {
    htu.shutdownMiniDFSCluster();
  }
}
{noformat}
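A common way to make a test like the one above deterministic, instead of the fixed Thread.sleep(2000), is to poll with a timeout until the cluster reports the expected state (here, DataNodes reporting block locations). This generic helper is a sketch of that pattern, not the contents of 6175.v1.patch:

```java
import java.util.function.BooleanSupplier;

// Generic "wait until condition or timeout" helper: the usual replacement
// for a fixed sleep in tests that race against asynchronous reporting.
public class WaitFor {
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~200 ms, mimicking late block reports.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 200, 2000, 50);
        System.out.println("condition met: " + ok);
    }
}
```

In the test above, the polled condition would be "blocksDistribution.getTopHosts().size() == 3", retried until the timeout rather than asserted once after a blind sleep.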
[jira] [Updated] (HBASE-6229) AM.assign() should not set table state to ENABLED directly.
[ https://issues.apache.org/jira/browse/HBASE-6229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-6229: -- Resolution: Fixed Fix Version/s: (was: 0.92.3) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) AM.assign() should not set table state to ENABLED directly. --- Key: HBASE-6229 URL: https://issues.apache.org/jira/browse/HBASE-6229 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.94.1 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.96.0, 0.94.1 Attachments: 6229_trunk_2.patch, HBASE-6229_94.patch, HBASE-6229_94_1.patch, HBASE-6229_94_2.patch, HBASE-6229_trunk.patch, HBASE-6229_trunk_1.patch In the case of an assign from EnableTableHandler, the table state is ENABLING; EnableTableHandler will set it to ENABLED after assigning all the table regions anyway. If assign() sets ENABLED directly, the client API may think the ENABLE of the table has already completed. However, when all the regions are added directly into META and we call assignRegion, we do need to make the table ENABLED; in that case the table will not be in the ENABLING or ENABLED state.
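The state rule described above (assign() leaves the state alone while EnableTableHandler owns the ENABLING-to-ENABLED transition, but must set ENABLED itself when regions were inserted directly into META) can be sketched with a simple enum. The enum and helper are illustrative, not the real HBase table-state API:

```java
// Minimal sketch of the table-state rule from HBASE-6229: if the table is
// ENABLING, EnableTableHandler will set ENABLED itself once every region
// is assigned, so assign() must not short-circuit it.
public class TableStateRule {
    enum TableState { ENABLING, ENABLED, DISABLED, UNKNOWN }

    // Returns the state assign() should leave behind, given the current state.
    static TableState stateAfterAssign(TableState current) {
        if (current == TableState.ENABLING || current == TableState.ENABLED) {
            return current; // transition belongs to EnableTableHandler
        }
        // Regions inserted directly into META: there is no ENABLING marker,
        // so assign() is the one that must set ENABLED.
        return TableState.ENABLED;
    }

    public static void main(String[] args) {
        System.out.println(stateAfterAssign(TableState.ENABLING)); // stays ENABLING
        System.out.println(stateAfterAssign(TableState.UNKNOWN));  // becomes ENABLED
    }
}
```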
[jira] [Updated] (HBASE-6175) TestFSUtils flaky on hdfs getFileStatus method
[ https://issues.apache.org/jira/browse/HBASE-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6175: --- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6175) TestFSUtils flaky on hdfs getFileStatus method
[ https://issues.apache.org/jira/browse/HBASE-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402990#comment-13402990 ] nkeywal commented on HBASE-6175: Here is the fix. Without a 'no go' I'll commit it this weekend.
[jira] [Updated] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maryann Xue updated HBASE-6289: --- Attachment: HBASE-6289.patch Add excluded server in verifyRootRegionLocation().
[jira] [Updated] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maryann Xue updated HBASE-6289: --- Assignee: Maryann Xue Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6175) TestFSUtils flaky on hdfs getFileStatus method
[ https://issues.apache.org/jira/browse/HBASE-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403014#comment-13403014 ] Hadoop QA commented on HBASE-6175: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533802/6175.v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestStore Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2280//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2280//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2280//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2280//console This message is automatically generated.
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403085#comment-13403085 ] ramkrishna.s.vasudevan commented on HBASE-6289: --- @Maryann Good one, Maryann. Nice catch. I have one suggestion. Can we just call assignRoot instead of verifying the root location, like how we do assignMeta? {code} this.services.getAssignmentManager().assignMeta(); {code} Similarly, can we say {code} if (isCarryingRoot()) { // -ROOT- LOG.info("Server " + serverName + " was carrying ROOT. Trying to assign."); this.services.getAssignmentManager(). regionOffline(HRegionInfo.ROOT_REGIONINFO); this.services.getAssignmentManager().assignRoot(); } {code} Because we are sure that the root is down here. What do you feel?
[jira] [Comment Edited] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403085#comment-13403085 ] ramkrishna.s.vasudevan edited comment on HBASE-6289 at 6/28/12 1:24 PM: @Maryann Good one, Maryann. Nice catch. I have one suggestion. Can we just call assignRoot instead of verifying the root location, like how we do assignMeta? {code} this.services.getAssignmentManager().assignMeta(); {code} Similarly, can we say {code} if (isCarryingRoot()) { // -ROOT- LOG.info("Server " + serverName + " was carrying ROOT. Trying to assign."); this.services.getAssignmentManager(). regionOffline(HRegionInfo.ROOT_REGIONINFO); this.services.getAssignmentManager().assignRoot(); } {code} Because we are sure that the root is down here. What do you feel? {Edit} But I am not sure if that verification step was added for some specific reasons. {Edit}
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403105#comment-13403105 ] Maryann Xue commented on HBASE-6289: @ramkrishna: Yes, i thought of this too. but i this comment before verifyAndAssignRoot(): Before assign the ROOT region, ensure it haven't been assigned by other place. Not sure if this ROOT assigned elsewhere situation will actually possibly occur, but we seem to have seen META assigned on several Region Servers at the same time when there was chaos going on in our lab's network. There can be only one single search path for any region (incl. meta and root), though, regardless of client cache. And this is the thing i don't understand, why we try to treat ROOT differently? ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. it calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong(hbase.catalog.verification.timeout, 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } After a few moments, this RS encounters DFS write problem and decides to abort. 
The RS then soon gets restarted from the command line, and constantly reports:

2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
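The failure mode above can be sketched as a toy timeline (illustrative Java, not HBase code; all names are hypothetical stand-ins for the real master/RS classes). verifyAndAssignRoot() does a one-shot liveness check, so if the RS is still up at check time but dies right after, ROOT is never re-assigned:

```java
// Toy model of the HBASE-6289 race (hypothetical names, not HBase code).
class RootAssignRace {
    boolean rsAlive = true;       // RS still serving when the check runs
    boolean rootAssigned = true;  // ROOT currently hosted on that RS

    // Mirrors the logic quoted in the issue: only assign if verification fails.
    void verifyAndAssignRoot() {
        if (!verifyRootRegionLocation()) {
            assignRoot();
        } // else: skipped, and nothing re-checks later
    }

    boolean verifyRootRegionLocation() { return rsAlive; }

    void assignRoot() { rootAssigned = true; }

    // The RS hits the DFS write problem and aborts, taking ROOT offline.
    void rsAborts() { rsAlive = false; rootAssigned = false; }
}
```

Running verifyAndAssignRoot() while the RS is still alive skips the assignment; when the RS aborts a moment later, rootAssigned stays false forever, which matches the NotServingRegionException loop in the log above.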
[jira] [Assigned] (HBASE-6197) HRegion's append operation may lose data
[ https://issues.apache.org/jira/browse/HBASE-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-6197: - Assignee: ShiXing (was: ramkrishna.s.vasudevan) Incorrectly assigned this to me. HRegion's append operation may lose data Key: HBASE-6197 URL: https://issues.apache.org/jira/browse/HBASE-6197 Project: HBase Issue Type: Bug Components: regionserver Reporter: ShiXing Assignee: ShiXing Fix For: 0.96.0 Attachments: HBASE-6197-trunk-V1.patch Like HBASE-6195, when flushing, the append thread may read out the old value: the larger timestamp ends up in the snapshot and the smaller timestamp in the memstore. We should make the first-in thread generate the smaller timestamp. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6197) HRegion's append operation may lose data
[ https://issues.apache.org/jira/browse/HBASE-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-6197: - Assignee: ramkrishna.s.vasudevan (was: ShiXing) HRegion's append operation may lose data Key: HBASE-6197 URL: https://issues.apache.org/jira/browse/HBASE-6197 Project: HBase Issue Type: Bug Components: regionserver Reporter: ShiXing Assignee: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-6197-trunk-V1.patch Like HBASE-6195, when flushing, the append thread may read out the old value: the larger timestamp ends up in the snapshot and the smaller timestamp in the memstore. We should make the first-in thread generate the smaller timestamp. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
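A minimal sketch of the timestamp problem described in HBASE-6197 above (illustrative Java, not HBase code): a read resolves a cell to the value with the largest timestamp, so if a newer append is handed a smaller timestamp than an already-flushed cell, the stale snapshot value shadows it.

```java
import java.util.*;

// Illustrative model, not HBase code: cells keyed by timestamp, and a read
// that returns the value at the largest timestamp across snapshot + memstore.
class TimestampShadowing {
    static long read(Map<Long, Long> snapshot, Map<Long, Long> memstore) {
        TreeMap<Long, Long> merged = new TreeMap<>();
        merged.putAll(memstore);
        merged.putAll(snapshot); // timestamps are distinct, order irrelevant
        return merged.lastEntry().getValue();
    }

    public static void main(String[] args) {
        Map<Long, Long> snapshot = new HashMap<>();
        Map<Long, Long> memstore = new HashMap<>();
        snapshot.put(100L, 1L); // older append, flushed with ts=100
        memstore.put(99L, 2L);  // newer append, but assigned ts=99 < 100
        // The newer value 2 is shadowed: the read returns the stale value 1.
        System.out.println(read(snapshot, memstore)); // prints 1
    }
}
```

This is why the fix wants the first-in thread to get the smaller timestamp: then the later append always carries the larger timestamp and wins the read.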
[jira] [Assigned] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-6210: - Assignee: ramkrishna.s.vasudevan Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3725) HBase increments from old value after delete and write to disk
[ https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403165#comment-13403165 ] ShiXing commented on HBASE-3725: Is there any progress? HBase increments from old value after delete and write to disk -- Key: HBASE-3725 URL: https://issues.apache.org/jira/browse/HBASE-3725 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.90.1 Reporter: Nathaniel Cook Assignee: Jonathan Gray Attachments: HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, HBASE-3725.patch Deleted row values are sometimes used as starting points for new increments. To reproduce: Create a row r. Set column x to some default value. Force HBase to write that value to the file system (such as by restarting the cluster). Delete the row. Call table.incrementColumnValue with some_value. Get the row. The returned value in the column was incremented from the old value before the row was deleted instead of being initialized to some_value.
Code to reproduce:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Increment;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseTestIncrement {

  static String tableName = "testIncrement";
  static byte[] infoCF = Bytes.toBytes("info");
  static byte[] rowKey = Bytes.toBytes("test-rowKey");
  static byte[] newInc = Bytes.toBytes("new");
  static byte[] oldInc = Bytes.toBytes("old");

  /**
   * This code reproduces a bug with increment column values in hbase.
   * Usage: First run part one by passing '1' as the first arg.
   *        Then restart the hbase cluster so it writes everything to disk.
   *        Run part two by passing '2' as the first arg.
   *
   * This will result in the old deleted data being found and used for the increment calls.
   *
   * @param args
   * @throws IOException
   */
  public static void main(String[] args) throws IOException {
    if ("1".equals(args[0]))
      partOne();
    if ("2".equals(args[0]))
      partTwo();
    if ("both".equals(args[0])) {
      partOne();
      partTwo();
    }
  }

  /**
   * Creates a table and increments a column value 10 times by 10 each time.
   * Results in a value of 100 for the column.
   *
   * @throws IOException
   */
  static void partOne() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor tableDesc = new HTableDescriptor(tableName);
    tableDesc.addFamily(new HColumnDescriptor(infoCF));
    if (admin.tableExists(tableName)) {
      admin.disableTable(tableName);
      admin.deleteTable(tableName);
    }
    admin.createTable(tableDesc);
    HTablePool pool = new HTablePool(conf, Integer.MAX_VALUE);
    HTableInterface table = pool.getTable(Bytes.toBytes(tableName));
    // Increment uninitialized column
    for (int j = 0; j < 10; j++) {
      table.incrementColumnValue(rowKey, infoCF, oldInc, (long) 10);
      Increment inc = new Increment(rowKey);
      inc.addColumn(infoCF, newInc, (long) 10);
      table.increment(inc);
    }
    Get get = new Get(rowKey);
    Result r = table.get(get);
    System.out.println("initial values: new " + Bytes.toLong(r.getValue(infoCF, newInc))
        + " old " + Bytes.toLong(r.getValue(infoCF, oldInc)));
  }

  /**
   * First deletes the data then increments the column 10 times by 1 each time.
   *
   * Should result in a value of 10 but it doesn't; it results in a value of 110.
   *
   * @throws IOException
   */
  static void partTwo() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTablePool pool = new HTablePool(conf,
[jira] [Updated] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6210: -- Attachment: HBASE-6210.patch Patch for 0.94. Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3725) HBase increments from old value after delete and write to disk
[ https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403170#comment-13403170 ] ShiXing commented on HBASE-3725: Sorry, that was a mistaken Ctrl+Enter. I think the fix could just change the calls of getLastIncrement() to get(); I see that in 0.94 the getLastIncrement() function has already been removed. HBase increments from old value after delete and write to disk -- Key: HBASE-3725 URL: https://issues.apache.org/jira/browse/HBASE-3725 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.90.1 Reporter: Nathaniel Cook Assignee: Jonathan Gray Attachments: HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, HBASE-3725.patch Deleted row values are sometimes used as starting points for new increments. To reproduce: Create a row r. Set column x to some default value. Force HBase to write that value to the file system (such as by restarting the cluster). Delete the row. Call table.incrementColumnValue with some_value. Get the row. The returned value in the column was incremented from the old value before the row was deleted instead of being initialized to some_value.
[jira] [Created] (HBASE-6290) Add a function to mark a server as dead and start the recovery process
nkeywal created HBASE-6290: -- Summary: Add a function to mark a server as dead and start the recovery process Key: HBASE-6290 URL: https://issues.apache.org/jira/browse/HBASE-6290 Project: HBase Issue Type: Improvement Components: monitoring Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor ZooKeeper is used as a monitoring tool: we use znodes, and we start the recovery process when a znode is deleted by ZK because it timed out. This timeout defaults to 90 seconds and is often set to 30s. However, some HW issues can be detected by specialized HW monitoring tools before the ZK timeout. For this reason, it makes sense to offer a very simple function to mark a RS as dead. It could be an hbase shell function such as: considerAsDead ipAddress|serverName This would delete all the znodes of the server running on that box, starting the recovery process. Such a function would be easily callable (at the caller's risk) by any fault detection tool... We could have issues identifying the right master/region servers around ipv4 vs. ipv6 and multi-homed boxes, however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
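The proposed considerAsDead flow can be simulated in a few lines (hypothetical API and znode paths, not HBase code): the only primitive a fault-detection tool needs is "delete this server's znodes", since recovery is already triggered by znode deletion.

```java
import java.util.*;
import java.util.function.Consumer;

// Simulation of the proposed considerAsDead command (hypothetical names,
// not HBase code): deleting a server's znode is what kicks off recovery.
class DeadServerMarker {
    private final Set<String> znodes = new HashSet<>();
    private final Consumer<String> recoveryHandler;

    DeadServerMarker(Consumer<String> recoveryHandler) {
        this.recoveryHandler = recoveryHandler;
    }

    // A live RS owns an ephemeral znode; ZK creates this on session start.
    void register(String serverName) {
        znodes.add("/hbase/rs/" + serverName);
    }

    // Mirrors the proposed shell command: considerAsDead serverName.
    // Deleting the znode is all it takes: the watch on it fires and the
    // shutdown handler starts recovery (modeled here by the callback).
    boolean considerAsDead(String serverName) {
        boolean removed = znodes.remove("/hbase/rs/" + serverName);
        if (removed) {
            recoveryHandler.accept(serverName); // stands in for the ZK watch
        }
        return removed;
    }
}
```

The design point is that the caller never talks to the (possibly hung) RS at all; it only mutates ZK state, which is exactly what a ZK session expiry would have done 30-90 seconds later.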
[jira] [Commented] (HBASE-6290) Add a function to mark a server as dead and start the recovery process
[ https://issues.apache.org/jira/browse/HBASE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403207#comment-13403207 ] stack commented on HBASE-6290: -- What would the shell invocation do? Connect to a RS and call its shutdown, or shutdown + kill znode? What are you thinking would use this new facility? (It sounds like a good thing to have. Would be good to list possible users.) Add a function to mark a server as dead and start the recovery process - Key: HBASE-6290 URL: https://issues.apache.org/jira/browse/HBASE-6290 Project: HBase Issue Type: Improvement Components: monitoring Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor ZooKeeper is used as a monitoring tool: we use znodes, and we start the recovery process when a znode is deleted by ZK because it timed out. This timeout defaults to 90 seconds and is often set to 30s. However, some HW issues can be detected by specialized HW monitoring tools before the ZK timeout. For this reason, it makes sense to offer a very simple function to mark a RS as dead. It could be an hbase shell function such as: considerAsDead ipAddress|serverName This would delete all the znodes of the server running on that box, starting the recovery process. Such a function would be easily callable (at the caller's risk) by any fault detection tool... We could have issues identifying the right master/region servers around ipv4 vs. ipv6 and multi-homed boxes, however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3725) HBase increments from old value after delete and write to disk
[ https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403210#comment-13403210 ] stack commented on HBASE-3725: -- @ShiXing Any chance of a patch? Thanks. HBase increments from old value after delete and write to disk -- Key: HBASE-3725 URL: https://issues.apache.org/jira/browse/HBASE-3725 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.90.1 Reporter: Nathaniel Cook Assignee: Jonathan Gray Attachments: HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, HBASE-3725.patch Deleted row values are sometimes used as starting points for new increments. To reproduce: Create a row r. Set column x to some default value. Force HBase to write that value to the file system (such as by restarting the cluster). Delete the row. Call table.incrementColumnValue with some_value. Get the row. The returned value in the column was incremented from the old value before the row was deleted instead of being initialized to some_value.
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403213#comment-13403213 ] stack commented on HBASE-6210: -- +1 on patch. It looks like what was applied to trunk. Is that so Ram? Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403218#comment-13403218 ] ramkrishna.s.vasudevan commented on HBASE-6210: --- Yes Stack Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-6284: -- Attachment: HBASE-6284_Trunk.patch Patch on trunk for review. Will give exact performance test results soon. Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Attachments: HBASE-6284_Trunk.patch From Anoop under the thread 'Can there be a doMiniBatchDelete in HRegion': HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one n/w call. But within the RS, there will be N delete calls on the region, one by one. This includes N HLog writes and syncs. If these can also be grouped, can we get better performance for the multi-row delete? I have made the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. Just tested initially with a one-node cluster. In that itself I am getting a performance boost which is very promising. Only one CF and qualifier; 10K total rows deleted with a batch of 100 deletes; only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. Will test in a 4-node cluster also. I think it is worth doing this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
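The rationale above reduces to a simple accounting argument: N standalone deletes pay N WAL appends and N syncs, while one mini-batch appends all edits and pays a single sync. A sketch (illustrative, not the HBase implementation):

```java
import java.util.*;

// Illustrative cost model for HBASE-6284 (not HBase code): the sync count
// is the expensive part, and batching collapses it to one per group.
class MiniBatchDelete {
    int walSyncs = 0;
    private final List<String> wal = new ArrayList<>();

    // One WAL append + one sync per delete: N syncs for N rows.
    void deleteOneByOne(List<String> rows) {
        for (String row : rows) {
            wal.add(row);
            walSyncs++;
        }
    }

    // All edits appended first, then a single sync covers the whole batch.
    void miniBatchDelete(List<String> rows) {
        wal.addAll(rows);
        walSyncs++;
    }
}
```

With a batch of 100 deletes, the one-by-one path performs 100 syncs versus 1 for the mini-batch, which is consistent with the large speedup reported in the comment.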
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403225#comment-13403225 ] Zhihong Ted Yu commented on HBASE-6261: --- Nice work. In QuantileEstimationCKMS.java: {code} long[] buffer = new long[500]; {code} I think the buffer size should be configurable. Can we maintain a metric for how often compress() is called ? Should compress() return an int indicating how many items are removed ? What if no item gets removed coming out of a call to compress() ? Please work on a patch. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
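For intuition about what the estimator discussed above must approximate, here is an exact nearest-rank percentile over a full sample buffer (a comparison baseline only; the CKMS method from the cited papers gets similar answers with bounded error while keeping a small summary instead of every sample):

```java
import java.util.*;

// Exact nearest-rank percentile over a buffer of latency samples.
// An approximate sketch like CKMS must stay within its error bound of this.
class PercentileEstimate {
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // nearest-rank: smallest value with at least p of the samples <= it
        int rank = (int) Math.ceil(p * sorted.length) - 1;
        return sorted[Math.max(rank, 0)];
    }

    public static void main(String[] args) {
        long[] latenciesMs = new long[100];
        for (int i = 0; i < 100; i++) latenciesMs[i] = i + 1; // 1..100 ms
        System.out.println(percentile(latenciesMs, 0.99)); // prints 99
    }
}
```

The memory cost here is the whole buffer, which is exactly what the fixed-size `long[] buffer = new long[500]` in QuantileEstimationCKMS.java avoids; Ted's question about making that size configurable is a trade-off between memory and estimation error.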
[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6284: -- Fix Version/s: 0.94.1 0.96.0 Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under the thread 'Can there be a doMiniBatchDelete in HRegion': HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one n/w call. But within the RS, there will be N delete calls on the region, one by one. This includes N HLog writes and syncs. If these can also be grouped, can we get better performance for the multi-row delete? I have made the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. Just tested initially with a one-node cluster. In that itself I am getting a performance boost which is very promising. Only one CF and qualifier; 10K total rows deleted with a batch of 100 deletes; only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. Will test in a 4-node cluster also. I think it is worth doing this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6287) Port HBASE-5941 improve multiDelete performance by grabbing locks ahead of time
[ https://issues.apache.org/jira/browse/HBASE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403238#comment-13403238 ] Andrew Purtell commented on HBASE-6287: --- Throwing out two possible design responses: 1. Switch on instanceof more specific Mutation type and call existing prePut/postPut or preDelete/postDelete methods. or: 2. Substitute the more general Mutation type for Put and rename the preXXX and postXXX functions here from ...Put to ...Mutate and provide backwards compatible but deprecated prePut/preDelete and postPut/postDelete methods. Port HBASE-5941 improve multiDelete performance by grabbing locks ahead of time --- Key: HBASE-6287 URL: https://issues.apache.org/jira/browse/HBASE-6287 Project: HBase Issue Type: Bug Reporter: Zhihong Ted Yu HBASE-5941 has been integrated to 0.89-fb This JIRA ports it to HBase trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
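Option 1 above can be sketched as follows (hypothetical shim types, not the real coprocessor API): a single preMutate hook dispatches on the concrete Mutation subclass, so existing prePut/preDelete implementations keep working unchanged.

```java
// Hypothetical stand-ins for the HBase client types, for illustration only.
abstract class Mutation {}
class Put extends Mutation {}
class Delete extends Mutation {}

// Sketch of design response 1: switch on instanceof and route to the
// existing type-specific hooks rather than renaming them.
class ObserverShim {
    String preMutate(Mutation m) {
        if (m instanceof Put) return prePut((Put) m);
        if (m instanceof Delete) return preDelete((Delete) m);
        return "unhandled";
    }

    String prePut(Put p) { return "prePut"; }
    String preDelete(Delete d) { return "preDelete"; }
}
```

Option 2 would instead change the hook signatures to take Mutation directly and keep prePut/preDelete only as deprecated compatibility wrappers; the instanceof shim avoids any signature change at the cost of a dispatch that must be extended for each new Mutation subclass.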
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403242#comment-13403242 ] ramkrishna.s.vasudevan commented on HBASE-6210: --- Append I could not find in 0.92. Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2730) Expose RS work queue contents on web UI
[ https://issues.apache.org/jira/browse/HBASE-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403241#comment-13403241 ] Andrew Purtell commented on HBASE-2730: --- Still think it should go into the debug dump. Expose RS work queue contents on web UI --- Key: HBASE-2730 URL: https://issues.apache.org/jira/browse/HBASE-2730 Project: HBase Issue Type: New Feature Components: monitoring, regionserver Reporter: Todd Lipcon Priority: Critical Fix For: 0.96.0 Attachments: hbase-2730-0_94_0.patch, queuedump, queuedump_sample.png Would be nice to be able to see the contents of the various work queues - eg to know what regions are pending compaction/split/flush/etc. This is handy for debugging why a region might be blocked, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5827) [Coprocessors] Observer notifications on exceptions
[ https://issues.apache.org/jira/browse/HBASE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403244#comment-13403244 ] Andrew Purtell commented on HBASE-5827: --- @Jon, both suggestions good, +1 and +1. [Coprocessors] Observer notifications on exceptions --- Key: HBASE-5827 URL: https://issues.apache.org/jira/browse/HBASE-5827 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Andrew Purtell Assignee: Andrew Purtell Benjamin Busjaeger wrote on dev@: {quote} Is there a reason that RegionObservers are not notified when a get/put/delete fails? Suppose I maintain some (transient) state in my Coprocessor that is created during preGet and discarded during postGet. If the get fails, postGet is not invoked, so I cannot remove the state. If there is a good reason, is there any other way to achieve the same thing? If not, would it be possible to add something like the snippet below to the code base?
{code}
// pre-get CP hook
if (withCoprocessor && (coprocessorHost != null)) {
  if (coprocessorHost.preGet(get, results)) {
    return results;
  }
}
+try {
  ...
+} catch (Throwable t) {
+  // failed-get CP hook
+  if (withCoprocessor && (coprocessorHost != null)) {
+    coprocessorHost.failedGet(get, results);
+  }
+  throw t;
+}
// post-get CP hook
if (withCoprocessor && (coprocessorHost != null)) {
  coprocessorHost.postGet(get, results);
}
{code}
{quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
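The pattern Benjamin asks for can be shown self-contained (the `Observer` interface and method names below are hypothetical, not the real RegionObserver API): the failure hook fires from a catch block before rethrowing, so per-request state created in the pre hook can always be cleaned up.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the notify-on-failure pattern: pre hook always fires,
// then exactly one of post (success) or failed (exception) fires.
public class FailureHookDemo {
    interface Observer {
        void preGet(String row);
        void postGet(String row);
        void failedGet(String row); // the proposed extra notification
    }

    static String doGet(String row, Observer obs, boolean fail) {
        obs.preGet(row);
        try {
            if (fail) throw new RuntimeException("simulated get failure");
            String result = "value-for-" + row;
            obs.postGet(row);    // success path: post hook
            return result;
        } catch (RuntimeException t) {
            obs.failedGet(row);  // failure path: failed hook, then rethrow
            throw t;
        }
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Observer obs = new Observer() {
            public void preGet(String r) { calls.add("pre"); }
            public void postGet(String r) { calls.add("post"); }
            public void failedGet(String r) { calls.add("failed"); }
        };
        doGet("r1", obs, false);
        try { doGet("r2", obs, true); } catch (RuntimeException expected) {}
        System.out.println(calls); // [pre, post, pre, failed]
    }
}
```

A `finally`-based design is the alternative, but a dedicated failed hook lets the observer distinguish success from failure without re-checking state.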
[jira] [Commented] (HBASE-6287) Port HBASE-5941 improve multiDelete performance by grabbing locks ahead of time
[ https://issues.apache.org/jira/browse/HBASE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403246#comment-13403246 ] ramkrishna.s.vasudevan commented on HBASE-6287: --- @Andy HBASE-6284 does what is proposed in 1st option. Port HBASE-5941 improve multiDelete performance by grabbing locks ahead of time --- Key: HBASE-6287 URL: https://issues.apache.org/jira/browse/HBASE-6287 Project: HBase Issue Type: Bug Reporter: Zhihong Ted Yu HBASE-5941 has been integrated to 0.89-fb This JIRA ports it to HBase trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403247#comment-13403247 ] stack commented on HBASE-6289: -- @Maryann Patch looks good. An SSH should not allow the server it's handling as a legit .META. or -ROOT- location, so your exclude makes sense. You need curly braces here, or put the return on the same line as the if, to be within our coding convention.
{code}
+if (exclude != null && exclude.equals(server))
+  return null;
{code}
We can fix this on commit though. What will happen here if the server returned is the same as the excluded server?
{code}
+ AdminProtocol getRootServerConnection(long timeout, ServerName exclude) throws InterruptedException, NotAllMetaRegionsOnlineException, IOException {
-  return getCachedConnection(waitForRoot(timeout));
+  ServerName server = waitForRoot(timeout);
+  if (exclude != null && exclude.equals(server))
+    return null;
+
+  return getCachedConnection(server);
 }
{code}
We return null and go around again until the RS dies? That seems fine, but maybe we should log this special handling? Just a suggestion. ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } After a few moments, this RS encounters a DFS write problem and decides to abort. The RS then soon gets restarted from the command line, and constantly reports: 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
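The return-null-and-retry behavior stack asks about can be reduced to one pure check (hypothetical method and names here, not the actual CatalogTracker API): the ServerShutdownHandler must never accept the very server it is expiring as a valid -ROOT- location, even while that server still answers RPCs.

```java
// Sketch of the exclude check under discussion: return null when the
// currently tracked root location is exactly the server being expired,
// so the caller loops until the location changes (or the RS really dies).
public class ExcludeCheckDemo {
    static String rootServerOrNull(String trackedLocation, String exclude) {
        if (exclude != null && exclude.equals(trackedLocation)) {
            return null; // caller retries; this server is being shut down
        }
        return trackedLocation;
    }

    public static void main(String[] args) {
        System.out.println(rootServerOrNull("rs1,60020,123", "rs1,60020,123"));
        System.out.println(rootServerOrNull("rs2,60020,456", "rs1,60020,123"));
    }
}
```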
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403254#comment-13403254 ] Zhihong Ted Yu commented on HBASE-6226: --- @Matt: You should have selected hbase-git as the Repository. The change to prepareDecoding() looks good.
{code}
+ * @param block HFile block object
+ * @param onDiskBlock on disk bytes to be decoded
+ * @param offset data start offset in onDiskBlock
+ * @throws IOException
+ */
+ public void prepareDecoding(int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader,
{code}
The javadoc doesn't match the parameters. In HFileBlock.java, there are some stray white spaces. Here is one review request: https://reviews.apache.org/r/5643/ Feel free to create your own. move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-6284: -- Component/s: regionserver performance Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Components: performance, regionserver Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion': The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one n/w call. But within the RS, there will be N delete calls on the region, one by one. This will include N HLog writes and syncs. If these can also be grouped, we can get better performance for the multi-row delete. I have made the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. Just tested initially with a one-node cluster. In that itself I am getting a performance boost which is very promising. Only one CF and qualifier. 10K total rows deleted with a batch of 100 deletes. Only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. Will test in a 4-node cluster also. I think it will be worth doing this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6226: -- Fix Version/s: 0.96.0 Status: Patch Available (was: Open) move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403256#comment-13403256 ] ramkrishna.s.vasudevan commented on HBASE-6289: --- @Stack Any specific reason why we are verifying and then assigning the root region alone? ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } After a few moments, this RS encounters a DFS write problem and decides to abort. 
The RS then soon gets restarted from commandline, and constantly report: 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403257#comment-13403257 ] Zhihong Ted Yu commented on HBASE-6289: --- Can we have a test case for this scenario? ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } After a few moments, this RS encounters a DFS write problem and decides to abort. 
The RS then soon gets restarted from commandline, and constantly report: 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6289: -- Description: The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment.
{code}
private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException {
  long timeout = this.server.getConfiguration().
      getLong("hbase.catalog.verification.timeout", 1000);
  if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) {
    this.services.getAssignmentManager().assignRoot();
  }
}
{code}
After a few moments, this RS encounters a DFS write problem and decides to abort. The RS then soon gets restarted from the command line, and constantly reports:
{code}
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
{code}
was: The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. it calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } After a few moments, this RS encounters DFS write problem and decides to abort. The RS then soon gets restarted from commandline, and constantly report: 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment.
{code}
private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException {
  long timeout = this.server.getConfiguration().
      getLong("hbase.catalog.verification.timeout", 1000);
  if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) {
    this.services.getAssignmentManager().assignRoot();
  }
}
{code}
After a few moments, this RS encounters a DFS write problem and decides to abort. The RS then soon gets restarted from the command line, and constantly reports:
{code}
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403264#comment-13403264 ] ramkrishna.s.vasudevan commented on HBASE-6210: --- I will commit the current patch to 0.94. Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6200) KeyComparator.compareWithoutRow can be wrong when families have the same prefix
[ https://issues.apache.org/jira/browse/HBASE-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403271#comment-13403271 ] Hudson commented on HBASE-6200: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6200 KeyComparator.compareWithoutRow can be wrong when families have the same prefix (Jieshan) (Revision 1354293) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java KeyComparator.compareWithoutRow can be wrong when families have the same prefix --- Key: HBASE-6200 URL: https://issues.apache.org/jira/browse/HBASE-6200 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Blocker Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 6200-0.92.txt, 6200-0.94.txt, 6200-90.patch, 6200-trunk-v2.patch, 6200-trunk-v3.patch, 6200-trunk-v4.txt As reported by Desert Rose on IRC and on the ML, {{Result}} has a weird behavior when some families share the same prefix. He posted a link to his code to show how it fails, http://pastebin.com/7TBA1XGh Basically {{KeyComparator.compareWithoutRow}} doesn't differentiate families and qualifiers so f:a is said to be bigger than f1:, which is false. Then what happens is that the KVs are returned in the right order from the RS but then doing {{Result.binarySearch}} it uses {{KeyComparator.compareWithoutRow}} which has a different sorting so the end result is undetermined. I added some debug and I can see that the data is returned in the right order but {{Arrays.binarySearch}} returned the wrong KV, which is then verified against the passed family and qualifier which fails so null is returned. 
I don't know how frequent it is for users to have families with the same prefix, but those that do have that and that use those families at the same time will have big correctness issues. This is why I mark this as a blocker. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
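The f:a versus f1: inversion J-D describes can be reproduced with plain byte-string comparisons (plain Strings here rather than real KeyValues; the method names are made up): a comparator that concatenates family and qualifier disagrees with one that compares the family first whenever one family is a prefix of another, which is exactly why `Arrays.binarySearch` over correctly sorted KVs can land on the wrong element.

```java
// Illustrates why concatenating family+qualifier gives a different order
// than comparing family first when families share a prefix.
public class PrefixFamilyDemo {
    /** Buggy ordering: compare family+qualifier as one byte string. */
    static int concatCompare(String fam1, String q1, String fam2, String q2) {
        return Integer.signum((fam1 + q1).compareTo(fam2 + q2));
    }

    /** Correct ordering: compare the family first, then the qualifier. */
    static int familyFirstCompare(String fam1, String q1, String fam2, String q2) {
        int c = fam1.compareTo(fam2);
        return Integer.signum(c != 0 ? c : q1.compareTo(q2));
    }

    public static void main(String[] args) {
        // f:a vs f1: — the two orderings disagree, so a binary search using
        // one order over data sorted by the other misses its target.
        System.out.println(concatCompare("f", "a", "f1", ""));      // 1:  "fa" > "f1"
        System.out.println(familyFirstCompare("f", "a", "f1", "")); // -1: "f" < "f1"
    }
}
```

When no family is a prefix of another, the two comparators agree, which is why the bug only bites users whose family names share prefixes.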
[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server
[ https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403273#comment-13403273 ] Hudson commented on HBASE-5875: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-5875 Process RIT and Master restart may remove an online server considering it as a dead server Submitted by:Rajesh Reviewed by:Ram, Ted, Stack (Revision 1353690) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Process RIT and Master restart may remove an online server considering it as a dead server -- Key: HBASE-5875 URL: https://issues.apache.org/jira/browse/HBASE-5875 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.96.0, 0.94.1 Attachments: HBASE-5875.patch, HBASE-5875_0.94.patch, HBASE-5875_0.94_1.patch, HBASE-5875_0.94_2.patch, HBASE-5875_trunk.patch, HBASE-5875_trunk.patch, HBASE-5875_trunk_1.patch, HBASE-5875v2.patch If on master restart it finds the ROOT/META to be in RIT state, master tries to assign the ROOT region through ProcessRIT. Master will trigger the assignment and next will try to verify the Root Region Location. Root region location verification is done seeing if the RS has the region in its online list. If the master triggered assignment has not yet been completed in RS then the verify root region location will fail. Because it failed {code} splitLogAndExpireIfOnline(currentRootServer); {code} we do split log and also remove the server from online server list. Ideally here there is nothing to do in splitlog as no region server was restarted. So master, though the server is online, master just invalidates the region server. In a special case, if i have only one RS then my cluster will become non operative. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5360) [uberhbck] Add options for how to handle offline split parents.
[ https://issues.apache.org/jira/browse/HBASE-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403274#comment-13403274 ] Hudson commented on HBASE-5360: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-5360 [uberhbck] Add options for how to handle offline split parents, addendum (Revision 1353659) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java [uberhbck] Add options for how to handle offline split parents. Key: HBASE-5360 URL: https://issues.apache.org/jira/browse/HBASE-5360 Project: HBase Issue Type: Improvement Components: hbck Affects Versions: 0.90.7, 0.92.1, 0.94.0 Reporter: Jonathan Hsieh Assignee: Jimmy Xiang Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: 5360-0.90-hbase.patch, 5360-0.92-hbase.patch, 5360-0.94-hbase.patch, 5360_hbase_v4.patch, hbase-5360.path In a recent case, we attempted to repair a cluster that suffered from HBASE-4238 and had about 6-7 generations of leftover split data. The hbck repair options in a development version of HBASE-5128 treated HDFS as ground truth but didn't check the SPLIT and OFFLINE flags only found in meta. The net effect was that it essentially attempted to merge many regions back into its eldest generation's parent's range. More safeguards to prevent mega-merges are being added on HBASE-5128. This issue would automate the handling of the mega-merge avoiding cases such as lingering grandparents. The strategy here would be to add more checks against .META., and perform part of the catalog janitor's responsibilities for lingering grandparents. This would potentially include options to sideline regions, delete grandparent regions, a min size for sidelining, and mechanisms for cleaning .META.. 
Note: There already exists an mechanism to reload these regions -- the bulk loaded mechanisms in LoadIncrementalHFiles can be used to re-add grandparents (automatically splitting them if necessary) to HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6227) SSH and cluster startup causes data loss
[ https://issues.apache.org/jira/browse/HBASE-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403275#comment-13403275 ] Hudson commented on HBASE-6227: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6227 SSH and cluster startup causes data loss Submitted by: Chunhui Reviewed by: Ted, Ram (Revision 1354635) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java SSH and cluster startup causes data loss - Key: HBASE-6227 URL: https://issues.apache.org/jira/browse/HBASE-6227 Project: HBase Issue Type: Bug Components: master Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6227.patch, HBASE-6227v2-94.patch, HBASE-6227v2.patch In AssignmentManager#processDeadServersAndRegionsInTransition, if ServerShutdownHandler is processing and master considers it a cluster startup, master will assign all user regions; however, ServerShutdownHandler has not completed splitting logs. Let me describe it in more detail. Suppose there are two regionservers A1 and B1, and A1 carried META and ROOT:
1. master restarts and completes assignRootAndMeta
2. A1 and B1 are both restarted; the new regionservers are A2 and B2
3. the SSH processing A1 completes assigning ROOT and META
4. master rebuilds user regions and no regions are added to master's region list
5. master considers it a cluster startup, and assigns all user regions
6. the SSH processing B1 starts to split B1's logs
7. all regions' data carried by B1 would be lost.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6269) Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue
[ https://issues.apache.org/jira/browse/HBASE-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403276#comment-13403276 ] Hudson commented on HBASE-6269: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6269 Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue (Xing Shi) (Revision 1354815) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java Lazyseek should use the maxSequenseId StoreFile's KeyValue as the latest KeyValue - Key: HBASE-6269 URL: https://issues.apache.org/jira/browse/HBASE-6269 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: ShiXing Assignee: ShiXing Attachments: 6269.94, HBASE-6269-trunk-V1.patch, HBASE-6269-v1.patch, runAllTests.out.txt While fixing HBASE-6195, I happened to find that sometimes the test case would fail, https://builds.apache.org/job/HBase-0.94/259/. If there are two Puts/Increments with the same row, family, qualifier, and timestamp but different memstoreTS, and we do a memstore flush after each Put/Increment, there will be two StoreFiles with the same KeyValue (differing only in memstoreTS and SequenceId).
When I get the row, I always get the old record. The test case is like this:
{code}
public void testPutWithMemStoreFlush() throws Exception {
  Configuration conf = HBaseConfiguration.create();
  String method = "testPutWithMemStoreFlush";
  byte[] tableName = Bytes.toBytes(method);
  byte[] family = Bytes.toBytes("family");
  byte[] qualifier = Bytes.toBytes("qualifier");
  byte[] row = Bytes.toBytes("putRow");
  byte[] value = null;
  this.region = initHRegion(tableName, method, conf, family);
  Put put = null;
  Get get = null;
  List<KeyValue> kvs = null;
  Result res = null;

  put = new Put(row);
  value = Bytes.toBytes("value0");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value0 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value1");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value1 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }

  put = new Put(row);
  value = Bytes.toBytes("value2");
  put.add(family, qualifier, 1234567l, value);
  region.put(put);
  System.out.print("get value before flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
  region.flushcache();
  System.out.print("get value after flush after put value2 : ");
  get = new Get(row);
  get.addColumn(family, qualifier);
  get.setMaxVersions();
  res = this.region.get(get, null);
  kvs = res.getColumn(family, qualifier);
  for (int i = 0; i < kvs.size(); i++) {
    System.out.println(Bytes.toString(kvs.get(i).getValue()));
  }
}
{code}
and the result prints as follows:
{code}
get value before flush
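The fix in KeyValueHeap boils down to a tie-break rule: when two store files hold an otherwise identical KeyValue, the reader backed by the file with the higher sequence id must sort first so that its value is returned as the latest version. A minimal standalone sketch of that rule, with hypothetical names rather than HBase's actual KeyValueHeap classes:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch only (hypothetical names, not HBase's real classes): when two store
// files contain the same key (same row/family/qualifier/timestamp), the file
// with the higher sequence id was flushed later, so its KeyValue must win.
public class SeqIdTieBreak {
    static class FileKv {
        final String key;     // flattened row/family/qualifier/timestamp
        final String value;
        final long fileSeqId; // sequence id of the containing store file
        FileKv(String key, String value, long fileSeqId) {
            this.key = key; this.value = value; this.fileSeqId = fileSeqId;
        }
    }

    // Order by key first; on a tie, the higher sequence id (newer file) first.
    static final Comparator<FileKv> CMP = Comparator
        .comparing((FileKv kv) -> kv.key)
        .thenComparing(kv -> kv.fileSeqId, Comparator.reverseOrder());

    static String latest(FileKv... kvs) {
        PriorityQueue<FileKv> heap = new PriorityQueue<>(CMP);
        for (FileKv kv : kvs) heap.add(kv);
        return heap.peek().value; // heap head = newest version of smallest key
    }

    public static void main(String[] args) {
        FileKv older = new FileKv("putRow/family/qualifier/1234567", "value0", 5L);
        FileKv newer = new FileKv("putRow/family/qualifier/1234567", "value1", 6L);
        System.out.println(latest(older, newer)); // prints value1
    }
}
```

Without the reversed sequence-id comparison, whichever file the heap happens to poll first wins, which is exactly the "always got the old records" symptom above.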
[jira] [Commented] (HBASE-6240) Race in HCM.getMaster stalls clients
[ https://issues.apache.org/jira/browse/HBASE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403278#comment-13403278 ] Hudson commented on HBASE-6240: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6240 Race in HCM.getMaster stalls clients Submitted by: J-D, Ram Reviewed by: J-D, Ted (Revision 1354116) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java Race in HCM.getMaster stalls clients Key: HBASE-6240 URL: https://issues.apache.org/jira/browse/HBASE-6240 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.1 Attachments: HBASE-6240.patch, HBASE-6240_1_0.94.patch I found this issue trying to run YCSB on 0.94; I don't think it exists on any other branch. I believe that this was introduced in HBASE-5058 (Allow HBaseAdmin to use an existing connection). The issue is that HCM.getMaster follows this recipe: # Check whether the master is non-null and running (if so, return it) # Grab a lock on masterLock # Nullify this.master # Try to get a new master The issue happens at step 3: it should re-run step 1, since while you were waiting on the lock someone else could have already fixed it for you. What happens right now is that the threads are all able to set the master to null before others are able to get out of getMaster, and it's a complete mess. Figuring it out took me some time because it doesn't manifest itself right away; silent retries are done in the background.
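The missing re-check is the classic double-checked locking repair: test the condition again after acquiring the lock, before tearing anything down. A minimal standalone sketch of the fix (hypothetical class, not the actual HConnectionManager code):

```java
// Sketch of the fix described above (hypothetical class, not the real
// HConnectionManager): after acquiring masterLock, re-check whether another
// thread already repaired the master reference before nulling and reconnecting.
public class GetMasterSketch {
    private final Object masterLock = new Object();
    private volatile String master; // stand-in for the master proxy
    int reconnects = 0;             // counts how many real reconnects happened

    String getMaster() {
        if (master != null) return master;     // step 1: fast path
        synchronized (masterLock) {            // step 2: take the lock
            if (master != null) return master; // the missing re-check of step 1
            master = null;                     // step 3: drop the stale proxy
            master = connect();                // step 4: reconnect
            return master;
        }
    }

    private String connect() { reconnects++; return "master-proxy"; }
}
```

With the re-check, N threads racing through getMaster produce a single reconnect; without it, each thread that wins the lock nullifies and rebuilds the proxy the previous one just repaired.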
Basically the first clue was this: {noformat} Error doing get: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions: Tue Jun 19 23:40:46 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:47 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:48 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:49 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:51 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:53 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:40:57 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:01 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:09 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed Tue Jun 19 23:41:25 UTC 2012, org.apache.hadoop.hbase.client.HTable$3@571a4bd4, java.io.IOException: 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2eb0a3f5 closed {noformat} This was caused by the little dance up in HBaseAdmin where it deletes stale connections... which are not stale at all.
[jira] [Commented] (HBASE-6267) hbase.store.delete.expired.storefile should be true by default
[ https://issues.apache.org/jira/browse/HBASE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403277#comment-13403277 ] Hudson commented on HBASE-6267: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6267. hbase.store.delete.expired.storefile should be true by default (Revision 1353813) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java hbase.store.delete.expired.storefile should be true by default -- Key: HBASE-6267 URL: https://issues.apache.org/jira/browse/HBASE-6267 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6267-0.94.patch, HBASE-6267.patch HBASE-5199 introduces this logic into Store:
{code}
+    // Delete the expired store files before the compaction selection.
+    if (conf.getBoolean("hbase.store.delete.expired.storefile", false)
+        && (ttl != Long.MAX_VALUE) && (this.scanInfo.minVersions == 0)) {
+      CompactSelection expiredSelection = compactSelection
+          .selectExpiredStoreFilesToCompact(
+              EnvironmentEdgeManager.currentTimeMillis() - this.ttl);
+
+      // If there are any expired store files, delete them by compaction.
+      if (expiredSelection != null) {
+        return expiredSelection;
+      }
+    }
{code}
Is there any reason why that should not default to {{true}}?
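What the flag enables is cheap: a store file whose newest cell is already past the TTL contains only expired data and can be dropped wholesale instead of being rewritten by a normal compaction. A rough sketch of that selection, under assumed simplified types (not HBase's actual Store/CompactSelection classes):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the selection the patch turns on by default (hypothetical types,
// not HBase's real ones): a store file whose newest cell timestamp is older
// than (now - ttl) holds only expired data, so it can be deleted outright.
public class ExpiredFileSelection {
    static class StoreFile {
        final String name;
        final long maxTimestamp; // newest cell timestamp in the file
        StoreFile(String name, long maxTimestamp) {
            this.name = name; this.maxTimestamp = maxTimestamp;
        }
    }

    static List<StoreFile> selectExpired(List<StoreFile> files, long now, long ttl) {
        long cutoff = now - ttl;
        List<StoreFile> expired = new ArrayList<>();
        for (StoreFile f : files) {
            if (f.maxTimestamp < cutoff) expired.add(f); // whole file past TTL
        }
        return expired;
    }
}
```

The per-file max timestamp makes the check O(number of files) with no data read, which is why defaulting it on is low-risk for stores that never set a TTL (ttl stays Long.MAX_VALUE and the branch is skipped).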
[jira] [Commented] (HBASE-6276) TestClassLoading is racy
[ https://issues.apache.org/jira/browse/HBASE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403272#comment-13403272 ] Hudson commented on HBASE-6276: --- Integrated in HBase-0.94-security #38 (See [https://builds.apache.org/job/HBase-0.94-security/38/]) HBASE-6276. TestClassLoading is racy (Revision 1354256) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java TestClassLoading is racy Key: HBASE-6276 URL: https://issues.apache.org/jira/browse/HBASE-6276 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-6276-0.94.patch, HBASE-6276.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403280#comment-13403280 ] Andrew Wang commented on HBASE-6261: I don't think performance is very sensitive to the buffer size, it's just a way of batching inserts for efficiency. Definitely doesn't affect accuracy because I have it call insertBatch() on every query(). We can maintain the compress count and track the # items removed, but I don't know if it's really worth exposing to the user (metrics for our metrics?). I think it's nice for testing though, so I'll try to expose it internally. I've never seen compress() fail to remove any items, but I guess this could happen with some adversarial pattern. I don't think you can do much about it though, since the algo needs those items to maintain the error bounds. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. 
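The buffering Andrew describes can be sketched as follows (hypothetical class, not the patch's actual code): single observations land in a small fixed buffer that is sorted and merged into the main sample in one pass, either when it fills or right before a quantile query, which is why the buffer size affects efficiency but never accuracy:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of batched inserts into a sorted sample (hypothetical class). The
// buffer amortizes insertion cost; query() flushes it first, so pending
// values are always counted and buffering cannot change the answer.
public class BufferedSample {
    private final List<Long> sample = new ArrayList<>(); // kept sorted
    private final List<Long> buffer = new ArrayList<>();
    private final int bufferSize;

    BufferedSample(int bufferSize) { this.bufferSize = bufferSize; }

    void insert(long v) {
        buffer.add(v);
        if (buffer.size() >= bufferSize) insertBatch();
    }

    private void insertBatch() {
        Collections.sort(buffer);
        // One merge pass instead of one binary search + shift per insert.
        List<Long> merged = new ArrayList<>(sample.size() + buffer.size());
        int i = 0, j = 0;
        while (i < sample.size() && j < buffer.size()) {
            merged.add(sample.get(i) <= buffer.get(j) ? sample.get(i++) : buffer.get(j++));
        }
        while (i < sample.size()) merged.add(sample.get(i++));
        while (j < buffer.size()) merged.add(buffer.get(j++));
        sample.clear();
        sample.addAll(merged);
        buffer.clear();
    }

    long query(double quantile) {
        insertBatch(); // flush first: pending inserts must be visible
        int idx = (int) Math.ceil(quantile * sample.size()) - 1;
        return sample.get(Math.max(idx, 0));
    }
}
```

A real GK-style estimator would also compress the sample to bound memory while keeping the error guarantee; this sketch only illustrates the buffer/insertBatch interaction discussed in the comment.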
[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403279#comment-13403279 ] Zhihong Ted Yu commented on HBASE-6284: --- Please put the patch on review board - it is of decent size. Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Components: performance, regionserver Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under the thread 'Can there be a doMiniBatchDelete in HRegion': HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there will be N delete calls on the region, one by one, which means N HLog writes and syncs. If these could also be grouped, we could get better performance for multi-row deletes. I have written the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. I have only tested initially with a one-node cluster; even there I am getting a very promising performance boost. Only one CF and qualifier, 10K total rows deleted with a batch of 100 deletes, with only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. I will test in a 4-node cluster also. I think it is worth doing this change.
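The win Anoop measures comes from amortizing the log sync, which a toy model makes plain (hypothetical WAL stand-in, not HRegion's actual code): N single-row deletes pay N syncs, while one mini-batch of N edits pays a single sync.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of why a mini-batch helps (hypothetical WAL stand-in, not the
// real HLog): the sync count, not the edit count, dominates latency, and a
// batch appends all edits before syncing once.
public class MiniBatchSketch {
    int syncs = 0;
    final List<String> log = new ArrayList<>();

    // One delete per call: one log append, one sync.
    void deleteOne(String row) {
        log.add("delete:" + row);
        sync();
    }

    // Mini-batch: append all edits, then sync once for the whole group.
    void deleteBatch(List<String> rows) {
        for (String row : rows) log.add("delete:" + row);
        sync();
    }

    private void sync() { syncs++; } // stand-in for an HLog sync to disk
}
```

Both paths write the same edits to the log; only the number of (expensive) syncs changes, which is the same trade the existing doMiniBatchPut makes for puts.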
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403284#comment-13403284 ] Zhihong Ted Yu commented on HBASE-6261: --- bq. algo needs those items to maintain the error bounds. Right. That's why I was looking for data structure that can grow in size. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94 and 0.92?
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403292#comment-13403292 ] ramkrishna.s.vasudevan commented on HBASE-6210: --- Committed to 0.94 Backport HBASE-6197 to 0.94 and 0.92? - Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6284: -- Status: Patch Available (was: Open) Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Components: performance, regionserver Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under the thread 'Can there be a doMiniBatchDelete in HRegion': HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there will be N delete calls on the region, one by one, which means N HLog writes and syncs. If these could also be grouped, we could get better performance for multi-row deletes. I have written the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. I have only tested initially with a one-node cluster; even there I am getting a very promising performance boost. Only one CF and qualifier, 10K total rows deleted with a batch of 100 deletes, with only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. I will test in a 4-node cluster also. I think it is worth doing this change.
[jira] [Updated] (HBASE-6210) Backport HBASE-6197 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6210: -- Summary: Backport HBASE-6197 to 0.94 (was: Backport HBASE-6197 to 0.94 and 0.92?) Backport HBASE-6197 to 0.94 --- Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403300#comment-13403300 ] Matt Corgan commented on HBASE-5961: Jesse - I was using this and noticed the comments are still broken at 80 characters. Unless it's intentional, there's a setting at the bottom of the Comments tab. New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Attachments: HBase-Formmatter.xml There is currently no good way of passing around the formatter that is currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we consider 'good'/'pretty' code. Further, it's not trivial to get a good formatter set up. Proposing two things: 1) Add the formatter to the dev tools and call out its usage in the docs. 2) Move to a 'better' formatter that is not the standard Apache formatter.
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403304#comment-13403304 ] Andrew Wang commented on HBASE-6261: Yea, everything ultimately goes into the {{sample}} LinkedList. The fixed size {{buffer}} is just used to do more efficient batch inserts into {{sample}}. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403307#comment-13403307 ] stack commented on HBASE-6289: -- bq. Any specific reason why we are verifying and then assigning the root region alone? Are you asking why we assign the root, then the meta, ahead of all other assignments? If so, it's because these need to be assigned for sure before any other assignments will complete. Maybe you were asking something else? @Ted A test would be hard to get in here, methinks, for the startup code. It would take a bunch of mocking. We, the hbase core, should make it easier on folks mocking up these scenarios by building the necessary underpinnings before we can expect the likes of Maryann to deliver a unit test (that's my opinion). ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment.
{code}
private void verifyAndAssignRoot()
    throws InterruptedException, IOException, KeeperException {
  long timeout = this.server.getConfiguration()
      .getLong("hbase.catalog.verification.timeout", 1000);
  if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) {
    this.services.getAssignmentManager().assignRoot();
  }
}
{code}
After a few moments, this RS encounters a DFS write problem and decides to abort.
The RS is then soon restarted from the command line, and constantly reports:
{code}
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0
{code}
[jira] [Commented] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403316#comment-13403316 ] Jesse Yates commented on HBASE-5961: @Matt - thanks for the heads up. I'll post a new version. New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Attachments: HBase-Formmatter.xml There is currently no good way of passing out the formmatter currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we are considering 'good'/'pretty' code. Further, its not trivial to get a good formatter setup. Proposing two things: 1) Adding a formmatter to the dev tools and calling out the formmatter usage in the docs 2) Move to a 'better' formmatter that is not the standard apache formmatter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-5961: --- Attachment: (was: HBase-Formmatter.xml) New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Attachments: HBase-Formmatter.xml There is currently no good way of passing out the formmatter currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we are considering 'good'/'pretty' code. Further, its not trivial to get a good formatter setup. Proposing two things: 1) Adding a formmatter to the dev tools and calling out the formmatter usage in the docs 2) Move to a 'better' formmatter that is not the standard apache formmatter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-5961: --- Attachment: HBase-Formmatter.xml Replacing formatter with fix for comment width. New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Attachments: HBase-Formmatter.xml There is currently no good way of passing around the formatter that is currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we consider 'good'/'pretty' code. Further, it's not trivial to get a good formatter set up. Proposing two things: 1) Add the formatter to the dev tools and call out its usage in the docs. 2) Move to a 'better' formatter that is not the standard Apache formatter.
[jira] [Commented] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row
[ https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403319#comment-13403319 ] stack commented on HBASE-6239: -- +1 on patch. Pity this is so ugly:
{code}
+    HLog.Entry[] entries = new HLog.Entry[3];
+    long now = System.currentTimeMillis();
+    for (int i = 0; i < 3; i++) {
+      entries[i] = createEntry(TABLE_NAME1, 1, i, KeyValue.Type.Put, now + i);
+    }
+    // Kinda ugly, trying to merge all the entries into one
+    entries[0].getEdit().add(entries[1].getEdit().getKeyValues().get(0));
+    entries[0].getEdit().add(entries[2].getEdit().getKeyValues().get(0));
+    HLog.Entry[] entry = new HLog.Entry[1];
+    entry[0] = entries[0];
+    SINK.replicateEntries(entry);
{code}
...but it's not a blocker. Commit. Backport to 0.94? [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row --- Key: HBASE-6239 URL: https://issues.apache.org/jira/browse/HBASE-6239 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Labels: corruption Fix For: 0.92.2 Attachments: HBASE-6239-0.92-v1.patch ReplicationSink assumes that all the KVs for the same row inside a WALEdit will have the same timestamp, which is not necessarily the case. This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94
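The bug is easiest to see as a timestamp-preservation property: when the sink rebuilds a row's mutations from a WALEdit, each KV must keep its own timestamp rather than inheriting the first KV's. A sketch with hypothetical types (not the real ReplicationSink) contrasting the buggy and fixed behavior:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the bug described above (hypothetical types, not the real
// ReplicationSink): stamping every KV in a row's WALEdit with the first KV's
// timestamp silently rewrites history on the slave cluster.
public class SinkTsSketch {
    static class KV {
        final String qual;
        final long ts;
        KV(String qual, long ts) { this.qual = qual; this.ts = ts; }
    }

    // Buggy variant: one timestamp, taken from the first KV, for the whole row.
    static List<Long> applyBuggy(List<KV> edit) {
        List<Long> applied = new ArrayList<>();
        long ts = edit.get(0).ts;           // first KV's timestamp
        for (KV kv : edit) applied.add(ts); // reused for every following KV
        return applied;
    }

    // Fixed variant: preserve each KV's own timestamp.
    static List<Long> applyFixed(List<KV> edit) {
        List<Long> applied = new ArrayList<>();
        for (KV kv : edit) applied.add(kv.ts);
        return applied;
    }
}
```

The "ugly" test above builds exactly this situation: three entries with timestamps now, now+1, and now+2 merged into one WALEdit, then checks that all three survive replication intact.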
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403328#comment-13403328 ] Hadoop QA commented on HBASE-6226: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533779/HBASE-6226-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2282//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2282//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2282//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2282//console This message is automatically generated. 
move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6203) Create hbase-it
[ https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403357#comment-13403357 ] Zhihong Ted Yu commented on HBASE-6203: --- Can one integration test be added to the patch ? Create hbase-it --- Key: HBASE-6203 URL: https://issues.apache.org/jira/browse/HBASE-6203 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HBASE-6203_v1.patch Create hbase-it, as per parent issue, and re-introduce HBASE-4454 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6291) Don't retry increments on an invalid cell
Jean-Daniel Cryans created HBASE-6291: - Summary: Don't retry increments on an invalid cell Key: HBASE-6291 URL: https://issues.apache.org/jira/browse/HBASE-6291 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0, 0.92.1, 0.90.6 Reporter: Jean-Daniel Cryans Fix For: 0.90.7, 0.92.2, 0.94.1 This says it all: {noformat} ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions: Thu Jun 28 18:34:44 UTC 2012, org.apache.hadoop.hbase.client.HTable$8@4eabaf8c, java.io.IOException: java.io.IOException: Attempted to increment field that isn't 64 bits wide {noformat} {{HRegion}} should be modified here to throw a DoNotRetryIOException: {code} if (wrongLength) { throw new DoNotRetryIOException("Attempted to increment field that isn't 64 bits wide"); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
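The reason retrying is pointless is that the width check is deterministic: the stored value either is or isn't 8 bytes, and no retry changes that. A minimal standalone sketch of the behavior, with a local exception class standing in for the real DoNotRetryIOException:

```java
import java.nio.ByteBuffer;

public class IncrementCheckSketch {
    // Local stand-in for org.apache.hadoop.hbase.DoNotRetryIOException.
    static class DoNotRetryException extends RuntimeException {
        DoNotRetryException(String msg) { super(msg); }
    }

    // A cell is incrementable only if its stored value is exactly one
    // long (8 bytes). The check is deterministic, so retrying the same
    // increment can never succeed -- hence "do not retry".
    static long increment(byte[] storedValue, long amount) {
        if (storedValue.length != Long.BYTES) {
            throw new DoNotRetryException(
                "Attempted to increment field that isn't 64 bits wide");
        }
        return ByteBuffer.wrap(storedValue).getLong() + amount;
    }

    public static void main(String[] args) {
        byte[] ok = ByteBuffer.allocate(8).putLong(41L).array();
        System.out.println(increment(ok, 1L)); // prints 42
        try {
            increment("short".getBytes(), 1L); // 5 bytes wide: fails fast
        } catch (DoNotRetryException e) {
            System.out.println(e.getMessage());
        }
    }
}
```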
[jira] [Commented] (HBASE-6203) Create hbase-it
[ https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403361#comment-13403361 ] Enis Soztutar commented on HBASE-6203: -- How about waiting for HBASE-6241 and committing this and that consecutively? Create hbase-it --- Key: HBASE-6203 URL: https://issues.apache.org/jira/browse/HBASE-6203 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HBASE-6203_v1.patch Create hbase-it, as per parent issue, and re-introduce HBASE-4454 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6203) Create hbase-it
[ https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403369#comment-13403369 ] Zhihong Ted Yu commented on HBASE-6203: --- How about bundling patch here into patch for HBASE-6241 ? Create hbase-it --- Key: HBASE-6203 URL: https://issues.apache.org/jira/browse/HBASE-6203 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HBASE-6203_v1.patch Create hbase-it, as per parent issue, and re-introduce HBASE-4454 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6197) HRegion's append operation may lose data
[ https://issues.apache.org/jira/browse/HBASE-6197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403377#comment-13403377 ] Hudson commented on HBASE-6197: --- Integrated in HBase-0.94 #287 (See [https://builds.apache.org/job/HBase-0.94/287/]) HBASE-6210 Backport HBASE-6197 to 0.94 and 0.92? Submitted by: Ram Reviewed by: Stack (Revision 1355087) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java HRegion's append operation may lose data Key: HBASE-6197 URL: https://issues.apache.org/jira/browse/HBASE-6197 Project: HBase Issue Type: Bug Components: regionserver Reporter: ShiXing Assignee: ShiXing Fix For: 0.96.0 Attachments: HBASE-6197-trunk-V1.patch Like HBASE-6195, when flushing, the append thread may read out the old value because the larger timestamp is in the snapshot and the smaller timestamp is in the memstore. We should make the first-in thread generate the smaller timestamp. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
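The invariant the fix needs, that the first thread in gets the smaller timestamp, amounts to handing out non-decreasing timestamps in entry order. A hedged sketch of that ordering property only (a plain monotonic clock, not the actual HRegion change):

```java
import java.util.concurrent.atomic.AtomicLong;

public class MonotonicClock {
    private final AtomicLong last = new AtomicLong(0L);

    // Returns a timestamp strictly greater than any previously returned
    // one, so concurrent appenders receive increasing ts in the order
    // their calls complete -- a flushed snapshot can then never hold a
    // newer ts than a value still sitting in the memstore.
    long next() {
        final long now = System.currentTimeMillis();
        return last.updateAndGet(prev -> Math.max(now, prev + 1));
    }

    public static void main(String[] args) {
        MonotonicClock clock = new MonotonicClock();
        long first = clock.next();
        long second = clock.next();
        System.out.println(second > first); // true
    }
}
```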
[jira] [Commented] (HBASE-6210) Backport HBASE-6197 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403378#comment-13403378 ] Hudson commented on HBASE-6210: --- Integrated in HBase-0.94 #287 (See [https://builds.apache.org/job/HBase-0.94/287/]) HBASE-6210 Backport HBASE-6197 to 0.94 and 0.92? Submitted by:Ram Reviewed by:Stack (Revision 1355087) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java Backport HBASE-6197 to 0.94 --- Key: HBASE-6210 URL: https://issues.apache.org/jira/browse/HBASE-6210 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.1 Attachments: HBASE-6210.patch Backport HBASE-6197 'HRegion's append operation may lose data' and the accompanying HBASE-6195 to 0.94 and 0.92 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403384#comment-13403384 ] Hadoop QA commented on HBASE-6284: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533845/HBASE-6284_Trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2283//console This message is automatically generated. Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Components: performance, regionserver Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion': The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there will be N separate delete calls on the region, one by one.
This will include N HLog writes and syncs. If these can also be grouped, we can get better performance for multi-row deletes. I have written the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. Just tested initially with a one-node cluster; even there I am getting a very promising performance boost. Only one CF and qualifier; 10K total rows deleted with a batch of 100 deletes. Only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. Will test in a 4-node cluster also. I think this change will be worth doing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
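The performance argument above can be sketched with a toy WAL that just counts syncs (`WalStub` is a hypothetical stand-in, not the real HLog API): N single-row deletes cost N syncs, while one mini-batch costs one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MiniBatchDeleteSketch {
    // Toy WAL that only counts how many times it is synced.
    static class WalStub {
        int syncs = 0;
        void append(List<String> edits) { /* buffer the grouped edit */ }
        void sync() { syncs++; }
    }

    // One-by-one path: one WAL append + sync per delete.
    static void deleteOneByOne(WalStub wal, List<String> rows) {
        for (String row : rows) {
            wal.append(Collections.singletonList(row));
            wal.sync();
        }
    }

    // Mini-batch path: one grouped append + one sync for the whole batch.
    static void miniBatchDelete(WalStub wal, List<String> rows) {
        wal.append(rows);
        wal.sync();
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4");
        WalStub oneByOne = new WalStub();
        deleteOneByOne(oneByOne, rows);
        WalStub batched = new WalStub();
        miniBatchDelete(batched, rows);
        System.out.println(oneByOne.syncs + " vs " + batched.syncs); // 4 vs 1
    }
}
```

Since the WAL sync is the dominant per-operation cost, collapsing N syncs into one is where the reported speedup comes from.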
[jira] [Commented] (HBASE-6274) Proto files should be in the same palce
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403383#comment-13403383 ] Gregory Chanan commented on HBASE-6274: --- Looks fine to me, but HBASE-6000 suggests everything should be under src/resources. Proto files should be in the same palce --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6274) Proto files should be in the same palce
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403398#comment-13403398 ] Elliott Clark commented on HBASE-6274: -- In the end I think that bug settled on putting everything in src/main/protobuf since idl does not need to be in the resulting jars. Proto files should be in the same palce --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6274) Proto files should be in the same palce
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403400#comment-13403400 ] Gregory Chanan commented on HBASE-6274: --- @Elliot: You are right, +1 on this patch then. Proto files should be in the same palce --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6274) Proto files should be in the same palce
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6274: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Greg and Elliott. It is pushed to trunk. Proto files should be in the same palce --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Corgan updated HBASE-6226: --- Attachment: HBASE-6226-v3.patch Attaching v3 patch which is also up for review at https://reviews.apache.org/r/5648/ Thanks for the ReviewBoard help Ted. move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6285) HBase master should log INFO message when it attempts to assign a region
[ https://issues.apache.org/jira/browse/HBASE-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Kishore updated HBASE-6285: -- Fix Version/s: 0.96.0 Status: Patch Available (was: Open) The attached patch is for trunk. HBase master should log INFO message when it attempts to assign a region Key: HBASE-6285 URL: https://issues.apache.org/jira/browse/HBASE-6285 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6285_trunk.patch With the default logging level (INFO), it is very difficult to diagnose a large HBase cluster that is having problems assigning regions because the HBase master logs a DEBUG message when it instructs a region-server to assign a region. You actually have to crawl EVERY HBase region-server log to find out which node received the request for a particular region. Further, let's say the HBase master sends the request and something goes wrong; we might not even get a message in the region-server log. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403456#comment-13403456 ] Zhihong Ted Yu commented on HBASE-6261: --- Makes sense. Looking forward to the patch. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
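As a baseline for what the streaming estimators must approximate, here is a naive exact percentile over a bounded window of latency samples; the sketches cited in [1] and [2] trade a small bounded error for far less memory than this holds:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PercentileBaseline {
    // Exact p-th percentile (0 < p <= 1) over a window of samples,
    // using the nearest-rank definition. O(n log n) time and O(n)
    // memory -- the cost the streaming sketches avoid.
    static long percentile(List<Long> window, double p) {
        List<Long> sorted = new ArrayList<Long>(window);
        Collections.sort(sorted);
        int idx = (int) Math.ceil(p * sorted.size()) - 1;
        return sorted.get(Math.max(idx, 0));
    }

    public static void main(String[] args) {
        List<Long> latencies = new ArrayList<Long>();
        for (long i = 1; i <= 100; i++) latencies.add(i);
        System.out.println(percentile(latencies, 0.99)); // prints 99
        System.out.println(percentile(latencies, 0.90)); // prints 90
    }
}
```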
[jira] [Commented] (HBASE-6274) Proto files should be in the same palce
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403466#comment-13403466 ] Hudson commented on HBASE-6274: --- Integrated in HBase-TRUNK #3084 (See [https://builds.apache.org/job/HBase-TRUNK/3084/]) HBASE-6274 Proto files should be in the same palce (Revision 1355129) Result = SUCCESS jxiang : Files : * /hbase/trunk/hbase-server/src/main/protobuf/Admin.proto * /hbase/trunk/hbase-server/src/main/protobuf/Client.proto * /hbase/trunk/hbase-server/src/main/protobuf/RegionServerStatus.proto * /hbase/trunk/hbase-server/src/protobuf Proto files should be in the same palce --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403471#comment-13403471 ] Hadoop QA commented on HBASE-6226: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533882/HBASE-6226-v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2284//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2284//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2284//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2284//console This message is automatically generated. 
move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6285) HBase master should log INFO message when it attempts to assign a region
[ https://issues.apache.org/jira/browse/HBASE-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403478#comment-13403478 ] Hadoop QA commented on HBASE-6285: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533708/HBASE-6285_trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2285//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2285//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2285//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2285//console This message is automatically generated. 
HBase master should log INFO message when it attempts to assign a region Key: HBASE-6285 URL: https://issues.apache.org/jira/browse/HBASE-6285 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6285_trunk.patch With the default logging level (INFO), it is very difficult to diagnose a large HBase cluster that is having problems assigning regions because the HBase master logs a DEBUG message when it instructs a region-server to assign a region. You actually have to crawl EVERY HBase region-server log to find out which node received the request for a particular region. Further, let's say the HBase master sends the request and something goes wrong; we might not even get a message in the region-server log. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Corgan updated HBASE-6226: --- Attachment: HBASE-6226-v4.patch Thanks for the review Ted. Attaching v4 patch with license year removed move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403533#comment-13403533 ] Zhihong Ted Yu commented on HBASE-6226: --- Thanks for the quick turn around. Will wait for 2 days to see if Mikhail has further comments. move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403535#comment-13403535 ] Zhihong Ted Yu commented on HBASE-6284: --- I went over the patch once. It looks good. Introduce HRegion#doMiniBatchDelete() - Key: HBASE-6284 URL: https://issues.apache.org/jira/browse/HBASE-6284 Project: HBase Issue Type: Bug Components: performance, regionserver Reporter: Zhihong Ted Yu Assignee: Anoop Sam John Fix For: 0.96.0, 0.94.1 Attachments: HBASE-6284_Trunk.patch From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion': The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one network call. But within the RS, there will be N separate delete calls on the region, one by one. This will include N HLog writes and syncs. If these can also be grouped, we can get better performance for multi-row deletes. I have written the new miniBatchDelete() and made HTable#delete(List<Delete>) call this new batch delete. Just tested initially with a one-node cluster; even there I am getting a very promising performance boost. Only one CF and qualifier; 10K total rows deleted with a batch of 100 deletes. Only deletes happening on the table from one thread. With the new way the net time taken is reduced by more than 1/10. Will test in a 4-node cluster also. I think this change will be worth doing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6285) HBase master should log INFO message when it attempts to assign a region
[ https://issues.apache.org/jira/browse/HBASE-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403541#comment-13403541 ] Aditya Kishore commented on HBASE-6285: --- The test failures are unrelated to the patch. HBase master should log INFO message when it attempts to assign a region Key: HBASE-6285 URL: https://issues.apache.org/jira/browse/HBASE-6285 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6285_trunk.patch With the default logging level (INFO), it is very difficult to diagnose a large HBase cluster that is having problems assigning regions, because the HBase master logs a DEBUG message when it instructs a region-server to assign a region. You actually have to crawl EVERY HBase region-server log to find out which node received the request for a particular region. Further, let's say the HBase master sends the request and something goes wrong; we might not even get a message in the region-server log. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403542#comment-13403542 ] Zhihong Ted Yu commented on HBASE-4050: --- 3 weeks flew by :-) Update HBase metrics framework to metrics2 framework Key: HBASE-4050 URL: https://issues.apache.org/jira/browse/HBASE-4050 Project: HBase Issue Type: New Feature Components: metrics Affects Versions: 0.90.4 Environment: Java 6 Reporter: Eric Yang Priority: Critical Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, and it might get removed in future Hadoop release. Hence, HBase needs to revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5955) Guava 11 drops MapEvictionListener and Hadoop 2.0.0-alpha requires it
[ https://issues.apache.org/jira/browse/HBASE-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403565#comment-13403565 ] Lars Hofhansl commented on HBASE-5955: -- I find that I get runtime errors with Hadoop-2 (but not compile time errors): {code} java.lang.NoSuchMethodError: com.google.common.collect.LinkedListMultimap.values()Ljava/util/List; at org.apache.hadoop.hdfs.SocketCache.clear(SocketCache.java:131) at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:604) at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:506) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2237) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2215) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:158) at org.apache.hadoop.hbase.fs.HFileSystem.init(HFileSystem.java:62) at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:974) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:619) {code} The signature of LinkedListMultimap.values() has changed to return a List rather than a Collection. That would imply that switching out Guava would fail with the reverse problem on Hadoop-1 and before. Sigh. So this would need to be profile-dependent? Guava 11 drops MapEvictionListener and Hadoop 2.0.0-alpha requires it - Key: HBASE-5955 URL: https://issues.apache.org/jira/browse/HBASE-5955 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Andrew Purtell Assignee: Lars Hofhansl Fix For: 0.94.1 Hadoop 2.0.0-alpha depends on Guava 11.0.2.
Updating HBase dependencies to match produces the following compilation errors: {code} [ERROR] SingleSizeCache.java:[41,32] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: package com.google.common.collect [ERROR] [ERROR] SingleSizeCache.java:[94,4] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: class org.apache.hadoop.hbase.io.hfile.slab.SingleSizeCache [ERROR] [ERROR] SingleSizeCache.java:[94,69] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: class org.apache.hadoop.hbase.io.hfile.slab.SingleSizeCache {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403568#comment-13403568 ] Mikhail Bautin commented on HBASE-6226: --- Matt: I have submitted the diff to Phabricator as https://reviews.facebook.net/D3891. I will add my comments there. move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403571#comment-13403571 ] Andrew Wang commented on HBASE-6261: I filed HADOOP-8541, since this is going to be landing in hadoop-common's metrics2. When HBASE-5040 clears, we can look into actually hooking it up in HBase. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
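For context, the naive estimator the issue wants to improve on looks roughly like the sketch below: retain a sample of latencies, sort, and index. It is exact on whatever it retains, but memory and sort cost grow with the sample, and accuracy at the tail depends entirely on the sampling — which is why the cited bounded-error sketching papers matter. Class and method names here are illustrative, not HBase's.

```java
import java.util.Arrays;

// Naive "keep samples, sort, index" percentile estimation. Fine for small
// sample counts; the tail percentiles degrade as sampling discards values.
public class NaivePercentile {
    // nearest-rank percentile over the retained samples (p in (0, 1])
    static long percentile(long[] samples, double p) {
        long[] copy = Arrays.copyOf(samples, samples.length);
        Arrays.sort(copy);
        int idx = (int) Math.ceil(p * copy.length) - 1;
        return copy[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        long[] latencies = new long[1000];
        for (int i = 0; i < latencies.length; i++) {
            latencies[i] = i + 1; // synthetic latencies 1..1000 ms
        }
        System.out.println("p90 = " + percentile(latencies, 0.90));
        System.out.println("p99 = " + percentile(latencies, 0.99));
    }
}
```

On the synthetic 1..1000 ms data this reports p90 = 900 and p99 = 990; the point of the proposed work is to get comparable tail accuracy without storing and sorting raw samples.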
[jira] [Updated] (HBASE-6281) Assignment need not be called for disabling table regions during clean cluster start up.
[ https://issues.apache.org/jira/browse/HBASE-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6281: -- Attachment: 6281-trunk-v2.txt Assignment need not be called for disabling table regions during clean cluster start up. Key: HBASE-6281 URL: https://issues.apache.org/jira/browse/HBASE-6281 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6281-trunk-v2.txt, HBASE-6281_94.patch, HBASE-6281_trunk.patch Currently during clean cluster start up if there are tables in DISABLING state, we do bulk assignment through assignAllUserRegions() and after a region is OPENED in the RS, the master checks if the table is in DISABLING/DISABLED state (in Am.regionOnline) and again calls unassign. This roundtrip can be avoided even before calling assignment. This JIRA is to address the above scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403572#comment-13403572 ] Mikhail Bautin commented on HBASE-6226: --- Matt: I have added one comment at https://reviews.facebook.net/D3891 move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403574#comment-13403574 ] Mikhail Bautin commented on HBASE-6226: --- Matt: regarding the Arcanist issue: have you run mvn -Darc initialize? It would download and install .arc_jira_lib in your HBase directory. Phabricator is a more convenient code review system compared to ReviewBoard or reviewing raw patches, because e.g. it correctly identifies renamed files (relevant to this particular diff). move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver. The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403576#comment-13403576 ] Elliott Clark commented on HBASE-6261: -- If it only lands in hadoop are we going to be able to use it at all? Reflection doesn't seem like it's really viable here where we're trying to call the same method on lots of different Histogram objects; it would be pretty slow on top of the perf hit we would be taking for the added accuracy. Can it just replace MetricsHistogram ? Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5955) Guava 11 drops MapEvictionListener and Hadoop 2.0.0-alpha requires it
[ https://issues.apache.org/jira/browse/HBASE-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403580#comment-13403580 ] Andrew Purtell commented on HBASE-5955: --- In the POMs I have on hand for Hadoop 2.0.1 and HBase 0.94, the Guava version for both is 11.0.2 (via HDFS-3187 and HBASE-5739). I'm guessing you have a Hadoop compiled against something older? Guava 11 drops MapEvictionListener and Hadoop 2.0.0-alpha requires it - Key: HBASE-5955 URL: https://issues.apache.org/jira/browse/HBASE-5955 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Andrew Purtell Assignee: Lars Hofhansl Fix For: 0.94.1 Hadoop 2.0.0-alpha depends on Guava 11.0.2. Updating HBase dependencies to match produces the following compilation errors: {code} [ERROR] SingleSizeCache.java:[41,32] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: package com.google.common.collect [ERROR] [ERROR] SingleSizeCache.java:[94,4] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: class org.apache.hadoop.hbase.io.hfile.slab.SingleSizeCache [ERROR] [ERROR] SingleSizeCache.java:[94,69] cannot find symbol [ERROR] symbol : class MapEvictionListener [ERROR] location: class org.apache.hadoop.hbase.io.hfile.slab.SingleSizeCache {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403586#comment-13403586 ] Andrew Wang commented on HBASE-6261: I think it'll be usable from common, it's going to be like the existing MutableCounter or MutableStat in that you instantiate it once then call updateMethod() a bunch. Unless HBase does it differently than the datanode, I don't think reflection is used on the hot path of tracking the stream of values, just occasionally to publish it via JMX. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]), the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6226) move DataBlockEncoding and related classes to hbase-common module
[ https://issues.apache.org/jira/browse/HBASE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403593#comment-13403593 ] Matt Corgan commented on HBASE-6226: Supporting renamed files would be nice. I am running the mvn -Darc initialize. Here is the full output: {code} mcorgan@wyclef:~/hadoop/hbase$ mvn -Darc initialize [INFO] Scanning for projects... [INFO] [INFO] Reactor Build Order: [INFO] [INFO] HBase [INFO] HBase - Common [INFO] HBase - Server [INFO] [INFO] [INFO] Building HBase 0.95-SNAPSHOT [INFO] [INFO] [INFO] [INFO] Building HBase - Common 0.95-SNAPSHOT [INFO] [INFO] [INFO] [INFO] Building HBase - Server 0.95-SNAPSHOT [INFO] [INFO] [INFO] Reactor Summary: [INFO] [INFO] HBase . SUCCESS [0.063s] [INFO] HBase - Common SUCCESS [0.001s] [INFO] HBase - Server SUCCESS [0.000s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 0.313s [INFO] Finished at: Thu Jun 28 16:47:22 PDT 2012 [INFO] Final Memory: 5M/480M [INFO] mcorgan@wyclef:~/hadoop/hbase$ arc diff --only Usage Exception: Failed to load phutil library at location '.arc_jira_lib'. This library is specified by the phutil_libraries setting in .arcconfig. Check that the setting is correct and the library is located in the right place. mcorgan@wyclef:~/hadoop/hbase$ .. 
..: command not found mcorgan@wyclef:~/hadoop/hbase$ ll total 396 drwxr--r-x 12 mcorgan mcorgan 4096 Jun 25 14:53 ./ drwxrwxr-x 8 mcorgan mcorgan 4096 Jun 28 15:25 ../ -rw-rw-r-- 1 mcorgan mcorgan 395 Jun 13 20:06 .arcconfig drwxrw-r-x 4 mcorgan mcorgan 4096 Jun 25 13:17 bin/ -rw-rw-r-- 1 mcorgan mcorgan 261312 Jun 13 20:06 CHANGES.txt drwxrw-r-x 2 mcorgan mcorgan 4096 Jun 13 20:06 conf/ drwxrw-r-x 2 mcorgan mcorgan 4096 Jun 13 20:06 dev-support/ drwxrw-r-x 5 mcorgan mcorgan 4096 Jun 13 20:06 examples/ drwxrw-r-x 8 mcorgan mcorgan 4096 Jun 28 15:25 .git/ -rw-rw-r-- 1 mcorgan mcorgan 129 Jun 25 13:31 .gitignore drwxrw-r-x 5 mcorgan mcorgan 4096 Jun 25 14:53 hbase-common/ drwxrw-r-x 5 mcorgan mcorgan 4096 Jun 28 11:46 hbase-server/ -rw-rw-r-- 1 mcorgan mcorgan 11358 Jun 13 20:06 LICENSE.txt -rw-rw-r-- 1 mcorgan mcorgan 701 Jun 13 20:06 NOTICE.txt -rw-rw-r-- 1 mcorgan mcorgan 58326 Jun 25 13:17 pom.xml -rw-rw-r-- 1 mcorgan mcorgan 368 Jun 13 20:25 .project -rw-rw-r-- 1 mcorgan mcorgan 1358 Jun 13 20:06 README.txt drwxrw-r-x 2 mcorgan mcorgan 4096 Jun 13 20:25 .settings/ drwxrw-r-x 6 mcorgan mcorgan 4096 Jun 25 14:18 src/ drwxr-xr-x 3 mcorgan mcorgan 4096 Jun 25 14:53 target/ mcorgan@wyclef:~/hadoop/hbase$ {code} move DataBlockEncoding and related classes to hbase-common module - Key: HBASE-6226 URL: https://issues.apache.org/jira/browse/HBASE-6226 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.96.0 Reporter: Matt Corgan Assignee: Matt Corgan Fix For: 0.96.0 Attachments: HBASE-6226-v1.patch, HBASE-6226-v2.patch, HBASE-6226-v3.patch, HBASE-6226-v4.patch In order to isolate the implementation details of HBASE-4676 (PrefixTrie encoding) and other DataBlockEncoders by putting them in modules, this pulls up the DataBlockEncoding related interfaces into hbase-common. No tests are moved in this patch. The only notable change was trimming a few dependencies on HFileBlock which adds dependencies to much of the regionserver.
The test suite passes locally for me. I tried to keep it as simple as possible... let me know if there are any concerns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA
[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics
[ https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403595#comment-13403595 ] Elliott Clark commented on HBASE-6261: -- But we won't be able to require the new version of Hadoop that would contain the code for quite a while. So we would have to keep our current Histogram implementation, use reflection to see if the hadoop jars contain UberHistogram (or whatever you plan on calling it), and if so use reflection to interact with it. Better approximate high-percentile percentile latency metrics - Key: HBASE-6261 URL: https://issues.apache.org/jira/browse/HBASE-6261 Project: HBase Issue Type: New Feature Reporter: Andrew Wang Labels: metrics Attachments: Latencyestimation.pdf The existing reservoir-sampling based latency metrics in HBase are not well-suited for providing accurate estimates of high-percentile (e.g. 90th, 95th, or 99th) latency. This is a well-studied problem in the literature (see [1] and [2]); the question is determining which methods best suit our needs and then implementing it. Ideally, we should be able to estimate these high percentiles with minimal memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 99th). It's also desirable to provide this over different time-based sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour. I'll note that this would also be useful in HDFS, or really anywhere latency metrics are kept. [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
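The reflection fallback Elliott describes could look roughly like the sketch below. Everything here is hypothetical — FancyHistogram stands in for the not-yet-named Hadoop class — but it shows the usual mitigation for the cost concern: the Class and Method lookups happen once, and only the cached Method is reused on the hot path, which is an extra indirection versus a direct call but far cheaper than looking the method up per update.

```java
import java.lang.reflect.Method;

// Sketch: detect an improved histogram class at runtime and call it through
// a cached java.lang.reflect.Method. Names are stand-ins, not a Hadoop API.
public class ReflectiveHistogram {
    // stand-in for the hypothetical class shipped by a newer hadoop-common
    public static class FancyHistogram {
        private long count;
        public void add(long value) { count++; }
        public long getCount() { return count; }
    }

    public static void main(String[] args) throws Exception {
        Object histo;
        Method add;
        try {
            // one-time lookup; in real code this would probe the hadoop jars
            Class<?> clazz = Class.forName(
                ReflectiveHistogram.class.getName() + "$FancyHistogram");
            histo = clazz.getDeclaredConstructor().newInstance();
            add = clazz.getMethod("add", long.class); // cache once, reuse
        } catch (ClassNotFoundException e) {
            // class absent: this is where HBase would keep MetricsHistogram
            throw new AssertionError("fall back to MetricsHistogram here");
        }
        for (long i = 0; i < 10_000; i++) {
            add.invoke(histo, i); // hot path goes through the cached Method
        }
        System.out.println("count = " + ((FancyHistogram) histo).getCount());
    }
}
```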
[jira] [Commented] (HBASE-6274) Proto files should be in the same place
[ https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403596#comment-13403596 ] Hudson commented on HBASE-6274: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #73 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/73/]) HBASE-6274 Proto files should be in the same place (Revision 1355129) Result = FAILURE jxiang : Files : * /hbase/trunk/hbase-server/src/main/protobuf/Admin.proto * /hbase/trunk/hbase-server/src/main/protobuf/Client.proto * /hbase/trunk/hbase-server/src/main/protobuf/RegionServerStatus.proto * /hbase/trunk/hbase-server/src/protobuf Proto files should be in the same place --- Key: HBASE-6274 URL: https://issues.apache.org/jira/browse/HBASE-6274 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Trivial Fix For: 0.96.0 Attachments: 6274-hbase.patch Currently, proto files are under hbase-server/src/main/protobuf and hbase-server/src/protobuf. It's better to put them together. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6281) Assignment need not be called for disabling table regions during clean cluster start up.
[ https://issues.apache.org/jira/browse/HBASE-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403603#comment-13403603 ] Hadoop QA commented on HBASE-6281: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533906/6281-trunk-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2287//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2287//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2287//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2287//console This message is automatically generated. Assignment need not be called for disabling table regions during clean cluster start up. 
Key: HBASE-6281 URL: https://issues.apache.org/jira/browse/HBASE-6281 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.0 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6281-trunk-v2.txt, HBASE-6281_94.patch, HBASE-6281_trunk.patch Currently during clean cluster start up if there are tables in DISABLING state, we do bulk assignment through assignAllUserRegions() and after a region is OPENED in the RS, the master checks if the table is in DISABLING/DISABLED state (in Am.regionOnline) and again calls unassign. This roundtrip can be avoided even before calling assignment. This JIRA is to address the above scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403613#comment-13403613 ] Maryann Xue commented on HBASE-6289: @stack Thanks for the comments! If getRootServerLocation() returns null, verifyRootRegionLocation() will return false, so assignRoot() can be called. Thus, verifyAndAssignRoot() returns with success and there won't be a loop or retry here. {code} if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout, this.serverName)) { this.services.getAssignmentManager().assignRoot(); } {code} I think ramkrishna was asking why we only verify ROOT before trying to assign it, while we directly assign META? That's my question as well. ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Assignee: Maryann Xue Priority: Critical Attachments: HBASE-6289.patch The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. {code} private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong(hbase.catalog.verification.timeout, 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } {code} After a few moments, this RS encounters a DFS write problem and decides to abort.
The RS then soon gets restarted from commandline, and constantly report: {code} 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
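The failure mode described in HBASE-6289 reduces to a small sketch, assuming illustrative names rather than the real CatalogTracker API: the master only reassigns -ROOT- when verification fails, so a region server whose ZK node has expired but which still answers RPCs makes the master skip the reassignment, and after the RS aborts nothing retries.

```java
// Hedged sketch of the verify-then-assign race. All names are stand-ins;
// the real check is CatalogTracker.verifyRootRegionLocation(timeout).
public class VerifyAssignRace {
    interface RootServer { boolean stillAnsweringRpc(); }

    static boolean rootReassigned;

    static void verifyAndAssignRoot(RootServer expired) {
        // mirrors: if (!verifyRootRegionLocation(timeout)) assignRoot();
        if (!expired.stillAnsweringRpc()) {
            rootReassigned = true;
        }
        // if verification passes, the handler returns and never retries
    }

    public static void main(String[] args) {
        // ZK session expired, but the RS process stays alive a little longer
        // and still answers the verification RPC
        verifyAndAssignRoot(() -> true);
        System.out.println("reassigned = " + rootReassigned);
    }
}
```

The sketch prints `reassigned = false`: verification succeeded against the dying server, so -ROOT- is never reassigned, matching the NotServingRegionException loop in the log excerpt above.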