[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164260#comment-13164260 ]

Hudson commented on HBASE-4927:
---
Integrated in HBase-0.92-security #32 (See [https://builds.apache.org/job/HBase-0.92-security/32/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

stack :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
---
Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

While reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?
Parent [A,)
L daughter [A,B), R daughter [B,)
When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
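The ordering asked for in HBASE-4927 can be sketched as below. This is only an illustration and not the committed patch: `KeyRange`, `ParentFirstComparator` and `sortedNames` are hypothetical names, and keys are plain strings rather than the byte arrays HBase uses. The one point it shows is that an empty end key has to be treated as the largest end key, so that the parent sorts ahead of its daughters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a region's key range; not HBase's HRegionInfo. */
final class KeyRange {
  final String start; // empty string = first key of the table
  final String end;   // empty string = last key of the table

  KeyRange(String start, String end) {
    this.start = start;
    this.end = end;
  }

  @Override
  public String toString() {
    return "[" + start + "," + end + ")";
  }
}

/** Sorts by start key, then puts the wider range (the split parent) first. */
final class ParentFirstComparator implements Comparator<KeyRange> {
  @Override
  public int compare(KeyRange a, KeyRange b) {
    int byStart = a.start.compareTo(b.start);
    if (byStart != 0) {
      return byStart;
    }
    boolean aOpen = a.end.isEmpty();
    boolean bOpen = b.end.isEmpty();
    if (aOpen || bOpen) {
      // An empty end key means "up to the end of the table": it is the
      // widest possible range, so it must sort first, not last.
      return aOpen == bOpen ? 0 : (aOpen ? -1 : 1);
    }
    // Both end keys are set: the larger end key is the wider range.
    return b.end.compareTo(a.end);
  }

  /** Convenience helper: sorts the ranges and returns their printed form. */
  static List<String> sortedNames(KeyRange... ranges) {
    List<KeyRange> list = new ArrayList<>(Arrays.asList(ranges));
    list.sort(new ParentFirstComparator());
    List<String> names = new ArrayList<>();
    for (KeyRange r : list) {
      names.add(r.toString());
    }
    return names;
  }
}
```

Feeding it the three ranges from the report, `[A,B)`, `[B,)` and `[A,)`, in any order should give the wanted order with the parent `[A,)` first.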
[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master
[ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164259#comment-13164259 ] Hudson commented on HBASE-4729: --- Integrated in HBase-0.92-security #32 (See [https://builds.apache.org/job/HBase-0.92-security/32/]) HBASE-4729 Clash between region unassign and splitting kills the master stack : Files : * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java Clash between region unassign and splitting kills the master Key: HBASE-4729 URL: https://issues.apache.org/jira/browse/HBASE-4729 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: stack Priority: Critical Fix For: 0.92.0, 0.94.0 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt I was running an online alter while regions were splitting, and suddenly the master died and left my table half-altered (haven't restarted the master yet). 
What killed the master: {quote} 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101 at org.apache.zookeeper.KeeperException.create(KeeperException.java:110) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441) at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769) at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568) at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722) at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661) at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} A znode was created because the region server was splitting the region 4 seconds before: {quote} 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101. 
2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING ... 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101 {quote} Now that the master is dead the region server is spewing those last two lines like mad. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
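The race in HBASE-4729 can be pictured with a small in-memory model. This is a sketch under loud assumptions, not the actual fix in AssignmentManager: `UnassignSketch` and its map are hypothetical stand-ins for the znodes under /hbase/unassigned, and skipping the unassign when a split already owns the node is just one plausible way to avoid aborting the master.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory model of the unassigned znodes; not HBase code. */
final class UnassignSketch {
  enum State { SPLITTING, SPLIT, CLOSING }

  // Region encoded name -> state of its node, standing in for ZooKeeper.
  private final ConcurrentMap<String, State> nodes = new ConcurrentHashMap<>();

  /** Region server side: the split creates the node first. */
  void startSplit(String region) {
    nodes.put(region, State.SPLITTING);
  }

  /**
   * Master side: tries to create the CLOSING node.
   * Returns true if the node was created, false if the unassign was skipped
   * because a split already owns the node.
   */
  boolean unassign(String region) {
    State existing = nodes.putIfAbsent(region, State.CLOSING);
    if (existing == null) {
      return true;
    }
    if (existing == State.SPLITTING || existing == State.SPLIT) {
      // The node exists because the region is splitting: treat this as an
      // expected race and skip the region instead of failing fatally.
      return false;
    }
    throw new IllegalStateException(
        "Unexpected node state " + existing + " for region " + region);
  }
}
```

In this model the sequence from the logs above, a split starting and an unassign arriving four seconds later for the same region, ends with `unassign` returning false rather than with a fatal error.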
[jira] [Updated] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4970:
--
Attachment: HBASE-4970_Branch90.patch

Add a parameter to change keepAliveTime of Htable thread pool.
---
Key: HBASE-4970
URL: https://issues.apache.org/jira/browse/HBASE-4970
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
Fix For: 0.90.5
Attachments: HBASE-4970_Branch90.patch

In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the increase in RES slowed down.
Why does increasing the keepAliveTime of the HBase thread pool slow down our problem (the RES value increasing)?
You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft references to direct buffers (mustangsrc) for reuse. The code names these 3 soft references buffercache. If all the buffers are occupied, or none is of a suitable size, and a new request comes in, a new direct buffer is allocated. After the request is served, the bigger buffer replaces the smaller one in buffercache, and the replaced buffer is released.
So I think we can add a parameter to change the keepAliveTime of the Htable thread pool.
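A minimal sketch of what such a parameter could look like, using only java.util.concurrent. The key name `example.htable.threads.keepalivetime` is made up for illustration and is not a confirmed HBase setting; a plain `Properties` object stands in for the HBase configuration, and the pool shape (one core thread, a `SynchronousQueue`) is an assumption rather than a copy of the HTable code.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; shows a pool whose keepAliveTime comes from config. */
final class KeepAlivePoolSketch {
  // Made-up key, all lower case; not a confirmed HBase configuration name.
  static final String KEEP_ALIVE_KEY = "example.htable.threads.keepalivetime";
  // 60 s is the value the report says was in effect before the change.
  static final long DEFAULT_KEEP_ALIVE_SECONDS = 60L;

  /** Builds a pool of up to maxThreads (>= 1) threads with a configurable keepAliveTime. */
  static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
    long keepAliveSeconds = Long.parseLong(
        conf.getProperty(KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
    // Idle threads above the core size are kept for keepAliveSeconds, so a
    // larger value means fewer thread deaths and re-creations.
    return new ThreadPoolExecutor(
        1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
  }
}
```

With an empty configuration the pool keeps the 60 s default; setting the key to 3600 reproduces the change described in the report.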
[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164274#comment-13164274 ]

Ted Yu commented on HBASE-4970:
---
I think there shouldn't be upper case letters in the name of the new config.

Add a parameter to change keepAliveTime of Htable thread pool.
---
Key: HBASE-4970
URL: https://issues.apache.org/jira/browse/HBASE-4970
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
Fix For: 0.90.5
Attachments: HBASE-4970_Branch90.patch

In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the increase in RES slowed down.
Why does increasing the keepAliveTime of the HBase thread pool slow down our problem (the RES value increasing)?
You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft references to direct buffers (mustangsrc) for reuse. The code names these 3 soft references buffercache. If all the buffers are occupied, or none is of a suitable size, and a new request comes in, a new direct buffer is allocated. After the request is served, the bigger buffer replaces the smaller one in buffercache, and the replaced buffer is released.
So I think we can add a parameter to change the keepAliveTime of the Htable thread pool.
[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164276#comment-13164276 ]

gaojinchao commented on HBASE-4970:
---
Sorry, I didn't see Lars's comment. I will try to backport HBASE-4805.

Add a parameter to change keepAliveTime of Htable thread pool.
---
Key: HBASE-4970
URL: https://issues.apache.org/jira/browse/HBASE-4970
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
Fix For: 0.90.5
Attachments: HBASE-4970_Branch90.patch

In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the increase in RES slowed down.
Why does increasing the keepAliveTime of the HBase thread pool slow down our problem (the RES value increasing)?
You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft references to direct buffers (mustangsrc) for reuse. The code names these 3 soft references buffercache. If all the buffers are occupied, or none is of a suitable size, and a new request comes in, a new direct buffer is allocated. After the request is served, the bigger buffer replaces the smaller one in buffercache, and the replaced buffer is released.
So I think we can add a parameter to change the keepAliveTime of the Htable thread pool.
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164281#comment-13164281 ]

Hudson commented on HBASE-4927:
---
Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
---
Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

While reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?
Parent [A,)
L daughter [A,B), R daughter [B,)
When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master
[ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164280#comment-13164280 ] Hudson commented on HBASE-4729: --- Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/]) HBASE-4729 Clash between region unassign and splitting kills the master stack : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java Clash between region unassign and splitting kills the master Key: HBASE-4729 URL: https://issues.apache.org/jira/browse/HBASE-4729 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: stack Priority: Critical Fix For: 0.92.0, 0.94.0 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt I was running an online alter while regions were splitting, and suddenly the master died and left my table half-altered (haven't restarted the master yet). 
What killed the master: {quote} 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101 at org.apache.zookeeper.KeeperException.create(KeeperException.java:110) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441) at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769) at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568) at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722) at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661) at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} A znode was created because the region server was splitting the region 4 seconds before: {quote} 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101. 
2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING ... 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101 {quote} Now that the master is dead the region server is spewing those last two lines like mad. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.
[ https://issues.apache.org/jira/browse/HBASE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164278#comment-13164278 ] Hudson commented on HBASE-4968: --- Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/]) HBASE-4968 Add to troubleshooting workaround for direct buffer oome's. stack : Files : * /hbase/trunk/src/docbkx/troubleshooting.xml Add to troubleshooting workaround for direct buffer oome's. --- Key: HBASE-4968 URL: https://issues.apache.org/jira/browse/HBASE-4968 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: client.oome.txt Put into book workaround arrived at up on list discussing client oome. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4376) Document login configuration when running on top of secure Hadoop with Kerberos auth enabled
[ https://issues.apache.org/jira/browse/HBASE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164282#comment-13164282 ] Hudson commented on HBASE-4376: --- Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/]) HBASE-4376 Document mutual authentication between HBase and Zookeeper using SASL stack : Files : * /hbase/trunk/src/docbkx/configuration.xml Document login configuration when running on top of secure Hadoop with Kerberos auth enabled Key: HBASE-4376 URL: https://issues.apache.org/jira/browse/HBASE-4376 Project: HBase Issue Type: Task Components: documentation, security Affects Versions: 0.90.4 Reporter: Gary Helmling We provide basic support for HBase to run on top of kerberos-authenticated Hadoop, by providing configuration options to have HMaster and HRegionServer login from a keytab on startup. But this isn't documented anywhere outside of hbase-default.xml. We need to provide some basic guidance on setup in the HBase docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4712) Document rules for writing tests
[ https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164279#comment-13164279 ]

Hudson commented on HBASE-4712:
---
Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4712 Document rules for writing tests

stack :
Files :
* /hbase/trunk/src/docbkx/developer.xml

Document rules for writing tests
---
Key: HBASE-4712
URL: https://issues.apache.org/jira/browse/HBASE-4712
Project: HBase
Issue Type: Task
Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.94.0
Attachments: 4712.txt

We saw that some tests could be improved. Documenting the general rules could help.

Proposal:

HBase tests are divided into three categories: small, medium and large, with corresponding JUnit categories: SmallTest, MediumTest, LargeTest.

Small tests are executed in parallel in a shared JVM. They must last less than 15 seconds. They must NOT use a cluster.

Medium tests are executed in a separate JVM. They must last less than 50 seconds. They can use a cluster. They must not fail occasionally.

Small and medium tests must not need more than 30 minutes to run altogether. Small and medium tests should be executed by the developers before submitting a patch.

Large tests are everything else. They are typically integration tests, non-regression tests for specific bugs, timeout tests, performance tests.

Test rule hints are:
- As much as possible, tests should be written as small tests.
- All tests should be written to support parallel execution on the same machine, hence they should not use shared resources such as fixed ports or fixed file names.
- All tests should be written to be as fast as possible.
- Tests should not overlog. More than 100 lines/second makes the logs hard to read and uses I/O that is then not available to the other tests.
- Tests can be written with HBaseTestingUtility. This class offers helper functions to create a temp directory and do the cleanup, or to start a cluster.
- Sleeps:
  - Tests should not do a 'Thread.sleep' without testing an ending condition. This makes it clear what the test is waiting for. Moreover, the test will work whatever the machine's performance.
  - Sleeps should be minimal, to be as fast as possible. Waiting for a variable should be done in a 40 ms sleep loop. Waiting for a socket operation should be done in a 200 ms sleep loop.
- Tests using a cluster:
  - Tests using an HRegion do not have to start a cluster: a region can use the local file system.
  - Starting and stopping a cluster costs around 10 seconds. Clusters should not be started per test method but per class.
  - A started cluster must be shut down using HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
  - As much as possible, tests should use the default settings for the cluster. When they don't, they should document it. This will allow sharing the cluster later.
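The "Sleeps" rule above (never sleep without testing an ending condition, and poll in a short loop) could look like the following. This is a sketch only: `WaitSketch.waitFor` is a hypothetical helper written for this note, not a method of HBaseTestingUtility.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: polls a condition instead of sleeping blindly. */
final class WaitSketch {
  /**
   * Re-checks the condition every intervalMs until it holds or timeoutMs
   * has elapsed. Returns true if the condition became true in time.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out: the caller can fail with a clear message
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

A test waiting for a variable would call something like `waitFor(10000, 40, () -> flag.get())`, and one waiting for a socket operation would use a 200 ms interval, matching the figures in the proposal. The 10000 ms timeout and the `flag` variable are placeholders.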
[jira] [Commented] (HBASE-4964) Add builddate, make less sections in toc, and add header and footer customizations
[ https://issues.apache.org/jira/browse/HBASE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164283#comment-13164283 ] Hudson commented on HBASE-4964: --- Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/]) HBASE-4964 Add builddate, make less sections in toc, and add header and footer customizations stack : Files : * /hbase/trunk/pom.xml * /hbase/trunk/src/docbkx/book.xml * /hbase/trunk/src/docbkx/customization.xsl Add builddate, make less sections in toc, and add header and footer customizations -- Key: HBASE-4964 URL: https://issues.apache.org/jira/browse/HBASE-4964 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.94.0 Attachments: 4964.txt The customizations are for adding facebook comments. I tried it but not working for me immediately; need some xsl jujitsu so I can get name of current page into the current footer. Added a buildDate define in iso-8601 to the pom used in 'reference guide' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions
[ https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164284#comment-13164284 ]

Hudson commented on HBASE-4936:
---
Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4936 Cached HRegionInterface connections crash when getting UnknownHost exceptions

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java

Cached HRegionInterface connections crash when getting UnknownHost exceptions
---
Key: HBASE-4936
URL: https://issues.apache.org/jira/browse/HBASE-4936
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
Fix For: 0.94.0
Attachments: HBASE-4936-v2.patch, HBASE-4936.patch

This issue is unlikely to come up in a cluster test case. However, for development, the following thing happens:
1. Start the HBase cluster locally, on network A (DNS A, etc)
2. The region locations are cached using the hostname (mycomputer.company.com, 211.x.y.z - real ip)
3. Change network location (go home)
4. Start the HBase cluster locally. My hostname / ips are now different (mycomputer, 192.168.0.130 - new ip)

If the region locations have been cached using the hostname, there is an UnknownHostException in CatalogTracker.getCachedConnection(ServerName sn) that is not caught by the existing catch clauses. The server will crash constantly.

The error should be caught and not rethrown, so that the cached connection expires normally.
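The behaviour requested in HBASE-4936, catching the UnknownHostException instead of letting it escape, can be sketched like this. `CachedConnectionSketch` and its `Connector` interface are hypothetical stand-ins; this is not the CatalogTracker code and does not claim to match the committed patch.

```java
import java.io.IOException;
import java.net.UnknownHostException;

/** Hypothetical model of a cached-connection lookup; not CatalogTracker. */
final class CachedConnectionSketch {
  /** Stand-in for whatever opens the connection to a cached hostname. */
  interface Connector<T> {
    T connect(String hostname) throws IOException;
  }

  /**
   * Returns the connection, or null when the cached hostname can no longer
   * be resolved, so the caller can treat the cached location as stale.
   */
  static <T> T getCachedConnection(Connector<T> connector, String hostname)
      throws IOException {
    try {
      return connector.connect(hostname);
    } catch (UnknownHostException e) {
      // The hostname was cached on another network: swallow the error and
      // report "no connection" rather than crashing the server.
      return null;
    }
  }
}
```

Other IOExceptions still propagate in this sketch; only the unresolvable-hostname case is turned into a null result.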
[jira] [Commented] (HBASE-4937) Error in Quick Start Shell Exercises
[ https://issues.apache.org/jira/browse/HBASE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164285#comment-13164285 ]

Hudson commented on HBASE-4937:
---
Integrated in HBase-TRUNK-security #24 (See [https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4937 Error in Quick Start Shell Exercises

stack :
Files :
* /hbase/trunk/src/docbkx/getting_started.xml

Error in Quick Start Shell Exercises
---
Key: HBASE-4937
URL: https://issues.apache.org/jira/browse/HBASE-4937
Project: HBase
Issue Type: Bug
Components: documentation
Reporter: Ryan Berdeen
Assignee: stack
Fix For: 0.94.0
Attachments: 4937.txt

The shell exercises in the Quick Start (http://hbase.apache.org/book/quickstart.html) start
{code}
hbase(main):003:0> create 'test', 'cf'
0 row(s) in 1.2200 seconds
hbase(main):003:0> list 'table'
test
1 row(s) in 0.0550 seconds
{code}
It looks like the second command is wrong. Running it, the actual output is
{code}
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.3630 seconds
hbase(main):002:0> list 'table'
TABLE
0 row(s) in 0.0100 seconds
{code}
The argument to list should be 'test', not 'table', and the output in the example is missing the {{TABLE}} line.
[jira] [Updated] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4965:
---
Attachment: 4965_all.patch

Monitor the open file descriptors and the threads counters during the unit tests
---
Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java

We're seeing a lot of issues with hadoop-qa related to threads or file descriptors. Monitoring these counters would ease the analysis.

Note as well that:
- if we want to execute the tests in the same JVM (because the test is small or because we want to share the cluster), we can't afford to leak too many resources
- if the tests leak, it's more difficult to detect a leak in the software itself.

I attach the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous values.

I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).

These are not always leaks; we can expect some pooling effects. But still...
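The before/after counting described in HBASE-4965 can be approximated with the standard JMX beans. The sketch below is not the attached ResourceChecker.java: `ResourceSnapshot` is a hypothetical class, and the file descriptor count assumes a HotSpot/OpenJDK-style JVM on a Unix-like system, where the operating system bean implements `com.sun.management.UnixOperatingSystemMXBean`. Elsewhere it reports -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical snapshot of the two counters discussed in the issue. */
final class ResourceSnapshot {
  final int threads;
  final long openFileDescriptors; // -1 when the JVM does not expose the counter

  private ResourceSnapshot(int threads, long openFileDescriptors) {
    this.threads = threads;
    this.openFileDescriptors = openFileDescriptors;
  }

  /** Reads the live thread count and, where available, the open fd count. */
  static ResourceSnapshot take() {
    int threads = ManagementFactory.getThreadMXBean().getThreadCount();
    long fds = -1L;
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    // Assumption: this JDK-specific interface is present (HotSpot/OpenJDK).
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
          .getOpenFileDescriptorCount();
    }
    return new ResourceSnapshot(threads, fds);
  }

  /** Formats this snapshot against an earlier one, in the style of the report. */
  String diff(ResourceSnapshot before) {
    return threads + " threads (was " + before.threads + "), "
        + openFileDescriptors + " file descriptors (was "
        + before.openFileDescriptors + ")";
  }
}
```

A test class would take one snapshot before each test and one after, then log `after.diff(before)`. Wiring that into JUnit (for example as a rule) is left out here.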
[jira] [Updated] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4965:
---
Status: Patch Available (was: Open)

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java

We're seeing a lot of issues with hadoop-qa related to threads or file descriptors. Monitoring these counters would ease the analysis. Note as well that:
- if we want to execute the tests in the same JVM (because the test is small or because we want to share the cluster), we can't afford to leak too many resources;
- if the tests leak, it's more difficult to detect a leak in the software itself.

I attach the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors;
- after every test, compare with the previous values.

I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).

These are not always leaks; we can expect some pooling effects. But still...
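The before/after comparison described in this issue can be sketched roughly as follows. This is not the attached ResourceChecker.java; the class and method names (`ResourceSnapshot`, `take`, `report`) are hypothetical and chosen for this illustration only. The sketch assumes a Unix-like JVM that exposes `com.sun.management.UnixOperatingSystemMXBean`; on other platforms the file descriptor count is reported as -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical snapshot of two JVM counters; not the code attached to the issue. */
final class ResourceSnapshot {
    final int threads;
    final long openFileDescriptors; // -1 when the platform does not expose the counter

    ResourceSnapshot(int threads, long openFileDescriptors) {
        this.threads = threads;
        this.openFileDescriptors = openFileDescriptors;
    }

    /** Reads the live thread count and, where available, the open file descriptor count. */
    static ResourceSnapshot take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1L;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Formats the comparison in the same shape as the figures quoted in the issue. */
    static String report(String testName, ResourceSnapshot before, ResourceSnapshot after) {
        return testName + ": " + after.threads + " threads (was " + before.threads + "), "
            + after.openFileDescriptors + " file descriptors (was "
            + before.openFileDescriptors + ").";
    }
}

final class ResourceSnapshotDemo {
    public static void main(String[] args) {
        // "Before every test": record the counters.
        ResourceSnapshot before = ResourceSnapshot.take();

        // Simulated test body that leaves a thread running.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
        });
        leaked.setDaemon(true);
        leaked.start();

        // "After every test": compare with the previous values.
        ResourceSnapshot after = ResourceSnapshot.take();
        System.out.println(ResourceSnapshot.report("demo#leakOneThread", before, after));
    }
}
```

In a JUnit test class the two calls to `take()` would typically sit in the before and after hooks, which is presumably what the "two lines in a unit test class" refer to.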
[jira] [Created] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
-
Key: HBASE-4971
URL: https://issues.apache.org/jira/browse/HBASE-4971
Project: HBase
Issue Type: Improvement
Components: test
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.94.0
Attachments: 4971.patch

The comment says "Flush tables. Since flushing is asynchronous, sleep for a bit.", but the function is synchronous, so the sleeps are useless.
[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4971: --- Attachment: 4971.patch
[jira] [Updated] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4970:
--
Attachment: HBASE-4970_Branch90_V1_trial.patch

Add a parameter to change keepAliveTime of Htable thread pool.
---
Key: HBASE-4970
URL: https://issues.apache.org/jira/browse/HBASE-4970
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
Fix For: 0.90.5
Attachments: HBASE-4970_Branch90.patch, HBASE-4970_Branch90_V1_trial.patch

In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth of RES slowed down.

Why does increasing the keepAliveTime of the HBase thread pool slow down our problem (the RES value increasing)? You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft references to direct buffers (mustangsrc) for reuse. The code names these 3 soft references buffercache. If all the buffers are occupied, or none is of a suitable size, and a new request comes in, a new direct buffer is allocated. After the request is served, the bigger buffer replaces the smaller one in buffercache, and the replaced buffer is released.

So I think we can add a parameter to change the keepAliveTime of the HTable thread pool.
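As a rough illustration of the proposal, the sketch below builds a thread pool whose keep-alive time comes from configuration instead of being hard-coded. It is not the attached patch: the property name `example.htable.threads.keepalivetime`, the class `KeepAlivePoolFactory`, and the use of `java.util.Properties` in place of an HBase configuration object are all assumptions made for this example.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory; the names are illustrative and not taken from HBase. */
final class KeepAlivePoolFactory {
    // Hypothetical property name, chosen for this sketch only.
    static final String KEEP_ALIVE_KEY = "example.htable.threads.keepalivetime";
    // 60 s is the value the reporter mentions as the starting point.
    static final long DEFAULT_KEEP_ALIVE_SECONDS = 60L;

    /** Creates a pool whose idle threads are retired after the configured keep-alive time. */
    static ThreadPoolExecutor create(Properties conf, int maxThreads) {
        long keepAliveSeconds = Long.parseLong(
            conf.getProperty(KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
        if (keepAliveSeconds < 0) {
            throw new IllegalArgumentException(KEEP_ALIVE_KEY + " must not be negative");
        }
        // Core size 0: every thread is subject to the keep-alive timeout.
        return new ThreadPoolExecutor(
            0, maxThreads, keepAliveSeconds, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEEP_ALIVE_KEY, "3600"); // the value the reporter tried
        ThreadPoolExecutor pool = create(conf, 8);
        System.out.println("keepAlive(s) = " + pool.getKeepAliveTime(TimeUnit.SECONDS));
        pool.shutdown();
    }
}
```

With a longer keep-alive, idle worker threads survive between bursts of requests, so the buffers cached per thread get reused instead of being reallocated by freshly created threads, which matches the reasoning given in the description.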
[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4971: --- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164364#comment-13164364 ]

gaojinchao commented on HBASE-4970:
---
Addressed Lars's comment. @Lars, please review it first; I will test it in a real cluster tomorrow.
[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164366#comment-13164366 ]

Hadoop QA commented on HBASE-4971:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506434/4971.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/460//console

This message is automatically generated.
[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4971: --- Attachment: 4971_all.v2.patch
[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4971: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4971: --- Status: Open (was: Patch Available)
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164391#comment-13164391 ]

Hadoop QA commented on HBASE-4965:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506433/4965_all.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 755 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -160 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/459//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/459//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/459//console

This message is automatically generated.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164396#comment-13164396 ]

nkeywal commented on HBASE-4965:

First, Hadoop QA seems to be configured with 1024 file descriptors:
{noformat}
2011-12-07 13:16:26,184 ERROR [main] hbase.ResourceChecker(122): Bad configuration: the operating systems file handles maximum is 1024 our is 1
{noformat}
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164401#comment-13164401 ]

nkeywal commented on HBASE-4965:

The error seems unrelated to my patch. It is the same error for the 3 patches.
{noformat}
expected:[NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED] but was:[NOT_IN_META, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED]
{noformat}
[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164412#comment-13164412 ]

ramkrishna.s.vasudevan commented on HBASE-4880:
---
The patch looks fine to me. Checking the test failures.
@Chenhui Have you done some testing with this patch? Nice work.

Region is on service before completing openRegionHanlder, may cause data loss
-
Key: HBASE-4880
URL: https://issues.apache.org/jira/browse/HBASE-4880
Project: HBase
Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch

OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (after this, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update the .meta. table
4. update the ZK node's state to RS_ZK_REGION_OPENED
{code}
The region is therefore in service before step 4, which means a client could put data to this region after step 3.

What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns the region to another regionserver. If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data there. That data may not be recoverable through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region's SeqId.
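The window described in HBASE-4880 can be made concrete with a toy model. The sketch below is not HBase code and does not reflect the attached patches; `OpenRegionWindowModel` and its methods are hypothetical names, and the model only captures the ordering of the four steps and a client write arriving between steps 3 and 4.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the four steps quoted in the issue; every name here is hypothetical. */
final class OpenRegionWindowModel {
    private boolean online = false;
    private final List<String> acceptedEdits = new ArrayList<>();

    /** A client put succeeds only once the region is in the online set. */
    boolean put(String edit) {
        if (!online) {
            return false;
        }
        acceptedEdits.add(edit);
        return true;
    }

    /**
     * Runs steps 1 to 4 in the order listed in the issue. The callback stands in
     * for a client that writes before step 4 completes. Returns the edits that
     * were accepted although the open was never confirmed.
     */
    List<String> open(Runnable clientBetweenStep3And4, boolean step4Succeeds) {
        // Steps 1 and 2: open the region and add it to the online regions.
        online = true;
        // Step 3: update the catalog table (not modelled here).
        clientBetweenStep3And4.run();
        // Step 4: update the ZooKeeper node state.
        if (step4Succeeds) {
            return new ArrayList<>();
        }
        // Failed open: the region goes offline again and the accepted edits are at risk.
        online = false;
        return new ArrayList<>(acceptedEdits);
    }

    public static void main(String[] args) {
        OpenRegionWindowModel region = new OpenRegionWindowModel();
        List<String> atRisk = region.open(() -> region.put("row1"), false);
        System.out.println("Edits at risk: " + atRisk); // prints "Edits at risk: [row1]"
    }
}
```

The model shows only that writes are accepted as soon as step 2 has run; whether such edits are actually lost depends on the close and log-replay behaviour discussed in the issue.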
[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164414#comment-13164414 ]

Hadoop QA commented on HBASE-4971:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506438/4971_all.v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -160 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/461//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/461//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/461//console

This message is automatically generated.
[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
[ https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164416#comment-13164416 ]

nkeywal commented on HBASE-4971:

These 3 tests are not impacted by my change. They're likely to be broken on trunk as well. IMHO, the patch is OK.
[jira] [Created] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
--
Key: HBASE-4972
URL: https://issues.apache.org/jira/browse/HBASE-4972
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
Fix For: 0.92.0

There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported.

HBASE-3320 !
HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done.
HBASE-3410 !
HBASE-3501 !
HBASE-3714 !
HBASE-3729 !! Maked in 0.92 but not committed there, committed in 0.90 branch.
HBASE-3848 !
HBASE-3892 ! * Comments say trunk does not need.
HBASE-3906 !
HBASE-3989 !
HBASE-4109 !
HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92
HBASE-4423 !
[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164419#comment-13164419 ]

Jonathan Hsieh commented on HBASE-4972:
---
Good news is that most of these patches are small.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164427#comment-13164427 ]

nkeywal commented on HBASE-4965:

Here are the possible leaks. I am going to fix some of them in a separate patch. Leaks on SmallTests are critical, because the JVM is reused for multiple tests.

This one should be studied:
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 298), 913 file descriptors (was 488). -thread leak?- -file handle leak?-
As the limit on Hadoop QA is 1024 open file descriptors, this test is not far from hitting it, especially if another test is run after this one.

avro.TestAvroServer#testTableAdminAndMetadata: 140 threads (was 130), 255 file descriptors (was 253). -thread leak?- -file handle leak?-
avro.TestAvroServer#testFamilyAdminAndMetadata: 144 threads (was 140), 255 file descriptors (was 255). -thread leak?-
avro.TestAvroServer#testDML: 146 threads (was 144), 255 file descriptors (was 255). -thread leak?-
catalog.TestCatalogTrackerOnCluster#testBadOriginalRootLocation: 23 threads (was 4), 127 file descriptors (was 70). -thread leak?- -file handle leak?-
catalog.TestCatalogTracker#testThatIfMETAMovesWeAreNotified: 9 threads (was 8), 84 file descriptors (was 79). -thread leak?- -file handle leak?-
catalog.TestCatalogTracker#testInterruptWaitOnMetaAndRoot: 10 threads (was 9), 86 file descriptors (was 84). -file handle leak?-
catalog.TestCatalogTracker#testVerifyRootRegionLocationFails: 11 threads (was 9), 89 file descriptors (was 85). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditorNoCluster#testRideOverServerNotRunning: 7 threads (was 4), 85 file descriptors (was 70). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testGetRegionsCatalogTables: 190 threads (was 185), 360 file descriptors (was 354). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testTableExists: 191 threads (was 187), 365 file descriptors (was 360). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testGetRegion: 193 threads (was 191), 370 file descriptors (was 365). -thread leak?- -file handle leak?-
client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable: 254 threads (was 246), 423 file descriptors (was 417). -thread leak?- -file handle leak?-
client.TestAdmin#testDisableAndEnableTable: 273 threads (was 254), 452 file descriptors (was 423). -thread leak?- -file handle leak?-
client.TestAdmin#testDisableAndEnableTables: 294 threads (was 272), 482 file descriptors (was 452). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTable: 294 threads (was 294), 491 file descriptors (was 482). -file handle leak?-
client.TestAdmin#testOnlineChangeTableSchema: 295 threads (was 294), 494 file descriptors (was 491). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTableWithRegions: 296 threads (was 294), 490 file descriptors (was 490). -thread leak?-
client.TestAdmin#testTableExist: 297 threads (was 296), 494 file descriptors (was 490). -thread leak?- -file handle leak?-
client.TestAdmin#testForceSplit: 303 threads (was 297), 487 file descriptors (was 494). -thread leak?-
client.TestAdmin#testForceSplitMultiFamily: 309 threads (was 293), 499 file descriptors (was 464). -thread leak?- -file handle leak?-
client.TestAdmin#testEnableDisableAddColumnDeleteColumn: 312 threads (was 309), 505 file descriptors (was 499). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateBadTables: 313 threads (was 312), 507 file descriptors (was 505). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTableRPCTimeOut: 312 threads (was 313), 526 file descriptors (was 507). -file handle leak?-
client.TestAdmin#testReadOnlyTable: 314 threads (was 312), 530 file descriptors (was 526). -thread leak?- -file handle leak?-
client.TestAdmin#testCloseRegionThatFetchesTheHRIFromMeta: 315 threads (was 312), 513 file descriptors (was 507). -thread leak?- -file handle leak?-
client.TestAdmin#testGetTableRegions: 309 threads (was 308), 512 file descriptors (was 499). -thread leak?- -file handle leak?-
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 298), 913 file descriptors (was 488). -thread leak?- -file handle leak?-
client.TestFromClientSide#testKeepDeletedCells: 261 threads (was 246), 437 file descriptors (was 414). -thread leak?- -file handle leak?-
client.TestFromClientSide#testRegionCacheDeSerialization: 276 threads (was 261), 485 file descriptors (was 437). -thread leak?- -file handle leak?-
client.TestFromClientSide#testRegionCachePreWarm: 277 threads (was 276), 488 file descriptors (was 485). -thread leak?- -file handle leak?-
client.TestFromClientSide#testWeirdCacheBehaviour: 285 threads (was 277), 500 file descriptors (was 488). -thread leak?- -file handle leak?-
[jira] [Commented] (HBASE-2675) Quick smoke tests testsuite
[ https://issues.apache.org/jira/browse/HBASE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164430#comment-13164430 ]

nkeywal commented on HBASE-2675:

'mvn test -P runSmallTests' runs about 400 tests out of the 1200 tests today, in about 3 minutes. Is this OK for you, Benoit?

Quick smoke tests testsuite
-
Key: HBASE-2675
URL: https://issues.apache.org/jira/browse/HBASE-2675
Project: HBase
Issue Type: Test
Reporter: Benoit Sigoure
Assignee: nkeywal
Priority: Minor

It would be nice if there was a known subset of the tests that run fast (e.g. not more than a few seconds) and quickly help us check whether the code isn't horribly broken. This way one could run those tests at a frequent interval when iterating and only run the entire testsuite at the end, when they think they're done, since doing so is very time consuming.

Someone would need to identify which tests really focus on the core functionality and add a target in the build system to just run those tests. As a bonus, it would be awesome++ if the core tests ran, say, 10x faster than they currently do. There's a lot of sleep-based synchronization in the tests and it would be nice to remove some of that where possible to make the tests run as fast as the machine can handle them.
[jira] [Updated] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4972: -- Description: There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! was: There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Maked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. 
HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4973) On failure, HBaseAdmin sleeps one time too many
On failure, HBaseAdmin sleeps one time too many --- Key: HBASE-4973 URL: https://issues.apache.org/jira/browse/HBASE-4973 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor In this code the last sleep is useless, as we're not retrying afterwards. This can slow down failure scenarios by a few seconds (up to 32 seconds).
{noformat}
public HBaseAdmin(Configuration c)
throws MasterNotRunningException, ZooKeeperConnectionException {
  this.conf = HBaseConfiguration.create(c);
  this.connection = HConnectionManager.getConnection(this.conf);
  this.pause = this.conf.getLong("hbase.client.pause", 1000);
  this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
  this.retryLongerMultiplier = this.conf.getInt(
      "hbase.client.retries.longer.multiplier", 10);
  int tries = 0;
  for (; tries < numRetries; ++tries) {
    try {
      this.connection.getMaster();
      break;
    } catch (MasterNotRunningException mnre) {
      HConnectionManager.deleteStaleConnection(this.connection);
      this.connection = HConnectionManager.getConnection(this.conf);
    } catch (UndeclaredThrowableException ute) {
      HConnectionManager.deleteStaleConnection(this.connection);
      this.connection = HConnectionManager.getConnection(this.conf);
    }
    try { // Sleep
      Thread.sleep(getPauseTime(tries));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      // we should delete connection between client and zookeeper
      HConnectionManager.deleteStaleConnection(this.connection);
      throw new MasterNotRunningException("Interrupted");
    }
  }
  if (tries >= numRetries) {
    // we should delete connection between client and zookeeper
    HConnectionManager.deleteStaleConnection(this.connection);
    throw new MasterNotRunningException("Retried " + numRetries + " times");
  }
}
{noformat}
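For illustration only, here is a minimal sketch of a retry loop that pauses between attempts but not after the last one, which is the behaviour HBASE-4973 asks for. It is not the HBaseAdmin code and not the committed patch: the class name, the `retry` helper, and the `LongConsumer` sleep hook are all assumptions made so the example is self-contained.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Illustrative retry helper (hypothetical; not part of HBase). */
final class RetryWithoutTrailingSleep {

  /**
   * Runs attempt up to numRetries times and returns true on the first success.
   * The sleeper is invoked only when another attempt will follow.
   */
  static boolean retry(BooleanSupplier attempt, int numRetries,
      long pauseMillis, LongConsumer sleeper) {
    for (int tries = 0; tries < numRetries; ++tries) {
      if (attempt.getAsBoolean()) {
        return true;
      }
      boolean lastAttempt = (tries == numRetries - 1);
      if (!lastAttempt) {
        sleeper.accept(pauseMillis);  // waiting is only useful before a retry
        if (Thread.currentThread().isInterrupted()) {
          return false;               // stop retrying once interrupted
        }
      }
    }
    return false;                     // gave up, without a trailing sleep
  }

  /** Sleep hook backed by Thread.sleep that preserves the interrupt flag. */
  static void threadSleep(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

A caller would pass `RetryWithoutTrailingSleep::threadSleep` as the sleeper in real use; a test can pass a recording hook instead, which makes the number of pauses easy to check.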
[jira] [Created] (HBASE-4974) Remove some resources leaks on the tests
Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164501#comment-13164501 ] Jonathan Hsieh commented on HBASE-4972: --- * HBASE-3848 This work has been idle since Jun/11 * HBASE-3892 Comments say trunk doesn't need, but no test case so can't verify without effort. Seems to have significant differences between 0.90 and 0.92. * HBASE-3906 Comments say doesn't make sense on trunk. * HBASE-3989 Comments say not needed on trunk * HBASE-4109 Comments say not needed on trunk * HBASE-4160 Patch and commit present but does not contain name HBASE-4160. * HBASE-4423 Contained in 0.92's HBASE-4238 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4974: --- Status: Patch Available (was: Open) Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4974: --- Attachment: 4974_all.patch Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164512#comment-13164512 ] Jonathan Hsieh commented on HBASE-4972: --- So two issues remain: * HBASE-4610, which is explicitly a forward-porting issue. * HBASE-3848, which is open, currently with a commit on the 0.90 branch but not on trunk/0.92. Maybe this should be closed on 0.90 and a new forward-porting issue should be created? The other issues are basically non-issues code-wise: * subsequent patches picked up the fix. * patch is not relevant to 0.92/trunk branches (would be nice to have this in the title). * typos in commit messages. Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 !
[jira] [Updated] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4972: -- Issue Type: Task (was: Bug) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164581#comment-13164581 ] Hadoop QA commented on HBASE-4974: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506481/4974_all.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/462//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/462//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/462//console This message is automatically generated. Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.
[ https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164580#comment-13164580 ] Zhihong Yu commented on HBASE-4970: --- Patch v2 is a backport and doesn't change keepAliveTime. I feel we should address the needs of HTable users. I am fine with the backport - we may want to modify the title of this JIRA accordingly. Add a parameter to change keepAliveTime of Htable thread pool. --- Key: HBASE-4970 URL: https://issues.apache.org/jira/browse/HBASE-4970 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.90.4 Reporter: gaojinchao Assignee: gaojinchao Priority: Trivial Fix For: 0.90.5 Attachments: HBASE-4970_Branch90.patch, HBASE-4970_Branch90_V1_trial.patch In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth of RES slowed down. Why does increasing the keepAliveTime of the HBase thread pool slow down our problem (the RES value increase)? You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft references to direct buffers (mustangsrc) for reuse. The code names the 3 soft references buffercache. If the buffers are all occupied, or none is suitable in size, when a new request comes in, a new direct buffer is allocated. After the request is served, the bigger buffer replaces the smaller one in buffercache, and the replaced buffer is released. So I think we can add a parameter to change the keepAliveTime of the HTable thread pool.
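As a rough sketch of what such a parameter could look like, the snippet below builds a thread pool whose keep-alive time is read from configuration instead of being hard-coded to 60 seconds. It uses `java.util.Properties` rather than HBase's `Configuration` to stay self-contained, and the property name `example.htable.threads.keepalivetime` is made up for this example; it is not claimed to be the key used by any patch on this issue.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: a pool with a configurable keep-alive time. */
final class KeepAlivePoolSketch {

  // Hypothetical property name, chosen for this example.
  static final String KEEP_ALIVE_KEY = "example.htable.threads.keepalivetime";
  static final long DEFAULT_KEEP_ALIVE_SECONDS = 60;

  static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
    long keepAliveSeconds = Long.parseLong(conf.getProperty(
        KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
    // Let idle core threads expire too; this call requires a keep-alive > 0.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
```

With a longer keep-alive, idle worker threads (and whatever per-thread buffers they hold) live longer and are reused instead of being recreated, which matches the observation in the description above.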
[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164583#comment-13164583 ] nkeywal commented on HBASE-4974: The 3 tests fail on trunk as well. However, it means that the large tests have not been tested, and I have some strange errors on these ones locally... Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch Cf. title and HBASE-4965
[jira] [Reopened] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reopened HBASE-4927: --- As we can see from the report here: https://issues.apache.org/jira/browse/HBASE-4927?focusedCommentId=13163785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13163785 there were 3 failed tests. These test failures rippled through all 0.92 builds. CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty --- Key: HBASE-4927 URL: https://issues.apache.org/jira/browse/HBASE-4927 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.92.0 Attachments: 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key): Parent [A,) L daughter [A,B), R daughter [B,) When sorted, we get to the end key comparison, which results in this incorrect order: [A,B), [A,), [B,) We wanted: [A,), [A,B), [B,)
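To make the expected ordering concrete, here is a small sketch of a parent-first comparator that treats an empty end key as "greater than every other key". It is a simplification under stated assumptions, not the CatalogJanitor or HRegionInfo code: keys are plain strings rather than byte arrays, and the `Region` class and `PARENT_FIRST` comparator are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: sort split parents ahead of their daughters. */
final class SplitParentFirstSketch {

  /** Hypothetical stand-in for a region: a [startKey, endKey) range. */
  static final class Region {
    final String startKey;
    final String endKey;  // "" means no upper bound (the last region)

    Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }

    @Override
    public String toString() {
      return "[" + startKey + "," + endKey + ")";
    }
  }

  static final Comparator<Region> PARENT_FIRST = (a, b) -> {
    int byStart = a.startKey.compareTo(b.startKey);
    if (byStart != 0) {
      return byStart;
    }
    boolean aOpen = a.endKey.isEmpty();
    boolean bOpen = b.endKey.isEmpty();
    if (aOpen || bOpen) {
      // An open-ended range covers the most keys, so it sorts first.
      return aOpen == bOpen ? 0 : (aOpen ? -1 : 1);
    }
    // Same start key: the larger end key is the wider (parent) range.
    return b.endKey.compareTo(a.endKey);
  };

  /** Returns the regions as strings, sorted parent first. */
  static List<String> sortedNames(List<Region> regions) {
    List<Region> copy = new ArrayList<>(regions);
    copy.sort(PARENT_FIRST);
    List<String> names = new ArrayList<>();
    for (Region r : copy) {
      names.add(r.toString());
    }
    return names;
  }
}
```

The special case is the point of the bug report: a plain lexicographic comparison puts the empty end key before "B", which is what produced [A,B), [A,), [B,).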
[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option
[ https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164599#comment-13164599 ] Ted Yu commented on HBASE-4224: --- @Akash: Do you have a newer patch ? If so, please upload to this JIRA. Need a flush by regionserver rather than by table option Key: HBASE-4224 URL: https://issues.apache.org/jira/browse/HBASE-4224 Project: HBase Issue Type: Bug Components: shell Reporter: stack Assignee: Akash Ashok Attachments: HBase-4224.patch This evening needed to clean out logs on the cluster. logs are by regionserver. to let go of logs, we need to have all edits emptied from memory. only flush is by table or region. We need to be able to flush the regionserver. Need to add this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164605#comment-13164605 ] Zhihong Yu commented on HBASE-4880: --- The three test failures would be fixed by the addendum to HBASE-4927. Region is on service before completing openRegionHanlder, may cause data loss - Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (through it, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update .meta. table
4. update ZK's node state to RS_ZK_REGION_OPENED
{code}
We can see that the region is in service before step 4, which means a client could put data to this region after step 3. What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns this region to another regionserver. If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data. Therefore, the earlier edits may not be recovered through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4880: -- Affects Version/s: 0.94.0 0.19.2 Summary: Region is on service before openRegionHandler completes, may cause data loss (was: Region is on service before completing openRegionHanlder, may cause data loss) Region is on service before openRegionHandler completes, may cause data loss Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Affects Versions: 0.19.2, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (through it, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update .meta. table
4. update ZK's node state to RS_ZK_REGION_OPENED
{code}
We can see that the region is in service before step 4, which means a client could put data to this region after step 3. What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns this region to another regionserver. If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data. Therefore, the earlier edits may not be recovered through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Created] (HBASE-4975) fix spurious -1's from Hadoop QA
fix spurious -1's from Hadoop QA Key: HBASE-4975 URL: https://issues.apache.org/jira/browse/HBASE-4975 Project: HBase Issue Type: Bug Components: build Reporter: Eugene Koontz Priority: Minor Hadoop QA generated comments based on patches submitted to JIRAs; for example: https://issues.apache.org/jira/browse/HBASE-4960?focusedCommentId=13163191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13163191 There are some spurious -1's given to the patch. The patch only affects documentation, not source code, but Hadoop QA says that:
{noformat}
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.
{noformat}
Evidently Hadoop QA is not able to recall the set of Findbugs warnings from the previous build. (Of course the Findbugs warnings themselves should be addressed, but this patch could not have added to them.)
{noformat}
-1 javadoc. The javadoc tool appears to have generated -160 warning messages.
{noformat}
This should be 160 warning messages, not -160 warning messages. Thanks to NKeywal for suggesting that the relevant file is {{dev-support/test-patch.sh}}.
[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4880: -- Affects Version/s: (was: 0.19.2) 0.92.0 Region is on service before openRegionHandler completes, may cause data loss Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (through it, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update .meta. table
4. update ZK's node state to RS_ZK_REGION_OPENED
{code}
We can see that the region is in service before step 4, which means a client could put data to this region after step 3. What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns this region to another regionserver. If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data. Therefore, the earlier edits may not be recovered through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Commented] (HBASE-3857) Change the HFile Format
[ https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164613#comment-13164613 ] Jean-Daniel Cryans commented on HBASE-3857: --- Any reason why this patch is removing compaction and flush queue sizes?
{code}
-    this.metrics.compactionQueueSize.set(compactSplitThread
-        .getCompactionQueueSize());
-    this.metrics.flushQueueSize.set(cacheFlusher
-        .getFlushQueueSize());
{code}
If it was intentional, there's a bunch of dead code that also needs to be removed, like those methods that were called. If it wasn't, meaning there's currently no way in 0.92 to get the compaction queue size, then this would be sufficient for me to kill the RC. Change the HFile Format --- Key: HBASE-3857 URL: https://issues.apache.org/jira/browse/HBASE-3857 Project: HBase Issue Type: New Feature Affects Versions: 0.90.4 Reporter: Liyin Tang Assignee: Mikhail Bautin Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, hfile_format_v2_design_draft_0.4.odt In order to support HBASE-3763 and HBASE-3856, we need to change the format of the HFile. The new format proposal is attached here. Thanks to Mikhail Bautin for the documentation.
[jira] [Commented] (HBASE-3857) Change the HFile Format
[ https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164618#comment-13164618 ] Zhihong Yu commented on HBASE-3857: --- @J-D: Nice catch. We should open another JIRA to deal with queue sizes. Change the HFile Format --- Key: HBASE-3857 URL: https://issues.apache.org/jira/browse/HBASE-3857 Project: HBase Issue Type: New Feature Affects Versions: 0.90.4 Reporter: Liyin Tang Assignee: Mikhail Bautin Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, hfile_format_v2_design_draft_0.4.odt In order to support HBASE-3763 and HBASE-3856, we need to change the format of the HFile. The new format proposal is attached here. Thanks for Mikhail Bautin for the documentation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3857) Change the HFile Format
[ https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164619#comment-13164619 ] Jean-Daniel Cryans commented on HBASE-3857: --- Yeah I just want to verify first what's the situation, maybe I'm missing something. Change the HFile Format --- Key: HBASE-3857 URL: https://issues.apache.org/jira/browse/HBASE-3857 Project: HBase Issue Type: New Feature Affects Versions: 0.90.4 Reporter: Liyin Tang Assignee: Mikhail Bautin Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, hfile_format_v2_design_draft_0.4.odt In order to support HBASE-3763 and HBASE-3856, we need to change the format of the HFile. The new format proposal is attached here. Thanks for Mikhail Bautin for the documentation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint
[ https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164620#comment-13164620 ] Todd Lipcon commented on HBASE-4938: Dhruba, can you clarify a little more what the purpose of this change is? I didn't quite understand what you meant by "We have some internal HRegion API that needs to scan based on a external readPoint." You have some other non-HBase software which is using HBase's storage engine components? Create a HRegion.getScanner public method that allows reading from a specified readPoint Key: HBASE-4938 URL: https://issues.apache.org/jira/browse/HBASE-4938 Project: HBase Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Priority: Minor There is an existing api HRegion.getScanner(Scan) that allows scanning a table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)
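For readers unfamiliar with the idea, the toy sketch below shows what "scanning at a read point" means in general terms: each write carries a monotonically increasing number, and a scan at read point N only sees writes numbered N or lower. This is a conceptual model only; the `Cell` class, the `writeNumber` field and the `scan` helper are invented for the example and say nothing about how HRegion would implement the proposed getScanner(Scan, long readPoint).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: filtering a scan by an externally supplied read point. */
final class ReadPointScanSketch {

  /** Hypothetical cell: a value tagged with the number of the write that created it. */
  static final class Cell {
    final String value;
    final long writeNumber;

    Cell(String value, long writeNumber) {
      this.value = value;
      this.writeNumber = writeNumber;
    }
  }

  /** Returns the values visible at readPoint, i.e. written at or before it. */
  static List<String> scan(List<Cell> cells, long readPoint) {
    List<String> visible = new ArrayList<>();
    for (Cell c : cells) {
      if (c.writeNumber <= readPoint) {
        visible.add(c.value);
      }
    }
    return visible;
  }
}
```

Passing the read point in from outside, rather than taking the latest one, is what would let a caller re-read a consistent older view of the data.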
[jira] [Created] (HBASE-4976) Add compaction/flush queue size metrics mistakenly removed by HFile v2
Add compaction/flush queue size metrics mistakenly removed by HFile v2 -- Key: HBASE-4976 URL: https://issues.apache.org/jira/browse/HBASE-4976 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Attachment: 4610.txt Jonathan's patch from HBASE-3380, rebased for TRUNK. Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Status: Patch Available (was: Open) Patch testing. Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3848) request is always zero in WebUI for region server
[ https://issues.apache.org/jira/browse/HBASE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-3848. --- Resolution: Fixed The remaining work would be completed by HBASE-4977 request is always zero in WebUI for region server - Key: HBASE-3848 URL: https://issues.apache.org/jira/browse/HBASE-3848 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.2 Reporter: gaojinchao Assignee: gaojinchao Priority: Minor Attachments: RegionServer_90PatchV2.patch, RegionseverMetric_TrunkPathV2.patch request is always zero in WebUI for region server Metrics request=0.0, regions=36, stores=36, storefiles=148, storefileIndexSize=29, memstoreSize=253, compactionQueueSize=24, flushQueueSize=0, usedHeap=655, maxHeap=8175, blockCacheSize=14230920, blockCacheFree=1700269560, blockCacheCount=21, blockCacheHitCount=2887, blockCacheMissCount=204829, blockCacheEvictedCount=0, blockCacheHitRatio=1, blockCacheHitCachingRatio=99 requests is not zero in WebUI for Hmaster requests=15000, regions=35, usedHeap=513, maxHeap=8175 Is there any different for these metrics? How do I use it? Thanks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4977) Forward port HBASE-3848 to 0.92 and TRUNK
Forward port HBASE-3848 to 0.92 and TRUNK - Key: HBASE-4977 URL: https://issues.apache.org/jira/browse/HBASE-4977 Project: HBase Issue Type: Task Reporter: Ted Yu HBASE-3848, request is always zero in WebUI for region server, was integrated to 0.90 This JIRA is a forward port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4336) Convert source tree into maven modules
[ https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164654#comment-13164654 ]

Jesse Yates commented on HBASE-4336:

Started working on this. I have a fork up on github with hbase split into multiple modules (https://github.com/jyates/hbase) - a patch is just too massive to reasonably look at. Currently, the fork compiles and tests. Packaging is coming next.

How do we want to bundle each of the pieces? I was thinking of having jars for core, core-tests (so the minicluster can be used across modules), security, server and test.

The test module will be where we have the 'api level tests' discussed on dev@ recently. These are things that are run against a cluster and just test the interfaces. Here is where we would use failsafe to spin up a minicluster for local testing or connect out to a real cluster (all of this would be in follow-on JIRA(s)). It's _not_ intended as the place to put all the tests.

The assemble module would then combine all of these into a tar, rpm, etc. as needed.

Profiles would necessarily be split across multiple modules, as each module will require different things and I don't want to add the same dependency multiple times in different modules. This works nicely with Gary's original comment about just having the secure hadoop stuff in the security module (which translates to having the profile just in that module). The alternative would be to exclude certain dependencies in modules that don't need them, amounting to about the same amount of work across modules, but harder to reason about.

Feedback is appreciated.

Convert source tree into maven modules

Key: HBASE-4336
URL: https://issues.apache.org/jira/browse/HBASE-4336
Project: HBase
Issue Type: Task
Components: build
Reporter: Gary Helmling
Priority: Critical
Fix For: 0.94.0

When we originally converted the build to maven we had a single core module defined, but later reverted this to a module-less build for the sake of simplicity. It now looks like it's time to re-address this, as we have an actual need for modules to:
* provide a trimmed down client library that applications can make use of
* more cleanly support building against different versions of Hadoop, in place of some of the reflection machinations currently required
* incorporate the secure RPC engine that depends on some secure Hadoop classes

I propose we start simply by refactoring into two initial modules:
* core - common classes and utilities, and client-side code and interfaces
* server - master and region server implementations and supporting code

This would also lay the groundwork for incorporating the HBase security features that have been developed. Once the module structure is in place, security-related features could then be incorporated into a third module -- security -- after normal review and approval. The security module could then depend on secure Hadoop, without modifying the dependencies of the rest of the HBase code.
[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master
[ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164658#comment-13164658 ]

Ted Yu commented on HBASE-4729:

The HadoopQA report @ https://builds.apache.org/job/PreCommit-HBASE-Build/405//testReport/ showed basically no tests were run. A manual test suite execution should have been performed.

Clash between region unassign and splitting kills the master

Key: HBASE-4729
URL: https://issues.apache.org/jira/browse/HBASE-4729
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
Fix For: 0.92.0, 0.94.0
Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt

I was running an online alter while regions were splitting, and suddenly the master died and left my table half-altered (haven't restarted the master yet).

What killed the master:
{quote}
2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
    at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
    at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
    at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
    at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{quote}

A znode was created because the region server was splitting the region 4 seconds before:
{quote}
2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
...
2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT
2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
{quote}

Now that the master is dead the region server is spewing those last two lines like mad.
[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-4927:

Attachment: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch

0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?

Parent [A,)
L daughter [A,B), R daughter [B,)

When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164682#comment-13164682 ]

Hadoop QA commented on HBASE-4610:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506499/4610.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -160 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
    org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
    org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/463//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/463//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/463//console

This message is automatically generated.

Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

Key: HBASE-4610
URL: https://issues.apache.org/jira/browse/HBASE-4610
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Fix For: 0.92.1
Attachments: 4610.txt

Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs.

For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way.
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164686#comment-13164686 ]

Jonathan Hsieh commented on HBASE-4927:

The version initially committed with this patch made HRegionInfo's comparator declare region ['', '') smaller than ['', 'A'). Previously it was the other way around.

In the TestOfflineMeta* tests, the disableTable call eventually calls AssignmentManager#getRegionsOfTable(table). This returns 3 regions instead of 4, because it uses a boundary region which has [startkey='', endkey=''). The change likely left either the begin or end region out of this call's result.

The core problem is that the definition of greater-than or less-than for regions is inconsistent with respect to '' start and end keys.

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?

Parent [A,)
L daughter [A,B), R daughter [B,)

When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
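The ordering this issue asks for can be sketched in a few lines. This is not the committed patch and not the HBase comparator. It is a minimal illustration that assumes regions are plain (start, end) string pairs, that an empty end key means "end of table" and therefore compares as the largest end key, and that among regions with the same start key the one with the larger end key (the split parent) should come first. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SplitParentFirstSketch {
    /** Hypothetical region: a half-open key range [start, end); empty end means end of table. */
    static final class Region {
        final String start;
        final String end;

        Region(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Start key ascending; on ties the wider region first, with an empty end key treated as widest. */
    static final Comparator<Region> PARENT_FIRST = (left, right) -> {
        // An empty start key is the smallest string, so the first region sorts first naturally.
        int byStart = left.start.compareTo(right.start);
        if (byStart != 0) {
            return byStart;
        }
        boolean leftIsLast = left.end.isEmpty();
        boolean rightIsLast = right.end.isEmpty();
        if (leftIsLast && rightIsLast) {
            return 0;
        }
        if (leftIsLast) {
            return -1; // left runs to the end of the table, so it is the parent
        }
        if (rightIsLast) {
            return 1;
        }
        return right.end.compareTo(left.end); // larger end key first
    };

    /** Sorts the example from the issue: parent [A,), daughters [A,B) and [B,). */
    static List<String> sortedDemo() {
        List<Region> regions = new ArrayList<>(Arrays.asList(
            new Region("A", "B"), new Region("B", ""), new Region("A", "")));
        regions.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (Region r : regions) {
            out.add(r.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedDemo()); // prints [[A,), [A,B), [B,)]
    }
}
```

A plain byte-wise comparison of end keys would put the empty key first, which, combined with a larger-end-key-first rule, gives the wrong order quoted in the issue. The explicit empty-key checks are what make the parent sort ahead of its daughters here.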
[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-4927:

Status: Patch Available (was: Reopened)

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?

Parent [A,)
L daughter [A,B), R daughter [B,)

When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164695#comment-13164695 ]

Ted Yu commented on HBASE-4927:

Ran through the previously failing tests:
{code}
mt -Dtest=TestMasterRestartAfterDisablingTable
mt -Dtest=TestOfflineMetaRebuildBase#testMetaRebuild
mt -Dtest=TestOfflineMetaRebuildHole
{code}
They pass now.

Going to commit to 0.92 and TRUNK.

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?

Parent [A,)
L daughter [A,B), R daughter [B,)

When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164699#comment-13164699 ]

Ted Yu commented on HBASE-4927:

Also ran through the two tests in the original patch:
{code}
mt -Dtest=TestHRegionInfo
mt -Dtest=TestCatalogJanitor
{code}
They passed as well.

Integrated to 0.92 and TRUNK.

Thanks for the addendum, Jimmy.
Thanks for the help, Jonathan.

CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

Key: HBASE-4927
URL: https://issues.apache.org/jira/browse/HBASE-4927
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.92.0
Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt

When reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)?

Parent [A,)
L daughter [A,B), R daughter [B,)

When sorted, we get to the end key comparison, which results in this incorrect order:
[A,B), [A,), [B,)
We wanted:
[A,), [A,B), [B,)
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Attachment: (was: 4610.txt) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Status: Patch Available (was: Open) Patch testing now that HBASE-4927 addendum has been integrated Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:

Status: Open (was: Patch Available)

Region is on service before openRegionHandler completes, may cause data loss

Key: HBASE-4880
URL: https://issues.apache.org/jira/browse/HBASE-4880
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch

OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (through it, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update .meta. table
4. update ZK's node state to RS_ZK_REGION_OPENED
{code}
We can see that the region is in service before step 4, which means a client could put data to this region after step 3.

What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns the region to another regionserver.

If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data there. That data may therefore not be recovered through replayRecoveredEdit(), because the edit's LogSeqId is smaller than the current region SeqId.
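The window described in this issue can be modelled with a toy simulation. The sketch below is not HBase code and does not claim to show what the attached patches do. It assumes a region accepts puts only while marked online, that the ZooKeeper transition in step 4 fails, and that one client put arrives just before that transition is attempted. All names (OpenRegionOrderSketch, Region, editsAcceptedByFailedOpen) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class OpenRegionOrderSketch {
    /** Toy region: accepts puts only while it is marked online. */
    static final class Region {
        boolean online = false;
        final List<String> edits = new ArrayList<>();

        boolean put(String edit) {
            if (!online) {
                return false; // the client would be rejected and retry elsewhere
            }
            edits.add(edit);
            return true;
        }
    }

    /**
     * Runs the open sequence with a failing ZK transition and one client put
     * arriving just before the transition is attempted.
     * Returns how many edits the doomed region accepted.
     */
    static int editsAcceptedByFailedOpen(boolean onlineBeforeZk) {
        Region region = new Region();
        boolean zkTransitionSucceeds = false; // simulate step 4 failing

        if (onlineBeforeZk) {
            region.online = true; // steps 2-3: region is reachable before step 4
        }
        region.put("row1"); // client write arrives inside the window

        if (zkTransitionSucceeds) {
            region.online = true; // alternative order: go online only after step 4
        } else {
            region.online = false; // stand-in for cleanupFailedOpen
        }
        return region.edits.size();
    }

    public static void main(String[] args) {
        System.out.println(editsAcceptedByFailedOpen(true));  // prints 1
        System.out.println(editsAcceptedByFailedOpen(false)); // prints 0
    }
}
```

With the order from the issue, the region that is about to be cleaned up has already taken one edit, which is the edit at risk. In this model, deferring the online flag until after the state transition leaves nothing to lose, at the cost of the client seeing a rejection in the meantime.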
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Attachment: 4610.txt Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:

Attachment: 4880.txt

Region is on service before openRegionHandler completes, may cause data loss

Key: HBASE-4880
URL: https://issues.apache.org/jira/browse/HBASE-4880
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch

OpenRegionHandler in the regionserver proceeds through the following steps:
{code}
1. openregion() (through it, closed = false, closing = false)
2. addToOnlineRegions(region)
3. update .meta. table
4. update ZK's node state to RS_ZK_REGION_OPENED
{code}
We can see that the region is in service before step 4, which means a client could put data to this region after step 3.

What happens if step 4 fails? OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and the master assigns the region to another regionserver.

If closing the region fails, the data put between step 3 and step 4 may be lost, because the region has been opened on another regionserver and has received new data there. That data may therefore not be recovered through replayRecoveredEdit(), because the edit's LogSeqId is smaller than the current region SeqId.
[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4880: -- Status: Patch Available (was: Open) Patch testing again. Region is on service before openRegionHandler completes, may cause data loss Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
The OpenRegionHandler in the regionserver runs the following steps:
{code}
1. openregion() (after this, closed = false and closing = false)
2. addToOnlineRegions(region)
3. update the .meta. table
4. update the ZK node state to RS_ZK_REGION_OPENED
{code}
The region is therefore in service before step 4 completes, which means a client can put data to this region after step 3. What happens if step 4 fails? The handler executes OpenRegionHandler#cleanupFailedOpen, which closes the region, and the master assigns the region to another regionserver. If closing the region also fails, the data put between step 3 and step 4 may be lost, because the region has by then been opened on another regionserver and has received new data. That data may not be recoverable through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164733#comment-13164733 ] Hadoop QA commented on HBASE-4927: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506510/0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/464//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/464//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/464//console This message is automatically generated. 
CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty --- Key: HBASE-4927 URL: https://issues.apache.org/jira/browse/HBASE-4927 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.92.0 Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt While reviewing the HBASE-4238 backport, Jon found this issue. What happens if the split points are as follows (an empty end key is the last key, an empty start key is the first key)? Parent [A,), L daughter [A,B), R daughter [B,). When sorted, we get to the end key comparison, which results in this incorrect order: [A,B), [A,), [B,). We wanted: [A,), [A,B), [B,).
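The expected ordering can be sketched with a small stand-alone comparator. This is a minimal illustration under assumptions, not the committed patch: `SplitParentFirstSketch`, `Range`, and `PARENT_FIRST` are hypothetical names, and `Range` is a stand-in for a region's key range rather than the real HRegionInfo. The idea is to sort by start key first, then to put the wider range (the parent) first, treating an empty end key as "greater than every other key".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SplitParentFirstSketch {
    // Hypothetical stand-in for a region's [start, end) key range.
    static final class Range {
        final byte[] start;
        final byte[] end;
        Range(String start, String end) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.end = end.getBytes(StandardCharsets.UTF_8);
        }
        @Override public String toString() {
            return "[" + new String(start, StandardCharsets.UTF_8) + ","
                + new String(end, StandardCharsets.UTF_8) + ")";
        }
    }

    // Unsigned lexicographic byte comparison; an empty array sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    static final Comparator<Range> PARENT_FIRST = (l, r) -> {
        int c = compareBytes(l.start, r.start);
        if (c != 0) return c;
        boolean lOpen = l.end.length == 0; // empty end key means "last key"
        boolean rOpen = r.end.length == 0;
        if (lOpen && rOpen) return 0;
        if (lOpen) return -1;              // open-ended range is widest: parent first
        if (rOpen) return 1;
        return compareBytes(r.end, l.end); // reversed: larger end key sorts first
    };

    static List<String> sorted(Range... ranges) {
        List<Range> list = new ArrayList<>(Arrays.asList(ranges));
        list.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (Range r : list) out.add(r.toString());
        return out;
    }
}
```

Calling `sorted` on the three ranges from the report, in any input order, places the parent `[A,)` ahead of its daughters. A plain byte comparison of end keys would instead treat the empty key as the smallest value, which is the misordering described above.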
[jira] [Updated] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to des
[ https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4946: -- Attachment: 4946-v4.txt Patch v4 removes eager instantiation. HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to deserialize an unknown class. - Key: HBASE-4946 URL: https://issues.apache.org/jira/browse/HBASE-4946 Project: HBase Issue Type: Bug Components: coprocessors Affects Versions: 0.92.0 Reporter: Andrei Dragomir Assignee: Andrei Dragomir Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, HBASE-4946.patch
Loading coprocessor jars from hdfs works fine. I load the jar from the shell, after setting the attribute, and it gets loaded:
{noformat}
INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class com.MyCoprocessorClass needs to be loaded from a file - hdfs://localhost:9000/coproc/rt-0.0.1-SNAPSHOT.jar.
INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: com.MyCoprocessorClass
INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: RegionEnvironment createEnvironment
DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. protocol=com.MyCoprocessorClassProtocol
INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
{noformat}
The problem is that this coprocessor simply extends BaseEndpointCoprocessor, with a dynamic method. When this method is called from the client with HTable.coprocessorExec, I get errors on the HRegionServer, because the call cannot be deserialized from writables. The cause is that Exec tries to resolve the coprocessor class early.
The coprocessor class is loaded, but it is in the context of the HRegionServer / HRegion. So, the call fails: {noformat} 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Error in readFields java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943) at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122) ... 
10 more {noformat} Probably the correct way to fix this is to make Exec really smart, so that it knows all the class definitions loaded in the CoprocessorHost(s). I created a small patch that simply doesn't resolve the class definition in Exec, and instead passes it as a string down to the HRegion layer. This layer knows all the definitions, and simply loads the class by name.
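The approach described here, carrying the protocol name as a string and resolving it only where the class is known, can be sketched as follows. This is an illustration under assumptions, not the attached patch: `LazyProtocolSketch`, `ExecRequest`, and `RegionProtocolRegistry` are hypothetical names, and the registry stands in for whatever the region layer uses to track loaded coprocessor protocols.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LazyProtocolSketch {
    // Hypothetical request object: it carries only names, so deserializing it
    // never needs the coprocessor's class loader.
    static final class ExecRequest {
        final String protocolName;
        final String methodName;
        ExecRequest(String protocolName, String methodName) {
            this.protocolName = protocolName;
            this.methodName = methodName;
        }
    }

    // Hypothetical per-region registry of protocol classes that were loaded
    // through the coprocessor's own class loader.
    static final class RegionProtocolRegistry {
        private final Map<String, Class<?>> protocols = new ConcurrentHashMap<>();

        void register(Class<?> protocol) {
            protocols.put(protocol.getName(), protocol);
        }

        // Resolution happens here, late, where the class is actually known.
        Class<?> resolve(ExecRequest request) {
            Class<?> protocol = protocols.get(request.protocolName);
            if (protocol == null) {
                throw new IllegalArgumentException(
                    "Protocol class " + request.protocolName + " not registered");
            }
            return protocol;
        }
    }
}
```

The design choice is that the RPC layer never calls `Class.forName` on a name it cannot see; only the layer that loaded the jar maps the name back to a class.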
[jira] [Updated] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to des
[ https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4946: -- Status: Open (was: Patch Available) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to deserialize an unknown class. - Key: HBASE-4946 URL: https://issues.apache.org/jira/browse/HBASE-4946 Project: HBase Issue Type: Bug Components: coprocessors Affects Versions: 0.92.0 Reporter: Andrei Dragomir Assignee: Andrei Dragomir Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, HBASE-4946.patch
Loading coprocessor jars from hdfs works fine. I load the jar from the shell, after setting the attribute, and it gets loaded:
{noformat}
INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ...
INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class com.MyCoprocessorClass needs to be loaded from a file - hdfs://localhost:9000/coproc/rt-0.0.1-SNAPSHOT.jar.
INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: com.MyCoprocessorClass
INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: RegionEnvironment createEnvironment
DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. protocol=com.MyCoprocessorClassProtocol
INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
{noformat}
The problem is that this coprocessor simply extends BaseEndpointCoprocessor, with a dynamic method. When this method is called from the client with HTable.coprocessorExec, I get errors on the HRegionServer, because the call cannot be deserialized from writables. The cause is that Exec tries to resolve the coprocessor class early.
The coprocessor class is loaded, but it is in the context of the HRegionServer / HRegion. So, the call fails: {noformat} 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Error in readFields java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943) at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122) ... 
10 more {noformat} Probably the correct way to fix this is to make Exec really smart, so that it knows all the class definitions loaded in the CoprocessorHost(s). I created a small patch that simply doesn't resolve the class definition in Exec, and instead passes it as a string down to the HRegion layer. This layer knows all the definitions, and simply loads the class by name.
[jira] [Updated] (HBASE-4712) Document rules for writing tests
[ https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4712: - Attachment: test-doc-cleanup.txt Corrections from Jesse Yates Document rules for writing tests Key: HBASE-4712 URL: https://issues.apache.org/jira/browse/HBASE-4712 Project: HBase Issue Type: Task Components: test Affects Versions: 0.92.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.94.0 Attachments: 4712.txt, test-doc-cleanup.txt
We saw that some tests could be improved. Documenting the general rules could help.
Proposal:
HBase tests are divided into three categories: small, medium and large, with corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
Small tests are executed in parallel in a shared JVM. They must last less than 15 seconds. They must NOT use a cluster.
Medium tests are executed in a separate JVM. They must last less than 50 seconds. They can use a cluster. They must not fail occasionally.
Small and medium tests must not need more than 30 minutes to run altogether. Small and medium tests should be executed by developers before submitting a patch.
Large tests are everything else. They are typically integration tests, non-regression tests for specific bugs, timeout tests, and performance tests.
Rules and hints for tests:
- As much as possible, tests should be written as small tests.
- All tests should be written to support parallel execution on the same machine, and hence should not use shared resources such as fixed ports or fixed file names.
- All tests should be written to be as fast as possible.
- Tests should not overlog. More than 100 lines/second makes the logs hard to read and uses I/O that is then not available to the other tests.
- Tests can be written with HBaseTestingUtility. This class offers helper functions to create a temp directory and do the cleanup, or to start a cluster.
- Sleeps:
  - Tests should not do a 'Thread.sleep' without testing an ending condition. This makes clear what the test is waiting for. Moreover, the test will then work whatever the machine's performance.
  - Sleeps should be minimal so that tests run as fast as possible. Waiting for a variable should be done in a 40 ms sleep loop. Waiting for a socket operation should be done in a 200 ms sleep loop.
- Tests using a cluster:
  - Tests using an HRegion do not have to start a cluster: a region can use the local file system.
  - Starting and stopping a cluster costs around 10 seconds. Clusters should not be started per test method but per test class.
  - A started cluster must be shut down using HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
  - As much as possible, tests should use the default settings for the cluster. When they don't, they should document it. This will allow sharing the cluster later.
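The "sleep only inside a loop that tests an ending condition" rule might look like the helper below. This is a minimal sketch under assumptions: `WaitSketch.waitFor` is a hypothetical helper written for illustration, not an HBaseTestingUtility method, and the interval is whatever the caller passes (for example 40 ms when polling a variable, as the proposal suggests).

```java
import java.util.function.BooleanSupplier;

final class WaitSketch {
    // Polls the condition every intervalMs until it holds or timeoutMs elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the result, for instance `assertTrue(WaitSketch.waitFor(5000, 40, () -> flag.get()))`, so a reader sees exactly what is being waited for and the test finishes as soon as the condition holds.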
[jira] [Commented] (HBASE-4712) Document rules for writing tests
[ https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164748#comment-13164748 ] stack commented on HBASE-4712: -- I committed Jesse's addendum to TRUNK. Document rules for writing tests Key: HBASE-4712 URL: https://issues.apache.org/jira/browse/HBASE-4712 Project: HBase Issue Type: Task Components: test Affects Versions: 0.92.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.94.0 Attachments: 4712.txt, test-doc-cleanup.txt
We saw that some tests could be improved. Documenting the general rules could help.
Proposal:
HBase tests are divided into three categories: small, medium and large, with corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
Small tests are executed in parallel in a shared JVM. They must last less than 15 seconds. They must NOT use a cluster.
Medium tests are executed in a separate JVM. They must last less than 50 seconds. They can use a cluster. They must not fail occasionally.
Small and medium tests must not need more than 30 minutes to run altogether. Small and medium tests should be executed by developers before submitting a patch.
Large tests are everything else. They are typically integration tests, non-regression tests for specific bugs, timeout tests, and performance tests.
Rules and hints for tests:
- As much as possible, tests should be written as small tests.
- All tests should be written to support parallel execution on the same machine, and hence should not use shared resources such as fixed ports or fixed file names.
- All tests should be written to be as fast as possible.
- Tests should not overlog. More than 100 lines/second makes the logs hard to read and uses I/O that is then not available to the other tests.
- Tests can be written with HBaseTestingUtility. This class offers helper functions to create a temp directory and do the cleanup, or to start a cluster.
- Sleeps:
  - Tests should not do a 'Thread.sleep' without testing an ending condition. This makes clear what the test is waiting for. Moreover, the test will then work whatever the machine's performance.
  - Sleeps should be minimal so that tests run as fast as possible. Waiting for a variable should be done in a 40 ms sleep loop. Waiting for a socket operation should be done in a 200 ms sleep loop.
- Tests using a cluster:
  - Tests using an HRegion do not have to start a cluster: a region can use the local file system.
  - Starting and stopping a cluster costs around 10 seconds. Clusters should not be started per test method but per test class.
  - A started cluster must be shut down using HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
  - As much as possible, tests should use the default settings for the cluster. When they don't, they should document it. This will allow sharing the cluster later.
[jira] [Updated] (HBASE-4976) Add compaction/flush queue size metrics mistakenly removed by HFile v2
[ https://issues.apache.org/jira/browse/HBASE-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4976: - Description: Upping priority, and putting it against 0.92 since J-D fingered it as blocker. Which metrics in particular are missing? Hard to patch? Priority: Blocker (was: Major) Fix Version/s: 0.92.0 Add compaction/flush queue size metrics mistakenly removed by HFile v2 -- Key: HBASE-4976 URL: https://issues.apache.org/jira/browse/HBASE-4976 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Blocker Fix For: 0.92.0 Upping priority, and putting it against 0.92 since J-D fingered it as blocker. Which metrics in particular are missing? Hard to patch?
[jira] [Commented] (HBASE-4605) Constraints
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164765#comment-13164765 ] Hadoop QA commented on HBASE-4605: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12505280/java_HBASE-4605_v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 74 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/465//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/465//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/465//console This message is automatically generated. Constraints --- Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch From Jesse's comment on dev: {quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. 
Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table; if a put is invalid, the implementation can throw an exception that gets propagated back to the client, explaining why the put was invalid. Constraints would be set on a per-table basis, and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table. Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
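To make the proposal concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration: `ConstraintSketch`, `Constraint`, `ConstraintViolation`, `MaxKeyLength`, and the in-memory `Table` are hypothetical and are not the API from the attached patches. The sketch only shows the flow described in the quote: every constraint checks the key before the write, and a failure surfaces as an exception with an explanation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ConstraintSketch {
    // Hypothetical exception that carries the reason back to the caller.
    static final class ConstraintViolation extends RuntimeException {
        ConstraintViolation(String message) { super(message); }
    }

    // Hypothetical minimal interface: throw if the key must be rejected.
    interface Constraint {
        void check(byte[] rowKey);
    }

    // Example constraint: reject row keys longer than a fixed number of bytes.
    static final class MaxKeyLength implements Constraint {
        private final int max;
        MaxKeyLength(int max) { this.max = max; }
        @Override public void check(byte[] rowKey) {
            if (rowKey.length > max) {
                throw new ConstraintViolation("row key longer than " + max + " bytes");
            }
        }
    }

    // In-memory stand-in for a table with per-table constraints.
    static final class Table {
        private final List<Constraint> constraints = new ArrayList<>();
        private final List<byte[]> rows = new ArrayList<>();

        void addConstraint(Constraint c) { constraints.add(c); }

        void put(String rowKey) {
            byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
            for (Constraint c : constraints) {
                c.check(key); // any violation aborts the put before it is stored
            }
            rows.add(key);
        }

        int size() { return rows.size(); }
    }
}
```

With a `MaxKeyLength(3)` constraint registered, a put of "abc" is stored while a put of "abcd" throws and leaves the table unchanged.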
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164772#comment-13164772 ] Hadoop QA commented on HBASE-4610: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506517/4610.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/467//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/467//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/467//console This message is automatically generated. Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. 
For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way.
[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164773#comment-13164773 ] Hadoop QA commented on HBASE-4880: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506518/4880.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/466//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/466//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/466//console This message is automatically generated. Region is on service before openRegionHandler completes, may cause data loss Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
The OpenRegionHandler in the regionserver runs the following steps:
{code}
1. openregion() (after this, closed = false and closing = false)
2. addToOnlineRegions(region)
3. update the .meta. table
4. update the ZK node state to RS_ZK_REGION_OPENED
{code}
The region is therefore in service before step 4 completes, which means a client can put data to this region after step 3. What happens if step 4 fails? The handler executes OpenRegionHandler#cleanupFailedOpen, which closes the region, and the master assigns the region to another regionserver. If closing the region also fails, the data put between step 3 and step 4 may be lost, because the region has by then been opened on another regionserver and has received new data. That data may not be recoverable through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164777#comment-13164777 ] Zhihong Yu commented on HBASE-4880: --- Latest patch passes all tests. +1. Region is on service before openRegionHandler completes, may cause data loss Key: HBASE-4880 URL: https://issues.apache.org/jira/browse/HBASE-4880 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
The OpenRegionHandler in the regionserver runs the following steps:
{code}
1. openregion() (after this, closed = false and closing = false)
2. addToOnlineRegions(region)
3. update the .meta. table
4. update the ZK node state to RS_ZK_REGION_OPENED
{code}
The region is therefore in service before step 4 completes, which means a client can put data to this region after step 3. What happens if step 4 fails? The handler executes OpenRegionHandler#cleanupFailedOpen, which closes the region, and the master assigns the region to another regionserver. If closing the region also fails, the data put between step 3 and step 4 may be lost, because the region has by then been opened on another regionserver and has received new data. That data may not be recoverable through replayRecoveredEdit(), because the edits' LogSeqId is smaller than the current region SeqId.
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164785#comment-13164785 ] Zhihong Yu commented on HBASE-4610: --- Test suite passes. Will commit later today if no objections. Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way.
[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164797#comment-13164797 ] Hudson commented on HBASE-4927: --- Integrated in HBase-TRUNK #2524 (See [https://builds.apache.org/job/HBase-TRUNK/2524/]) HBASE-4927 Addendum fixes case where start key is empty and end key is empty tedyu : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty --- Key: HBASE-4927 URL: https://issues.apache.org/jira/browse/HBASE-4927 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.92.0 Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt When reviewing HBASE-4238 backporting, Jon found this issue. What happens if the split points are (empty end key is the last key, empty start key is the first key) Parent [A,) L daughter [A,B), R daughter [B,) When sorted, we gets to end key comparision which results in this incorrector order: [A,B), [A,), [B,) we wanted: [A,), [A,B), [B,) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
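The ordering described in the report can be sketched as follows. This is a simplified illustration, not the committed patch: `Range` and `PARENT_FIRST` are hypothetical names, keys are plain strings rather than byte arrays, and table names are ignored. The one rule it demonstrates is that an empty end key must be treated as the largest possible key, so a split parent sorts before its daughters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SplitParentFirstSketch {
    /** Hypothetical key range; an empty string means "open-ended" on that side. */
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Sorts by start key, then puts the widest range (the parent) first. */
    static final Comparator<Range> PARENT_FIRST = (l, r) -> {
        // An empty start key is the first key; "" already sorts before any other string.
        int byStart = l.start.compareTo(r.start);
        if (byStart != 0) {
            return byStart;
        }
        boolean lOpen = l.end.isEmpty();
        boolean rOpen = r.end.isEmpty();
        if (lOpen && rOpen) {
            return 0;
        }
        if (lOpen) {
            return -1; // an empty end key is the last key, so this range is the wider one
        }
        if (rOpen) {
            return 1;
        }
        return r.end.compareTo(l.end); // larger end key (wider range) first
    };

    /** Sorts the given ranges and returns their printed forms. */
    static List<String> sorted(Range... ranges) {
        List<Range> list = new ArrayList<>(Arrays.asList(ranges));
        list.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (Range range : list) {
            out.add(range.toString());
        }
        return out;
    }
}
```

Comparing end keys as ordinary strings would place the empty end key of `[A,)` before `B`, which is the misordering reported; the explicit empty-key checks avoid that.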
[jira] [Commented] (HBASE-4956) Control direct memory buffer consumption by HBaseClient
[ https://issues.apache.org/jira/browse/HBASE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164798#comment-13164798 ] Zhihong Yu commented on HBASE-4956: --- Since the proposal involves asynchronous communication, we should devise new API which can be used to validate the reduction in use of direct memory buffer. Control direct memory buffer consumption by HBaseClient --- Key: HBASE-4956 URL: https://issues.apache.org/jira/browse/HBASE-4956 Project: HBase Issue Type: New Feature Reporter: Ted Yu As Jonathan explained here https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357?pli=1 , standard hbase client inadvertently consumes large amount of direct memory. We should consider using netty for NIO-related tasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty
[ https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4927: -- Resolution: Fixed Status: Resolved (was: Patch Available) TRUNK build is back to normal. Resolving again. CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty --- Key: HBASE-4927 URL: https://issues.apache.org/jira/browse/HBASE-4927 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.92.0 Attachments: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, hbase-4927-fix-ws.txt When reviewing HBASE-4238 backporting, Jon found this issue. What happens if the split points are (empty end key is the last key, empty start key is the first key) Parent [A,) L daughter [A,B), R daughter [B,) When sorted, we gets to end key comparision which results in this incorrector order: [A,B), [A,), [B,) we wanted: [A,), [A,B), [B,) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d
[ https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164805#comment-13164805 ] Hadoop QA commented on HBASE-4946: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506521/4946-v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -160 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 73 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/468//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/468//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/468//console This message is automatically generated. HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to deserialize an unknown class. - Key: HBASE-4946 URL: https://issues.apache.org/jira/browse/HBASE-4946 Project: HBase Issue Type: Bug Components: coprocessors Affects Versions: 0.92.0 Reporter: Andrei Dragomir Assignee: Andrei Dragomir Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, HBASE-4946.patch Loading coprocessors jars from hdfs works fine. I load it from the shell, after setting the attribute, and it gets loaded: {noformat} INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor config now ... 
INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class com.MyCoprocessorClass needs to be loaded from a file - hdfs://localhost:9000/coproc/rt- 0.0.1-SNAPSHOT.jar. INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: com.MyCoprocessorClass INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: RegionEnvironment createEnvironment DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. protocol=com.MyCoprocessorClassProtocol INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load coprocessor com.MyCoprocessorClass from HTD of t1 successfully. {noformat} The problem is that this coprocessors simply extends BaseEndpointCoprocessor, with a dynamic method. When calling this method from the client with HTable.coprocessorExec, I get errors on the HRegionServer, because the call cannot be deserialized from writables. The problem is that Exec tries to do an early resolve of the coprocessor class. The coprocessor class is loaded, but it is in the context of the HRegionServer / HRegion. 
So, the call fails: {noformat} 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Error in readFields java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at
[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164827#comment-13164827 ] Jonathan Hsieh commented on HBASE-4974: --- The test failures are related to a problem in HBASE-4927. An addendum was added and those 3 tests should pass now. Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164830#comment-13164830 ] Jonathan Hsieh commented on HBASE-4972: --- Ted has filed HBASE-4977 and closed HBASE-3848. I will resolve this issue as Not a bug. Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backport this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 !
[jira] [Resolved] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh resolved HBASE-4972. --- Resolution: Not A Problem Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164835#comment-13164835 ] Jonathan Hsieh commented on HBASE-4610: --- I had started doing this also -- are you sure you want to keep the 'if (count == oldcount && count > 0) break' line? It was removed in the 0.90 version.
{code}
+long slept = 0;
 for (int oldcount = countOfRegionServers(); !this.master.isStopped();) {
   Thread.sleep(interval);
+  slept += interval;
   count = countOfRegionServers();
   if (count == oldcount && count > 0) break;
   String msg;
+  if (count == oldcount && count >= minToStart && slept >= timeout) {
+    LOG.info("Finished waiting for regionserver count to settle; " +
+      "count=" + count + ", sleptFor=" + slept);
+    break;
{code}
Before and after test, TestMasterFailover seemed flaky for me on the 0.92 branch. Is the plan for this 0.92.0 or 0.92.1? Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way.
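The wait loop under discussion can be sketched in a self-contained form. This is an illustration under stated assumptions rather than the ServerManager code: the class name, the method name, and the use of `IntSupplier` and `BooleanSupplier` in place of `countOfRegionServers()` and `master.isStopped()` are all hypothetical. It omits the early `break` questioned in the comment, so the loop exits only once the count is stable, has reached `minToStart`, and at least `timeoutMs` has been slept.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

class WaitForRegionServersSketch {
    /**
     * Polls the regionserver count until it stops changing, has reached
     * minToStart, and at least timeoutMs has been slept; returns the last count seen.
     */
    static int waitForCountToSettle(IntSupplier countOfRegionServers,
                                    BooleanSupplier stopped,
                                    int minToStart,
                                    long intervalMs,
                                    long timeoutMs) {
        long slept = 0;
        int oldCount = countOfRegionServers.getAsInt();
        int count = oldCount;
        while (!stopped.getAsBoolean()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return count;
            }
            slept += intervalMs;
            count = countOfRegionServers.getAsInt();
            if (count == oldCount && count >= minToStart && slept >= timeoutMs) {
                break; // count has settled and the minimum wait has elapsed
            }
            oldCount = count;
        }
        return count;
    }
}
```

With the early exit gone, a master that sees a single regionserver check in keeps waiting for stragglers until the timeout has passed, which appears to be the behavior the added config parameters are meant to control.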
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164839#comment-13164839 ] Zhihong Yu commented on HBASE-4610: --- Thanks for the review Jonathan. The first break statement should be removed. I ran TestMasterFailover on MacBook and didn't see failure. I think this should go to 0.92.0 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.1 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4974: --- Status: Open (was: Patch Available) Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch, 4974_all.v2.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4610: -- Fix Version/s: (was: 0.92.1) 0.94.0 0.92.0 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0, 0.94.0 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
[ https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164841#comment-13164841 ] stack commented on HBASE-4972: -- Nice work Jon 'Auditor' Hsieh. Thanks. Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch. -- Key: HBASE-4972 URL: https://issues.apache.org/jira/browse/HBASE-4972 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.92.0 There are several issues that have been committed in the 0.90 branch but were not in trunk/0.92 branch. These regressions should be forward ported. HBASE-3320 ! HBASE-3380 ! - HBASE-4610 is a jira to backports this, but it is not done. HBASE-3410 ! HBASE-3501 ! HBASE-3714 ! HBASE-3729 !! Marked in 0.92 but not committed there, committed in 0.90 branch. HBASE-3848 ! HBASE-3892 ! * Comments say trunk does not need. HBASE-3906 ! HBASE-3989 ! HBASE-4109 ! HBASE-4160 !! Marked resolved 0.90.5, but no corresponding commit in either 0.90 or 0.92 HBASE-4423 ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests
[ https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4974: --- Status: Patch Available (was: Open) Thanks for the info, Jon. Let's retry then. Remove some resources leaks on the tests Key: HBASE-4974 URL: https://issues.apache.org/jira/browse/HBASE-4974 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 4974_all.patch, 4974_all.v2.patch Cf. title and HBASE-4965 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164844#comment-13164844 ] Jonathan Hsieh commented on HBASE-4610: --- I think if the tests are no worse than before, 0.92.0 sounds reasonable to me. Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0, 0.94.0 Attachments: 4610.txt Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira