[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164260#comment-13164260
 ] 

Hudson commented on HBASE-4927:
---

Integrated in HBase-0.92-security #32 (See 
[https://builds.apache.org/job/HBase-0.92-security/32/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as 
expected, for the last region when the endkey is empty

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 What happens if the split points are as follows (an empty end key is the last
 key, an empty start key is the first key)?
 Parent [A,)
 L daughter [A,B)
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect
 order:
 [A,B), [A,), [B,)
 We wanted:
 [A,), [A,B), [B,)
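
The wanted ordering can be sketched as follows. This is a minimal illustration, not the committed patch: the Region class and the SPLIT_PARENT_FIRST comparator are hypothetical stand-ins that use String keys instead of HBase byte arrays, and the sketch assumes that an empty end key means "end of table" and so must compare greater than any non-empty end key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a region [startKey, endKey); an empty endKey
// means "up to the end of the table".
final class Region {
    final String startKey;
    final String endKey;

    Region(String startKey, String endKey) {
        this.startKey = startKey;
        this.endKey = endKey;
    }

    @Override
    public String toString() {
        return "[" + startKey + "," + endKey + ")";
    }
}

final class SplitParentFirstSketch {
    // Sorts by start key ascending. For equal start keys, the region with the
    // larger end key (the split parent) comes first, and an empty end key is
    // treated as larger than every non-empty end key.
    static final Comparator<Region> SPLIT_PARENT_FIRST = (left, right) -> {
        int byStart = left.startKey.compareTo(right.startKey);
        if (byStart != 0) {
            return byStart;
        }
        boolean leftOpen = left.endKey.isEmpty();
        boolean rightOpen = right.endKey.isEmpty();
        if (leftOpen || rightOpen) {
            // The open-ended region sorts first; two open-ended regions tie.
            return leftOpen == rightOpen ? 0 : (leftOpen ? -1 : 1);
        }
        // Larger end key first, so reverse the natural comparison.
        return right.endKey.compareTo(left.endKey);
    };

    static List<Region> sorted(Region... regions) {
        List<Region> result = new ArrayList<>(Arrays.asList(regions));
        result.sort(SPLIT_PARENT_FIRST);
        return result;
    }

    public static void main(String[] args) {
        // prints [[A,), [A,B), [B,)]
        System.out.println(sorted(
            new Region("A", "B"), new Region("B", ""), new Region("A", "")));
    }
}
```

The special case for the empty end key is the point of the sketch: a plain lexicographic comparison would rank the empty key lowest and put the daughter [A,B) ahead of its parent [A,).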

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164259#comment-13164259
 ] 

Hudson commented on HBASE-4729:
---

Integrated in HBase-0.92-security #32 (See 
[https://builds.apache.org/job/HBase-0.92-security/32/])
HBASE-4729 Clash between region unassign and splitting kills the master

stack : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java


 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
     at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
     at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
 region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: 
 regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
 f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Attempting to transition node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
 master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.





[jira] [Updated] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4970:
--

Attachment: HBASE-4970_Branch90.patch

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5

 Attachments: HBASE-4970_Branch90.patch


 In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth of
 RES slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our
 problem (the increase of the RES value)?
 You can go through the source of sun.nio.ch.Util. Every thread holds 3
 soft references to direct buffers (mustang src) for reuse. The code names the 3
 soft references buffercache. If the buffers are all occupied or none is
 suitable in size when a new request comes in, a new direct buffer is allocated.
 After the request is served, the bigger buffer replaces the smaller one in
 buffercache, and the replaced buffer is released.
 So I think we can add a parameter to change the keepAliveTime of the HTable
 thread pool.
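
As a rough illustration of the proposal, the sketch below builds a thread pool whose keep-alive time is read from configuration instead of being hard-coded. It is not the attached patch: the class name, the use of java.util.Properties in place of the HBase configuration, and the property key example.htable.threads.keepalivetime are all assumptions made for the example.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAlivePoolSketch {
    // Hypothetical property name, all lower case; not taken from the patch.
    static final String KEEP_ALIVE_KEY = "example.htable.threads.keepalivetime";
    static final long DEFAULT_KEEP_ALIVE_SECONDS = 60L;

    // Builds a pool whose idle non-core threads are kept for the configured
    // number of seconds, falling back to 60 s when the property is absent.
    static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
        long keepAliveSeconds = Long.parseLong(conf.getProperty(
            KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
        return new ThreadPoolExecutor(
            1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEEP_ALIVE_KEY, "3600");
        ThreadPoolExecutor pool = newPool(conf, 4);
        // prints 3600
        System.out.println(pool.getKeepAliveTime(TimeUnit.SECONDS));
        pool.shutdown();
    }
}
```

With a longer keep-alive time, idle threads are recycled less often, which fits the observation above that threads and their per-thread buffer caches are then created and discarded less frequently.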





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164274#comment-13164274
 ] 

Ted Yu commented on HBASE-4970:
---

I think there shouldn't be upper case letters in the name of the new config.

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5

 Attachments: HBASE-4970_Branch90.patch


 In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth of
 RES slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our
 problem (the increase of the RES value)?
 You can go through the source of sun.nio.ch.Util. Every thread holds 3
 soft references to direct buffers (mustang src) for reuse. The code names the 3
 soft references buffercache. If the buffers are all occupied or none is
 suitable in size when a new request comes in, a new direct buffer is allocated.
 After the request is served, the bigger buffer replaces the smaller one in
 buffercache, and the replaced buffer is released.
 So I think we can add a parameter to change the keepAliveTime of the HTable
 thread pool.





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164276#comment-13164276
 ] 

gaojinchao commented on HBASE-4970:
---

Sorry, I didn't see Lars's comment. I will try to backport HBASE-4805.

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5

 Attachments: HBASE-4970_Branch90.patch


 In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth of
 RES slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our
 problem (the increase of the RES value)?
 You can go through the source of sun.nio.ch.Util. Every thread holds 3
 soft references to direct buffers (mustang src) for reuse. The code names the 3
 soft references buffercache. If the buffers are all occupied or none is
 suitable in size when a new request comes in, a new direct buffer is allocated.
 After the request is served, the bigger buffer replaces the smaller one in
 buffercache, and the replaced buffer is released.
 So I think we can add a parameter to change the keepAliveTime of the HTable
 thread pool.





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164281#comment-13164281
 ] 

Hudson commented on HBASE-4927:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as 
expected, for the last region when the endkey is empty

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 What happens if the split points are as follows (an empty end key is the last
 key, an empty start key is the first key)?
 Parent [A,)
 L daughter [A,B)
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect
 order:
 [A,B), [A,), [B,)
 We wanted:
 [A,), [A,B), [B,)





[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164280#comment-13164280
 ] 

Hudson commented on HBASE-4729:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4729 Clash between region unassign and splitting kills the master

stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
     at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
     at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
 region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: 
 regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
 f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Attempting to transition node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
 master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.





[jira] [Commented] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164278#comment-13164278
 ] 

Hudson commented on HBASE-4968:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4968 Add to troubleshooting workaround for direct buffer oome's.

stack : 
Files : 
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Add to troubleshooting workaround for direct buffer oome's.
 ---

 Key: HBASE-4968
 URL: https://issues.apache.org/jira/browse/HBASE-4968
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: client.oome.txt


 Put into the book the workaround arrived at on the list while discussing the
 client OOME.





[jira] [Commented] (HBASE-4376) Document login configuration when running on top of secure Hadoop with Kerberos auth enabled

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164282#comment-13164282
 ] 

Hudson commented on HBASE-4376:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4376 Document mutual authentication between HBase and Zookeeper using 
SASL

stack : 
Files : 
* /hbase/trunk/src/docbkx/configuration.xml


 Document login configuration when running on top of secure Hadoop with 
 Kerberos auth enabled
 

 Key: HBASE-4376
 URL: https://issues.apache.org/jira/browse/HBASE-4376
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.90.4
Reporter: Gary Helmling

 We provide basic support for HBase to run on top of kerberos-authenticated 
 Hadoop, by providing configuration options to have HMaster and HRegionServer 
 login from a keytab on startup.  But this isn't documented anywhere outside 
 of hbase-default.xml.  We need to provide some basic guidance on setup in the 
 HBase docs.





[jira] [Commented] (HBASE-4712) Document rules for writing tests

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164279#comment-13164279
 ] 

Hudson commented on HBASE-4712:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4712 Document rules for writing tests

stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml


 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4712.txt


 We saw that some tests could be improved. Documenting the general rules could
 help.
 Proposal:
 HBase tests are divided into three categories: small, medium and large, with
 corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
 Small tests are executed in parallel in a shared JVM. They must last less
 than 15 seconds. They must NOT use a cluster.
 Medium tests are executed in a separate JVM. They must last less than 50
 seconds. They can use a cluster. They must not fail occasionally.
 Small and medium tests must not need more than 30 minutes to run altogether.
 Small and medium tests should be executed by the developers before submitting
 a patch.
 Large tests are everything else. They are typically integration tests,
 non-regression tests for specific bugs, timeout tests, performance tests.
 Test rules & hints are:
 - As much as possible, tests should be written as small tests.
 - All tests should be written to support parallel execution on the same
 machine, hence should not use shared resources such as fixed ports or fixed
 file names.
 - All tests should be written to be as fast as possible.
 - Tests should not overlog. More than 100 lines/second makes the logs complex
 to read and uses I/O that is then not available to the other tests.
 - Tests can be written with HBaseTestingUtility. This class offers helper
 functions to create a temp directory and do the cleanup, or to start a cluster.
 - Sleeps:
   - Tests should not do a 'Thread.sleep' without testing an ending
   condition. This allows understanding what the test is waiting for. Moreover,
   the test will then work whatever the machine's performance.
   - Sleeps should be minimal, to be as fast as possible. Waiting for a
   variable should be done in a 40 ms sleep loop. Waiting for a socket
   operation should be done in a 200 ms sleep loop.
 - Tests using a cluster:
   - Tests using an HRegion do not have to start a cluster: a region can use
   the local file system.
   - Starting/stopping a cluster costs around 10 seconds. Clusters should not
   be started per test method but per class.
   - A started cluster must be shut down using
   HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
   - As much as possible, tests should use the default settings for the
   cluster. When they don't, they should document it. This will allow sharing
   the cluster later.
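
The "Sleeps" rules above could be followed with a small polling helper like the one below. This is only a sketch under assumptions: WaitForSketch and waitFor are hypothetical names, not part of HBaseTestingUtility, and the 5 second timeout in the usage is an arbitrary choice for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

final class WaitForSketch {
    // Polls the ending condition every intervalMillis until it holds or
    // timeoutMillis elapses. Returns true if the condition held in time,
    // false on timeout or if the waiting thread was interrupted.
    static boolean waitFor(long timeoutMillis, long intervalMillis,
                           BooleanSupplier condition) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicBoolean done = new AtomicBoolean(false);
        new Thread(() -> done.set(true)).start();
        // 40 ms polling interval, as suggested for waiting on a variable.
        System.out.println(waitFor(5_000, 40, done::get));
    }
}
```

Because the loop names its ending condition explicitly, a reader can see what the test is waiting for, and a slow machine only makes the wait longer rather than making the test fail.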





[jira] [Commented] (HBASE-4964) Add builddate, make less sections in toc, and add header and footer customizations

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164283#comment-13164283
 ] 

Hudson commented on HBASE-4964:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4964 Add builddate, make less sections in toc, and add header and 
footer customizations

stack : 
Files : 
* /hbase/trunk/pom.xml
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/customization.xsl


 Add builddate, make less sections in toc, and add header and footer 
 customizations
 --

 Key: HBASE-4964
 URL: https://issues.apache.org/jira/browse/HBASE-4964
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.94.0

 Attachments: 4964.txt


 The customizations are for adding Facebook comments.  I tried it, but it is
 not working for me yet; it needs some XSL jujitsu so I can get the name of the
 current page into the current footer.
 Added a buildDate define in ISO 8601 format to the pom, used in the
 'reference guide'.





[jira] [Commented] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164284#comment-13164284
 ] 

Hudson commented on HBASE-4936:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4936 Cached HRegionInterface connections crash when getting 
UnknownHost exceptions

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


 Cached HRegionInterface connections crash when getting UnknownHost exceptions
 -

 Key: HBASE-4936
 URL: https://issues.apache.org/jira/browse/HBASE-4936
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.94.0

 Attachments: HBASE-4936-v2.patch, HBASE-4936.patch


 This issue is unlikely to come up in a cluster test case. However, during
 development, the following happens:
 1. Start the HBase cluster locally, on network A (DNS A, etc.)
 2. The region locations are cached using the hostname
 (mycomputer.company.com, 211.x.y.z - real ip)
 3. Change network location (go home)
 4. Start the HBase cluster locally. My hostname / ips are now different
 (mycomputer, 192.168.0.130 - new ip)
 If the region locations have been cached using the hostname, there is an
 UnknownHostException in CatalogTracker.getCachedConnection(ServerName sn)
 that is not handled by the catch statements. The server will crash constantly.
 The error should be caught and not rethrown, so that the cached connection
 expires normally.
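
The proposed handling can be sketched in isolation as below. This is not the CatalogTracker code or the attached patch: CachedConnectionSketch, its Resolver interface and resolveOrNull are hypothetical names, and returning null to signal "no usable connection" is an assumption made for the illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class CachedConnectionSketch {
    // Hypothetical resolver hook so the behaviour can be tried without DNS.
    interface Resolver {
        InetAddress resolve(String hostname) throws UnknownHostException;
    }

    // Returns the resolved address, or null when the cached hostname can no
    // longer be resolved. The exception is logged and swallowed so that the
    // caller can let the stale cache entry expire instead of crashing.
    static InetAddress resolveOrNull(Resolver resolver, String hostname) {
        try {
            return resolver.resolve(hostname);
        } catch (UnknownHostException e) {
            System.err.println("Cached host no longer resolves: " + hostname);
            return null;
        }
    }

    public static void main(String[] args) {
        // Simulates the hostname cached on the previous network.
        Resolver failing = host -> { throw new UnknownHostException(host); };
        System.out.println(resolveOrNull(failing, "mycomputer.company.com"));
        // A literal IP address is parsed without a DNS lookup.
        System.out.println(resolveOrNull(InetAddress::getByName, "127.0.0.1"));
    }
}
```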





[jira] [Commented] (HBASE-4937) Error in Quick Start Shell Exercises

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164285#comment-13164285
 ] 

Hudson commented on HBASE-4937:
---

Integrated in HBase-TRUNK-security #24 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/24/])
HBASE-4937 Error in Quick Start Shell Exercises

stack : 
Files : 
* /hbase/trunk/src/docbkx/getting_started.xml


 Error in Quick Start Shell Exercises
 

 Key: HBASE-4937
 URL: https://issues.apache.org/jira/browse/HBASE-4937
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Ryan Berdeen
Assignee: stack
 Fix For: 0.94.0

 Attachments: 4937.txt


 The shell exercises in the Quick Start
 (http://hbase.apache.org/book/quickstart.html) start with
 {code}
 hbase(main):003:0> create 'test', 'cf'
 0 row(s) in 1.2200 seconds
 hbase(main):003:0> list 'table'
 test
 1 row(s) in 0.0550 seconds
 {code}
 It looks like the second command is wrong. Running it, the actual output is
 {code}
 hbase(main):001:0> create 'test', 'cf'
 0 row(s) in 0.3630 seconds
 hbase(main):002:0> list 'table'
 TABLE


 0 row(s) in 0.0100 seconds
 {code}
 The argument to list should be 'test', not 'table', and the output in the
 example is missing the {{TABLE}} line.





[jira] [Updated] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4965:
---

Attachment: 4965_all.patch

 Monitor the open file descriptors and the threads counters during the unit 
 tests
 

 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4965_all.patch, ResourceChecker.java, 
 ResourceCheckerJUnitRule.java


 We're seeing a lot of issues with hadoop-qa related to threads or file
 descriptors.
 Monitoring these counters would ease the analysis.
 Note as well that
  - if we want to execute the tests in the same JVM (because the test is small
 or because we want to share the cluster) we can't afford to leak too many
 resources
  - if the tests leak, it's more difficult to detect a leak in the software
 itself.
 I attach the piece of code that I used. It requires two lines in a unit test
 class to:
 - before every test, count the threads and the open file descriptors
 - after every test, compare with the previous values.
 I ran it on some tests; we have for example:
 - client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232
 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses
 232 threads!
 - client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151),
 283 file descriptors (was 282).
 - client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was
 294), 815 file descriptors (was 461)
 - client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was
 148), 310 file descriptors (was 307).
 These are not always leaks; we can expect some pooling effects. But still...
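
The before/after comparison described above could look roughly like the sketch below. It is not the attached ResourceChecker.java: ResourceSnapshotSketch is a hypothetical class, and it assumes a JVM that exposes com.sun.management.UnixOperatingSystemMXBean for the file descriptor count, reporting -1 where that bean is unavailable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class ResourceSnapshotSketch {
    final int threads;
    // -1 when the JVM does not expose the open file descriptor count.
    final long openFileDescriptors;

    private ResourceSnapshotSketch(int threads, long openFileDescriptors) {
        this.threads = threads;
        this.openFileDescriptors = openFileDescriptors;
    }

    // Captures the current live thread count and, where available, the
    // number of open file descriptors of this JVM.
    static ResourceSnapshotSketch take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1L;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                .getOpenFileDescriptorCount();
        }
        return new ResourceSnapshotSketch(threads, fds);
    }

    // Formats this snapshot against an earlier one, in the style of the
    // figures quoted above.
    String describeChangeSince(ResourceSnapshotSketch before) {
        return threads + " threads (was " + before.threads + "), "
            + openFileDescriptors + " file descriptors (was "
            + before.openFileDescriptors + ")";
    }

    public static void main(String[] args) {
        ResourceSnapshotSketch before = take();   // before the test
        // ... the test body would run here ...
        ResourceSnapshotSketch after = take();    // after the test
        System.out.println(after.describeChangeSince(before));
    }
}
```

In a unit test class, the two take() calls would typically sit in the before and after hooks of the test framework, which matches the "two lines" mentioned above.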





[jira] [Updated] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4965:
---

Status: Patch Available  (was: Open)





[jira] [Created] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Created) (JIRA)
Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps
-

 Key: HBASE-4971
 URL: https://issues.apache.org/jira/browse/HBASE-4971
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0
 Attachments: 4971.patch

The comment says "Flush tables. Since flushing is asynchronous, sleep for a bit.", 
but the function is synchronous.





[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4971:
---

Attachment: 4971.patch





[jira] [Updated] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4970:
--

Attachment: HBASE-4970_Branch90_V1_trial.patch

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5

 Attachments: HBASE-4970_Branch90.patch, 
 HBASE-4970_Branch90_V1_trial.patch


 In my cluster, I changed keepAliveTime from 60 s to 3600 s. The growth of RES 
 slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our 
 problem (the RES value increasing)?
 You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft 
 references to direct buffers (mustangsrc) for reuse. The code names these 3 
 soft references the buffercache. If all the buffers are occupied or none is 
 of a suitable size when a new request comes in, a new direct buffer is 
 allocated. After the request is served, the bigger buffer replaces the 
 smaller one in the buffercache, and the replaced buffer is released.
 So I think we can add a parameter to change the keepAliveTime of the HTable 
 thread pool.
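
A minimal sketch of what such a parameter might look like, using only java.util.concurrent. It is not taken from the attached patch: the property name, the default of 60 s, and the use of java.util.Properties in place of the HBase configuration object are all assumptions made for illustration.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a thread pool whose keepAliveTime comes from configuration. */
final class KeepAlivePoolSketch {
    // Hypothetical property name, chosen for this example only.
    static final String KEEP_ALIVE_KEY = "example.htable.threads.keepalivetime";
    static final long DEFAULT_KEEP_ALIVE_SECONDS = 60;

    /** Create a pool whose idle threads live for the configured number of seconds. */
    static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
        long keepAlive = Long.parseLong(
                conf.getProperty(KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, maxThreads, keepAlive, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        // allowCoreThreadTimeOut(true) rejects a zero keep-alive, so guard it.
        if (keepAlive > 0) {
            pool.allowCoreThreadTimeOut(true);
        }
        return pool;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEEP_ALIVE_KEY, "3600"); // the value tried in the cluster above
        ThreadPoolExecutor pool = newPool(conf, 8);
        System.out.println("keepAlive seconds: " + pool.getKeepAliveTime(TimeUnit.SECONDS));
        pool.shutdown();
    }
}
```

With a longer keep-alive, idle worker threads survive between bursts of requests, so whatever they cache per thread is kept rather than rebuilt by a replacement thread.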





[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4971:
---

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164364#comment-13164364
 ] 

gaojinchao commented on HBASE-4970:
---

Addressed Lars's comment.

@Lars
Please review it first; I will test it on a real cluster tomorrow.






[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164366#comment-13164366
 ] 

Hadoop QA commented on HBASE-4971:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506434/4971.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/460//console

This message is automatically generated.





[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4971:
---

Attachment: 4971_all.v2.patch





[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4971:
---

Status: Patch Available  (was: Open)





[jira] [Updated] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4971:
---

Status: Open  (was: Patch Available)





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164391#comment-13164391
 ] 

Hadoop QA commented on HBASE-4965:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506433/4965_all.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 755 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/459//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/459//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/459//console

This message is automatically generated.





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164396#comment-13164396
 ] 

nkeywal commented on HBASE-4965:


First, Hadoop QA seems to be configured with 1024 file descriptors:

{noformat}
2011-12-07 13:16:26,184 ERROR [main] hbase.ResourceChecker(122): Bad 
configuration: the operating systems file handles maximum is 1024 our is 1
{noformat}






[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164401#comment-13164401
 ] 

nkeywal commented on HBASE-4965:


The error seems unrelated to my patch. It's the same error for the 3 patches.

{noformat}
expected:[NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, 
NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED] but was:[NOT_IN_META, 
NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED]
{noformat}





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164412#comment-13164412
 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
---

The patch looks fine to me. Checking the test failures.
@Chunhui
Have you done some testing with this patch?
Nice work.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 We can see that the region is in service before step 4.
 This means a client could put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the 
 region, and the master assigns this region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. Therefore, the earlier edits may not be recovered 
 through replayRecoveredEdit(), because their LogSeqId is smaller than the 
 current region SeqId.
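
The last point can be illustrated with a toy model of the replay filter. This is not HBase code: the class and method names are hypothetical, and it assumes that replay skips every edit whose sequence id is not greater than the region's current sequence id (the description above only says "smaller"; treating an equal id as skipped is an assumption).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a replay filter; hypothetical, not the HBase implementation. */
final class ReplaySkipSketch {
    /**
     * Returns the sequence ids of the edits that would be replayed:
     * only those newer than the region's current sequence id.
     */
    static List<Long> replayable(List<Long> editSeqIds, long regionSeqId) {
        List<Long> applied = new ArrayList<Long>();
        for (long seqId : editSeqIds) {
            if (seqId > regionSeqId) {
                applied.add(seqId);
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // Edits 5 and 6 were written between step 3 and step 4 on the first server.
        // The region then moved and its sequence id advanced to 10 on the new server.
        List<Long> edits = Arrays.asList(5L, 6L, 12L);
        System.out.println(replayable(edits, 10L));
    }
}
```

In this model the edits written on the first regionserver are silently dropped once the region's sequence id on the second regionserver has moved past them.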





[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164414#comment-13164414
 ] 

Hadoop QA commented on HBASE-4971:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506438/4971_all.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/461//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/461//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/461//console

This message is automatically generated.





[jira] [Commented] (HBASE-4971) Useless sleeps in TestTimestampsFilter and TestMultipleTimestamps

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164416#comment-13164416
 ] 

nkeywal commented on HBASE-4971:


These 3 tests are not impacted by my change. They're likely to be broken on 
trunk as well.
IMHO, the patch is OK.





[jira] [Created] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Created) (JIRA)
Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
--

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


There are several issues that have been committed in the 0.90 branch but are 
not in the trunk/0.92 branch.   These regressions should be forward ported.

HBASE-3320  ! 
HBASE-3380  ! - HBASE-4610 is a jira to backport this, but it is not done.
HBASE-3410  ! 
HBASE-3501  !
HBASE-3714  ! 
HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 branch.
HBASE-3848  !
HBASE-3892  ! * Comments say trunk does not need.
HBASE-3906  !
HBASE-3989  !
HBASE-4109  !
HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
0.90 or 0.92
HBASE-4423  ! 






[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164419#comment-13164419
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

Good news is that most of these patches are small.





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164427#comment-13164427
 ] 

nkeywal commented on HBASE-4965:


Here are the possible leaks. I am going to fix some of them in a separate patch. 
Leaks in SmallTests are critical, because the JVM is reused for multiple 
tests.

This one should be studied:
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 
298), 913 file descriptors (was 488).  -thread leak?-  -file handle leak?- 

As the limit on hadoop-QA is 1024 open file descriptors, this test is not far 
from hitting the limit, especially if another test is run after this one.


avro.TestAvroServer#testTableAdminAndMetadata: 140 threads (was 130), 255 file 
descriptors (was 253).  -thread leak?-  -file handle leak?- 
avro.TestAvroServer#testFamilyAdminAndMetadata: 144 threads (was 140), 255 file 
descriptors (was 255).  -thread leak?- 
avro.TestAvroServer#testDML: 146 threads (was 144), 255 file descriptors (was 
255).  -thread leak?- 
catalog.TestCatalogTrackerOnCluster#testBadOriginalRootLocation: 23 threads 
(was 4), 127 file descriptors (was 70).  -thread leak?-  -file handle leak?- 
catalog.TestCatalogTracker#testThatIfMETAMovesWeAreNotified: 9 threads (was 8), 
84 file descriptors (was 79).  -thread leak?-  -file handle leak?- 
catalog.TestCatalogTracker#testInterruptWaitOnMetaAndRoot: 10 threads (was 9), 
86 file descriptors (was 84).  -file handle leak?- 
catalog.TestCatalogTracker#testVerifyRootRegionLocationFails: 11 threads (was 
9), 89 file descriptors (was 85).  -thread leak?-  -file handle leak?- 
catalog.TestMetaReaderEditorNoCluster#testRideOverServerNotRunning: 7 threads 
(was 4), 85 file descriptors (was 70).  -thread leak?-  -file handle leak?- 
catalog.TestMetaReaderEditor#testGetRegionsCatalogTables: 190 threads (was 
185), 360 file descriptors (was 354).  -thread leak?-  -file handle leak?- 
catalog.TestMetaReaderEditor#testTableExists: 191 threads (was 187), 365 file 
descriptors (was 360).  -thread leak?-  -file handle leak?- 
catalog.TestMetaReaderEditor#testGetRegion: 193 threads (was 191), 370 file 
descriptors (was 365).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable: 254 threads (was 
246), 423 file descriptors (was 417).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testDisableAndEnableTable: 273 threads (was 254), 452 file 
descriptors (was 423).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testDisableAndEnableTables: 294 threads (was 272), 482 file 
descriptors (was 452).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCreateTable: 294 threads (was 294), 491 file descriptors 
(was 482).  -file handle leak?- 
client.TestAdmin#testOnlineChangeTableSchema: 295 threads (was 294), 494 file 
descriptors (was 491).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCreateTableWithRegions: 296 threads (was 294), 490 file 
descriptors (was 490).  -thread leak?- 
client.TestAdmin#testTableExist: 297 threads (was 296), 494 file descriptors 
(was 490).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testForceSplit: 303 threads (was 297), 487 file descriptors 
(was 494).  -thread leak?- 
client.TestAdmin#testForceSplitMultiFamily: 309 threads (was 293), 499 file 
descriptors (was 464).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testEnableDisableAddColumnDeleteColumn: 312 threads (was 309), 
505 file descriptors (was 499).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCreateBadTables: 313 threads (was 312), 507 file 
descriptors (was 505).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCreateTableRPCTimeOut: 312 threads (was 313), 526 file 
descriptors (was 507).  -file handle leak?- 
client.TestAdmin#testReadOnlyTable: 314 threads (was 312), 530 file descriptors 
(was 526).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCloseRegionThatFetchesTheHRIFromMeta: 315 threads (was 
312), 513 file descriptors (was 507).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testGetTableRegions: 309 threads (was 308), 512 file 
descriptors (was 499).  -thread leak?-  -file handle leak?- 
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 
298), 913 file descriptors (was 488).  -thread leak?-  -file handle leak?- 
client.TestFromClientSide#testKeepDeletedCells: 261 threads (was 246), 437 file 
descriptors (was 414).  -thread leak?-  -file handle leak?- 
client.TestFromClientSide#testRegionCacheDeSerialization: 276 threads (was 
261), 485 file descriptors (was 437).  -thread leak?-  -file handle leak?- 
client.TestFromClientSide#testRegionCachePreWarm: 277 threads (was 276), 488 
file descriptors (was 485).  -thread leak?-  -file handle leak?- 
client.TestFromClientSide#testWeirdCacheBehaviour: 285 threads (was 277), 500 
file descriptors (was 488).  -thread leak?-  -file handle leak?- 
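
For reference, a minimal sketch of how figures like the ones above could be collected around a test. This is not the patch itself: the class name ResourceSnapshotSketch and its methods are made up for illustration, and it assumes a JVM that exposes com.sun.management.UnixOperatingSystemMXBean (otherwise the file descriptor count is reported as -1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not HBase code: captures thread and file descriptor counts.
class ResourceSnapshotSketch {
  final int threads;
  final long openFds; // -1 when the JVM does not expose the count

  ResourceSnapshotSketch(int threads, long openFds) {
    this.threads = threads;
    this.openFds = openFds;
  }

  // Takes a snapshot of the current JVM.
  static ResourceSnapshotSketch take() {
    int threads = ManagementFactory.getThreadMXBean().getThreadCount();
    long fds = -1;
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
          .getOpenFileDescriptorCount();
    }
    return new ResourceSnapshotSketch(threads, fds);
  }

  // Formats a before/after pair in the same shape as the list above.
  static String report(String test, ResourceSnapshotSketch before,
      ResourceSnapshotSketch after) {
    StringBuilder sb = new StringBuilder(test).append(": ")
        .append(after.threads).append(" threads (was ")
        .append(before.threads).append("), ")
        .append(after.openFds).append(" file descriptors (was ")
        .append(before.openFds).append(").");
    if (after.threads > before.threads) {
      sb.append("  -thread leak?- ");
    }
    if (after.openFds > before.openFds) {
      sb.append("  -file handle leak?- ");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    ResourceSnapshotSketch before = take();
    // ... run the test method here ...
    ResourceSnapshotSketch after = take();
    System.out.println(report("example.Test#method", before, after));
  }
}
```

A growing count is only a hint, hence the question marks: threads and descriptors may also be released asynchronously after the test returns.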

[jira] [Commented] (HBASE-2675) Quick smoke tests testsuite

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164430#comment-13164430
 ] 

nkeywal commented on HBASE-2675:


'mvn test -P runSmallTests' runs about 400 tests out of the 1200 tests today, 
in about 3 minutes. Is this ok for you Benoit?


 Quick smoke tests testsuite
 -

 Key: HBASE-2675
 URL: https://issues.apache.org/jira/browse/HBASE-2675
 Project: HBase
  Issue Type: Test
Reporter: Benoit Sigoure
Assignee: nkeywal
Priority: Minor

 It would be nice if there was a known subset of the tests that run fast (e.g. 
 not more than a few seconds) and quickly help us check whether the code isn't 
 horribly broken.  This way one could run those tests at a frequent interval 
 when iterating and only run the entire testsuite at the end, when they think 
 they're done, since doing so is very time consuming.
 Someone would need to identify which tests really focus on the core 
 functionality and add a target in the build system to just run those tests.  
 As a bonus, it would be awesome++ if the core tests ran, say, 10x faster than 
 they currently do.  There's a lot of sleep-based synchronization in the 
 tests and it would be nice to remove some of that where possible to make the 
 tests run as fast as the machine can handle them.





[jira] [Updated] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4972:
--

Description: 
There are several issues that have been committed in the 0.90 branch but were 
not in trunk/0.92 branch.   These regressions should be forward ported.

HBASE-3320  ! 
HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
HBASE-3410  ! 
HBASE-3501  !
HBASE-3714  ! 
HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 branch.
HBASE-3848  !
HBASE-3892  ! * Comments say trunk does not need.
HBASE-3906  !
HBASE-3989  !
HBASE-4109  !
HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
0.90 or 0.92
HBASE-4423  ! 


  was:
There are several issues that have been committed in the 0.90 branch but were 
not in trunk/0.92 branch.   These regressions should be forward ported.

HBASE-3320  ! 
HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
HBASE-3410  ! 
HBASE-3501  !
HBASE-3714  ! 
HBASE-3729  !! Maked in 0.92 but not committed there, committed in 0.90 branch.
HBASE-3848  !
HBASE-3892  ! * Comments say trunk does not need.
HBASE-3906  !
HBASE-3989  !
HBASE-4109  !
HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
0.90 or 0.92
HBASE-4423  ! 



 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Created] (HBASE-4973) On failure, HBaseAdmin sleeps one time too many

2011-12-07 Thread nkeywal (Created) (JIRA)
On failure, HBaseAdmin sleeps one time too many
---

 Key: HBASE-4973
 URL: https://issues.apache.org/jira/browse/HBASE-4973
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


In this code the last sleep is useless, as we're not retrying afterwards. This 
can slow down failure scenarios by a few seconds (up to 32 seconds).

{noformat}
  public HBaseAdmin(Configuration c)
  throws MasterNotRunningException, ZooKeeperConnectionException {
    this.conf = HBaseConfiguration.create(c);
    this.connection = HConnectionManager.getConnection(this.conf);
    this.pause = this.conf.getLong("hbase.client.pause", 1000);
    this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
    this.retryLongerMultiplier = this.conf.getInt(
        "hbase.client.retries.longer.multiplier", 10);
    int tries = 0;
    for (; tries < numRetries; ++tries) {
      try {
        this.connection.getMaster();
        break;
      } catch (MasterNotRunningException mnre) {
        HConnectionManager.deleteStaleConnection(this.connection);
        this.connection = HConnectionManager.getConnection(this.conf);
      } catch (UndeclaredThrowableException ute) {
        HConnectionManager.deleteStaleConnection(this.connection);
        this.connection = HConnectionManager.getConnection(this.conf);
      }
      try { // Sleep
        Thread.sleep(getPauseTime(tries));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        // we should delete connection between client and zookeeper
        HConnectionManager.deleteStaleConnection(this.connection);
        throw new MasterNotRunningException("Interrupted");
      }
    }
    if (tries >= numRetries) {
      // we should delete connection between client and zookeeper
      HConnectionManager.deleteStaleConnection(this.connection);
      throw new MasterNotRunningException("Retried " + numRetries + " times");
    }
  }
{noformat}
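
A minimal sketch of one possible way to avoid the extra sleep, assuming the pause is simply skipped once no attempt remains. RetrySketch, retry() and the constant pause are illustrative simplifications, not HBaseAdmin code (the loop above uses getPauseTime(tries) and reconnects on failure).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

// Hypothetical helper, not HBase code: retries an action and pauses only
// when another attempt will follow.
class RetrySketch {

  // Returns true as soon as an attempt succeeds, false after numRetries failures.
  static boolean retry(int numRetries, BooleanSupplier attempt,
      LongConsumer sleeper, long pauseMs) {
    for (int tries = 0; tries < numRetries; ++tries) {
      if (attempt.getAsBoolean()) {
        return true;
      }
      // Skip the pause after the last failed attempt: nothing is retried after it.
      if (tries < numRetries - 1) {
        sleeper.accept(pauseMs);
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Long> sleeps = new ArrayList<>();
    // The sleeper only records the requested pauses so the demo runs instantly.
    boolean ok = retry(3, () -> false, ms -> sleeps.add(ms), 1000L);
    System.out.println(ok + " after " + sleeps.size() + " pauses");
  }
}
```

With three failing attempts the sketch pauses twice rather than three times, which is the saving described above.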





[jira] [Created] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Created) (JIRA)
Remove some resources leaks on the tests


 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164501#comment-13164501
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

* HBASE-3848 This work has been idle since Jun/11
* HBASE-3892 Comments say trunk doesn't need, but no test case so can't verify 
without effort. Seems to have significant differences between 0.90 and 0.92.  
* HBASE-3906 Comments say doesn't make sense on trunk.
* HBASE-3989 Comments say not needed on trunk
* HBASE-4109 Comments say not needed on trunk
* HBASE-4160 Patch and commit present but does not contain name HBASE-4160.
* HBASE-4423 Contained in 0.92's HBASE-4238 


 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4974:
---

Status: Patch Available  (was: Open)

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch


 Cf. title and HBASE-4965





[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4974:
---

Attachment: 4974_all.patch

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch


 Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164512#comment-13164512
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

So two issues remain:
* HBASE-4610, which is explicitly a forward porting issue. 
* HBASE-3848, which is open -- currently with a commit on 0.90 branch but not 
trunk/0.92.  Maybe this should be closed on 0.90 and a new forward porting 
issue should be created?

The other issues are basically non-issues code-wise: 
* subsequent patches picked up the fix.
* patch is not relevant to 0.92/trunk branches. (would be nice to have this in 
title).
* typos in commit messages.  

 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Updated] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4972:
--

Issue Type: Task  (was: Bug)

 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backports this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164581#comment-13164581
 ] 

Hadoop QA commented on HBASE-4974:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506481/4974_all.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/462//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/462//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/462//console

This message is automatically generated.

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch


 Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164580#comment-13164580
 ] 

Zhihong Yu commented on HBASE-4970:
---

Patch v2 is a backport and doesn't change keepAliveTime.
I feel we should address the needs of HTable users.

I am fine with the backport - we may want to modify the title of this JIRA 
accordingly.

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5

 Attachments: HBASE-4970_Branch90.patch, 
 HBASE-4970_Branch90_V1_trial.patch


 In my cluster, I changed keepAliveTime from 60 s to 3600 s.  The growth of 
 RES slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our 
 problem [RES value increase]?
 You can go through the source of sun.nio.ch.Util. Every thread holds 3 soft 
 references to direct buffers (mustang src) for reuse. The code calls these 3 
 soft references the buffer cache. If the buffers are all occupied, or none is 
 of a suitable size, and a new request comes in, a new direct buffer is 
 allocated. After the request is served, the bigger buffer replaces the 
 smaller one in the buffer cache. The replaced buffer is released.
 So I think we can add a parameter to change keepAliveTime of Htable thread 
 pool.
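
 A minimal sketch of what a configurable keep-alive could look like, using only java.util.concurrent. KeepAliveSketch and newPool() are made-up names, and where the value would come from (for example a new configuration key) is left open; this is not the attached patch.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not HBase code: a pool whose idle threads are kept
// alive for a caller-supplied time instead of a hard-coded 60 seconds.
class KeepAliveSketch {

  static ThreadPoolExecutor newPool(int maxThreads, long keepAliveSeconds) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
    // Lets the core thread time out too; requires keepAliveSeconds > 0,
    // otherwise this call throws IllegalArgumentException.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = newPool(4, 3600L);
    System.out.println(pool.getKeepAliveTime(TimeUnit.SECONDS));
    pool.shutdown();
  }
}
```

 A longer keep-alive means fewer threads are torn down and recreated, so fewer per-thread buffer caches are rebuilt; the trade-off is that idle threads and their cached buffers stay around longer.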





[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164583#comment-13164583
 ] 

nkeywal commented on HBASE-4974:


The 3 tests fail on trunk as well. However, it means that the large tests have 
not been tested, and I have some strange errors on these ones locally...

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch


 Cf. title and HBASE-4965





[jira] [Reopened] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Ted Yu (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-4927:
---


As we can see from the report here:
https://issues.apache.org/jira/browse/HBASE-4927?focusedCommentId=13163785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13163785

there were 3 failed tests.
These test failures rippled through all 0.92 builds.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this 
 incorrect order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)
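
 A minimal sketch of the wanted ordering, assuming keys are plain strings and an empty end key means "end of table". Range and PARENT_FIRST are hypothetical stand-ins, not the HRegionInfo or CatalogJanitor code touched by the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the comparator discussed above, not HBase code.
class SplitParentFirstSketch {

  // A key range [start, end); an empty end means the last region of the table.
  static final class Range {
    final String start;
    final String end;

    Range(String start, String end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "[" + start + "," + end + ")";
    }
  }

  // Sorts by start key, then puts the wider range (the split parent) first.
  static final Comparator<Range> PARENT_FIRST = (a, b) -> {
    int c = a.start.compareTo(b.start);
    if (c != 0) {
      return c;
    }
    boolean aLast = a.end.isEmpty();
    boolean bLast = b.end.isEmpty();
    if (aLast || bLast) {
      if (aLast && bLast) {
        return 0;
      }
      // An empty end key is treated as the largest possible end key.
      return aLast ? -1 : 1;
    }
    // Otherwise the larger end key (wider range) sorts first.
    return b.end.compareTo(a.end);
  };

  static List<Range> sorted(Range... ranges) {
    List<Range> out = new ArrayList<>(Arrays.asList(ranges));
    out.sort(PARENT_FIRST);
    return out;
  }

  public static void main(String[] args) {
    // prints [[A,), [A,B), [B,)]
    System.out.println(sorted(
        new Range("A", "B"), new Range("B", ""), new Range("A", "")));
  }
}
```

 The point of the special case is that a plain byte-wise comparison treats the empty end key as the smallest value, which is exactly what puts [A,B) ahead of its parent [A,).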





[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164599#comment-13164599
 ] 

Ted Yu commented on HBASE-4224:
---

@Akash:
Do you have a newer patch ?
If so, please upload to this JIRA.

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224.patch


 This evening I needed to clean out logs on the cluster.  Logs are by 
 regionserver.  To let go of logs, we need to have all edits emptied from 
 memory.  The only flush is by table or region.  We need to be able to flush 
 the regionserver.  Need to add this.





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164605#comment-13164605
 ] 

Zhihong Yu commented on HBASE-4880:
---

The three test failures would be fixed by addendum to HBASE-4927.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after it, closed = false, closing = false)
 2. addToOnlineRegions(region)
 3. update .meta. table
 4. update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4.
 This means a client could put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and
 the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be
 lost, because the region has been opened on another regionserver and has
 received new data. Therefore, it may not be recovered through
 replayRecoveredEdit(), because the edit's LogSeqId is smaller than the current
 region SeqId.





[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4880:
--

Affects Version/s: 0.94.0
   0.19.2
  Summary: Region is on service before openRegionHandler completes, 
may cause data loss  (was: Region is on service before completing 
openRegionHanlder, may cause data loss)

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.19.2, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after it, closed = false, closing = false)
 2. addToOnlineRegions(region)
 3. update .meta. table
 4. update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4.
 This means a client could put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, and
 the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be
 lost, because the region has been opened on another regionserver and has
 received new data. Therefore, it may not be recovered through
 replayRecoveredEdit(), because the edit's LogSeqId is smaller than the current
 region SeqId.





[jira] [Created] (HBASE-4975) fix spurious -1's from Hadoop QA

2011-12-07 Thread Eugene Koontz (Created) (JIRA)
fix spurious -1's from Hadoop QA


 Key: HBASE-4975
 URL: https://issues.apache.org/jira/browse/HBASE-4975
 Project: HBase
  Issue Type: Bug
  Components: build
Reporter: Eugene Koontz
Priority: Minor


Hadoop QA generates comments based on patches submitted to JIRAs; for example:

https://issues.apache.org/jira/browse/HBASE-4960?focusedCommentId=13163191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13163191

There are some spurious -1's given to the patch. The patch only affects 
documentation, not source code, but Hadoop QA says that:
{noformat}
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9)
warnings.
{noformat}

Evidently Hadoop QA is not able to recall the set of Findbugs warnings from the 
previous build.

(Of course the Findbugs warnings themselves should be addressed, but this patch 
could not have added to them).

{noformat}
-1 javadoc. The javadoc tool appears to have generated -160 warning
messages.
{noformat}

This should be 160 warning messages, not -160 warning messages.

Thanks to NKeywal for suggesting that the relevant file is 
{{dev-support/test-patch.sh}}.






[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4880:
--

Affects Version/s: (was: 0.19.2)
   0.92.0

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openRegion() (after which closed = false, closing = false)
 2. addToOnlineRegions(region)
 3. update the .meta. table
 4. update the ZK node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes.
 This means a client can put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the 
 region, and the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. That data may not be recoverable through replayRecoveredEdit() 
 because the edits' LogSeqId is smaller than the current region SeqId.
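
The window described above can be illustrated with a small stand-alone sketch. This is not HBase code: OpenRegionWindowSketch, its onlineRegions set and its put method are hypothetical stand-ins, and step 4 is reduced to a boolean. It only shows that a write is accepted as soon as the region is in the online set, before the ZK update is known to have succeeded.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the open sequence; none of these names are HBase APIs.
final class OpenRegionWindowSketch {
    final Set<String> onlineRegions = new HashSet<>();
    final List<String> acceptedPuts = new ArrayList<>();

    /** A client write is accepted only while the region is in the online set. */
    boolean put(String region, String row) {
        if (!onlineRegions.contains(region)) {
            return false;
        }
        acceptedPuts.add(region + "/" + row);
        return true;
    }

    /**
     * Walks through steps 1-4 in the order listed in the description.
     * zkUpdateSucceeds models the outcome of step 4.
     * Returns whether a client put arriving after step 3 was accepted.
     */
    boolean openRegion(String region, boolean zkUpdateSucceeds) {
        // Step 1: open the region (omitted in this model).
        onlineRegions.add(region);                  // Step 2: region is now in service.
        // Step 3: update the catalog table (omitted in this model).
        boolean accepted = put(region, "row-1");    // Client write inside the window.
        if (!zkUpdateSucceeds) {                    // Step 4 failed.
            onlineRegions.remove(region);           // Stand-in for cleanupFailedOpen.
        }
        return accepted;
    }

    public static void main(String[] args) {
        OpenRegionWindowSketch sketch = new OpenRegionWindowSketch();
        boolean accepted = sketch.openRegion("region-1", false);
        System.out.println("put accepted before step 4: " + accepted);
        System.out.println("puts at risk: " + sketch.acceptedPuts);
    }
}
```

In this model the put is accepted even though step 4 fails afterwards, which is the situation the description is concerned about.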





[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-12-07 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164613#comment-13164613
 ] 

Jean-Daniel Cryans commented on HBASE-3857:
---

Any reason why this patch is removing compaction and flush queue sizes?

{code}
-this.metrics.compactionQueueSize.set(compactSplitThread
-.getCompactionQueueSize());
-this.metrics.flushQueueSize.set(cacheFlusher
-.getFlushQueueSize());
{code}

If it was intentional, there's a bunch of dead code that also needs to be 
removed like those methods that were called. If it wasn't, meaning there's 
currently no way in 0.92 to get the compaction queue size, then this would be 
sufficient for me to kill the RC.

 Change the HFile Format
 ---

 Key: HBASE-3857
 URL: https://issues.apache.org/jira/browse/HBASE-3857
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.90.4
Reporter: Liyin Tang
Assignee: Mikhail Bautin
 Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 
 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 
 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, 
 hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
 hfile_format_v2_design_draft_0.4.odt


 In order to support HBASE-3763 and HBASE-3856, we need to change the format 
 of the HFile. The new format proposal is attached here. Thanks to Mikhail 
 Bautin for the documentation. 





[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164618#comment-13164618
 ] 

Zhihong Yu commented on HBASE-3857:
---

@J-D:
Nice catch.

We should open another JIRA to deal with queue sizes.

 Change the HFile Format
 ---

 Key: HBASE-3857
 URL: https://issues.apache.org/jira/browse/HBASE-3857
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.90.4
Reporter: Liyin Tang
Assignee: Mikhail Bautin
 Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 
 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 
 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, 
 hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
 hfile_format_v2_design_draft_0.4.odt


 In order to support HBASE-3763 and HBASE-3856, we need to change the format 
 of the HFile. The new format proposal is attached here. Thanks to Mikhail 
 Bautin for the documentation. 





[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-12-07 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164619#comment-13164619
 ] 

Jean-Daniel Cryans commented on HBASE-3857:
---

Yeah I just want to verify first what's the situation, maybe I'm missing 
something.

 Change the HFile Format
 ---

 Key: HBASE-3857
 URL: https://issues.apache.org/jira/browse/HBASE-3857
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.90.4
Reporter: Liyin Tang
Assignee: Mikhail Bautin
 Attachments: 0001-Adding-release-notes-for-HBASE-3857.patch, 
 0001-Fix-TestHFileBlock.testBlockHeapSize.patch, 
 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_02_19_4.patch, 
 0001-review_hfile-v2-r1153300-git-1152532-2011_08_03_12_4.patch, 
 hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
 hfile_format_v2_design_draft_0.4.odt


 In order to support HBASE-3763 and HBASE-3856, we need to change the format 
 of the HFile. The new format proposal is attached here. Thanks to Mikhail 
 Bautin for the documentation. 





[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-07 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164620#comment-13164620
 ] 

Todd Lipcon commented on HBASE-4938:


Dhruba, can you clarify a little more what the purpose of this change is? I 
didn't quite understand what you meant by "We have some internal HRegion API 
that needs to scan based on a external readPoint". Do you have some other 
non-HBase software which is using HBase's storage engine components?

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor

 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)
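
One way to read the proposal, assuming readPoint is an MVCC-style write number, is sketched below. Nothing here is HBase code: ReadPointScanSketch, Cell and writeNumber are hypothetical names, and the store is a plain in-memory list. The sketch only shows how a scan overload taking an explicit read point could hide cells written after that point, while the existing single-argument form keeps reading at the latest point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; not the HRegion implementation.
final class ReadPointScanSketch {
    /** A stored value tagged with the write number that produced it. */
    static final class Cell {
        final String row;
        final String value;
        final long writeNumber;

        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private long nextWriteNumber = 1;

    /** Stores a cell and returns the write number assigned to it. */
    long put(String row, String value) {
        long writeNumber = nextWriteNumber++;
        cells.add(new Cell(row, value, writeNumber));
        return writeNumber;
    }

    /** Analogue of the existing call: scan at the latest read point. */
    List<String> scan() {
        return scan(nextWriteNumber - 1);
    }

    /** Analogue of the proposed overload: only cells at or below readPoint are visible. */
    List<String> scan(long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.writeNumber <= readPoint) {
                visible.add(cell.row + "=" + cell.value);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        ReadPointScanSketch region = new ReadPointScanSketch();
        long afterFirstPut = region.put("a", "1");
        region.put("b", "2");
        System.out.println(region.scan(afterFirstPut)); // only the first cell is visible
        System.out.println(region.scan());              // both cells are visible
    }
}
```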





[jira] [Created] (HBASE-4976) Add compaction/flush queue size metrics mistakenly removed by HFile v2

2011-12-07 Thread Mikhail Bautin (Created) (JIRA)
Add compaction/flush queue size metrics mistakenly removed by HFile v2
--

 Key: HBASE-4976
 URL: https://issues.apache.org/jira/browse/HBASE-4976
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin








[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Attachment: 4610.txt

Jonathan's patch from HBASE-3380, rebased for TRUNK.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Status: Patch Available  (was: Open)

Patch testing.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Resolved] (HBASE-3848) request is always zero in WebUI for region server

2011-12-07 Thread Ted Yu (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-3848.
---

Resolution: Fixed

The remaining work would be completed by HBASE-4977

 request is always zero in WebUI for region server
 -

 Key: HBASE-3848
 URL: https://issues.apache.org/jira/browse/HBASE-3848
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.2
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Minor
 Attachments: RegionServer_90PatchV2.patch, 
 RegionseverMetric_TrunkPathV2.patch


 request is always zero in WebUI for region server
 
  Metrics request=0.0, regions=36, stores=36, storefiles=148, 
  storefileIndexSize=29, memstoreSize=253, compactionQueueSize=24, 
  flushQueueSize=0, usedHeap=655, maxHeap=8175, blockCacheSize=14230920, 
  blockCacheFree=1700269560, blockCacheCount=21, 
  blockCacheHitCount=2887, blockCacheMissCount=204829, 
  blockCacheEvictedCount=0, blockCacheHitRatio=1, 
  blockCacheHitCachingRatio=99
 
  requests is not zero in WebUI for Hmaster requests=15000, regions=35, 
  usedHeap=513, maxHeap=8175
 
  Is there any difference between these metrics?
  How do I use it?
  Thanks.
 
 





[jira] [Created] (HBASE-4977) Forward port HBASE-3848 to 0.92 and TRUNK

2011-12-07 Thread Ted Yu (Created) (JIRA)
Forward port HBASE-3848 to 0.92 and TRUNK
-

 Key: HBASE-4977
 URL: https://issues.apache.org/jira/browse/HBASE-4977
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu


HBASE-3848, "request is always zero in WebUI for region server", was integrated 
to 0.90.

This JIRA is a forward port.





[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2011-12-07 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164654#comment-13164654
 ] 

Jesse Yates commented on HBASE-4336:


Started working on this. I have a fork up on github with hbase split into 
multiple modules (https://github.com/jyates/hbase) - a patch is just too 
massive to reasonably look at. Currently, the fork compiles and tests. 
Packaging is coming next.

How do we want to bundle each of the pieces? I was thinking of having jars for 
core, core-tests (so the minicluster can be used across modules), security, 
server and test.

The test module will be where we have the 'api level tests' discussed on dev@ 
recently. These are things that are run against a cluster and just test the 
interfaces. Here is where we would use failsafe to spin up a minicluster for 
local testing or connect out to a real cluster (all of this would be in follow-on 
JIRA(s)). It's _not_ intended as the place to put all the tests.

The assemble module would then combine all of these into a tar, rpm, etc. as 
needed.

Profiles would necessarily be split across multiple modules, as each module will 
require different things and I don't want to add the same dependency 
multiple times in different modules. This works nicely with Gary's original 
comment about just having the secure hadoop stuff in the security module 
(which translates to having the profile just in that module). The alternative would 
be to exclude certain dependencies in modules that don't need them, amounting 
to about the same amount of work across modules, but harder to reason about.

Feedback is appreciated.

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.94.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.





[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164658#comment-13164658
 ] 

Ted Yu commented on HBASE-4729:
---

The HadoopQA report @ 
https://builds.apache.org/job/PreCommit-HBASE-Build/405//testReport/ showed 
basically no tests were run.

A manual test suite execution should have been performed.

 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
 at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
 at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
 at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
 at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
 at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
 at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.
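
The clash in the logs above comes down to two actors trying to create the same node. The sketch below models that with a plain map and is not the fix attached to this issue: ZnodeClashSketch, createNode and unassign are hypothetical names, and a false return value stands in for ZooKeeper's NodeExistsException. It shows one possible reaction, assumed here purely for illustration, where the unassign path reports the existing state instead of aborting.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of /hbase/unassigned; not the ZKAssign implementation.
final class ZnodeClashSketch {
    private final ConcurrentMap<String, String> unassigned = new ConcurrentHashMap<>();

    /** Creates the node only if absent; false plays the role of NodeExistsException. */
    boolean createNode(String encodedRegion, String state) {
        return unassigned.putIfAbsent(encodedRegion, state) == null;
    }

    /** Returns the state currently stored for the region, or null if there is no node. */
    String stateOf(String encodedRegion) {
        return unassigned.get(encodedRegion);
    }

    /** Tolerant unassign (an assumption of this sketch): skip the region if a node already exists. */
    String unassign(String encodedRegion) {
        if (createNode(encodedRegion, "CLOSING")) {
            return "CLOSING";
        }
        return "SKIPPED:" + stateOf(encodedRegion);
    }

    public static void main(String[] args) {
        ZnodeClashSketch zk = new ZnodeClashSketch();
        String region = "f7e1783e65ea8d621a4bc96ad310f101";
        // The region server starts a split first, as in the second log excerpt.
        zk.createNode(region, "RS_ZK_REGION_SPLITTING");
        // The master then tries to unassign the same region.
        System.out.println(zk.unassign(region));
    }
}
```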





[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4927:
---

Attachment: 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch
0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)
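
The ordering wanted above can be sketched with a small comparator. This is an illustration under stated assumptions, not the patch attached to this issue: Range is a hypothetical stand-in for a region's key range with String keys, and an empty end key is taken to mean "to the end of the table". Ranges are ordered by start key and, on a tie, the wider range (the parent) sorts first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; Range and PARENT_FIRST are not HBase classes.
final class SplitParentFirstSketch {
    /** Key range [start, end); an empty end means the range is open-ended. */
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Orders by start key; on equal start keys the range with the larger end comes first. */
    static final Comparator<Range> PARENT_FIRST = (left, right) -> {
        int byStart = left.start.compareTo(right.start);
        if (byStart != 0) {
            return byStart;
        }
        boolean leftOpen = left.end.isEmpty();
        boolean rightOpen = right.end.isEmpty();
        if (leftOpen && rightOpen) {
            return 0;
        }
        if (leftOpen) {
            return -1; // An open-ended range is the widest, so it sorts first.
        }
        if (rightOpen) {
            return 1;
        }
        return right.end.compareTo(left.end); // Larger end key first.
    };

    /** Sorts the given ranges and returns their string forms in order. */
    static List<String> sorted(Range... ranges) {
        List<Range> list = new ArrayList<>(Arrays.asList(ranges));
        list.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (Range range : list) {
            out.add(range.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Parent [A,), left daughter [A,B), right daughter [B,)
        System.out.println(sorted(new Range("A", "B"), new Range("B", ""), new Range("A", "")));
        // prints [[A,), [A,B), [B,)]
    }
}
```

Comparing the end keys as plain byte strings would place the empty end key before "B", which is the incorrect order shown in the description; treating empty as the largest end key is what restores the parent-first order in this sketch.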





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164682#comment-13164682
 ] 

Hadoop QA commented on HBASE-4610:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506499/4610.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/463//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/463//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/463//console

This message is automatically generated.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164686#comment-13164686
 ] 

Jonathan Hsieh commented on HBASE-4927:
---

The version initially committed with this patch made HRegionInfo's comparator 
declare region ['','') smaller than ['','A'). Previously it was the other way 
around.

In the TestOfflineMeta* tests, the disableTable call eventually calls 
AssignmentManager#getRegionsOfTable(table). This returns 3 regions instead of 
4, because it uses a boundary region which has [startkey='', 
endkey=''). The change likely left either the first or the last region out of 
this call.

The core problem is that the definition of greater-than or less-than for 
regions is inconsistent with respect to '' start and end keys.


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)





[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4927:
---

Status: Patch Available  (was: Reopened)

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164695#comment-13164695
 ] 

Ted Yu commented on HBASE-4927:
---

Ran through the previously failing tests:
{code}
 mt -Dtest=TestMasterRestartAfterDisablingTable
 mt -Dtest=TestOfflineMetaRebuildBase#testMetaRebuild
 mt -Dtest=TestOfflineMetaRebuildHole
{code}
They pass now.

Going to commit to 0.92 and TRUNK.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 Consider these split points (an empty end key is the last key, an empty 
 start key is the first key):
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, the comparison falls through to the end keys, which yields this 
 incorrect order:
 [A,B), [A,), [B,) 
 What we wanted:
 [A,), [A,B), [B,)





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164699#comment-13164699
 ] 

Ted Yu commented on HBASE-4927:
---

Also ran through the two tests in original patch:
{code}
 mt -Dtest=TestHRegionInfo
 mt -Dtest=TestCatalogJanitor
{code}
They passed as well.

Integrated to 0.92 and TRUNK.

Thanks for the addendum, Jimmy.

Thanks for the help, Jonathan.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 Consider these split points (an empty end key is the last key, an empty 
 start key is the first key):
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, the comparison falls through to the end keys, which yields this 
 incorrect order:
 [A,B), [A,), [B,) 
 What we wanted:
 [A,), [A,B), [B,)





[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Attachment: (was: 4610.txt)

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Status: Patch Available  (was: Open)

Patch testing now that HBASE-4927 addendum has been integrated

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4880:
--

Status: Open  (was: Patch Available)

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after this, closed = false and closing = false)
 2. addToOnlineRegions(region)
 3. update the .meta. table
 4. update the ZK node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes: a client can 
 put data to this region once step 3 is done.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen runs and closes the region, and the 
 master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be 
 lost: the region has been opened on another regionserver and has received 
 new data, so the earlier edits may not be recovered through 
 replayRecoveredEdit() because their LogSeqId is smaller than the region's 
 current SeqId.
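
 The last point, that edits with a smaller sequence id are not replayed, can
 be illustrated with a deliberately simplified model. This is a sketch under
 assumptions: ReplaySkipSketch and replayable are hypothetical names, not
 HBase code, and treating an edit whose id equals the region's current id as
 skipped is an assumption as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of replay filtering; not HBase's implementation.
class ReplaySkipSketch {
  // Returns the edit sequence ids that would be replayed. Edits at or below
  // the region's current sequence id are treated as already applied and skipped.
  static List<Long> replayable(List<Long> editSeqIds, long regionSeqId) {
    List<Long> result = new ArrayList<>();
    for (long id : editSeqIds) {
      if (id > regionSeqId) {
        result.add(id);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Edits written on the first regionserver between step 3 and step 4.
    List<Long> strandedEdits = Arrays.asList(101L, 102L);
    // The region has since been opened elsewhere and has taken new writes.
    long seqIdOnNewServer = 250L;
    System.out.println(replayable(strandedEdits, seqIdOnNewServer)); // []
  }
}
```

 In this model the stranded edits are dropped without any error, which is
 why the description treats the failure as potential data loss rather than
 as a recoverable condition.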





[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Attachment: 4610.txt

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4880:
--

Attachment: 4880.txt

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, 
 hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after this, closed = false and closing = false)
 2. addToOnlineRegions(region)
 3. update the .meta. table
 4. update the ZK node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes: a client can 
 put data to this region once step 3 is done.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen runs and closes the region, and the 
 master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be 
 lost: the region has been opened on another regionserver and has received 
 new data, so the earlier edits may not be recovered through 
 replayRecoveredEdit() because their LogSeqId is smaller than the region's 
 current SeqId.





[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4880:
--

Status: Patch Available  (was: Open)

Patch testing again.

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, 
 hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after this, closed = false and closing = false)
 2. addToOnlineRegions(region)
 3. update the .meta. table
 4. update the ZK node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes: a client can 
 put data to this region once step 3 is done.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen runs and closes the region, and the 
 master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be 
 lost: the region has been opened on another regionserver and has received 
 new data, so the earlier edits may not be recovered through 
 replayRecoveredEdit() because their LogSeqId is smaller than the region's 
 current SeqId.





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164733#comment-13164733
 ] 

Hadoop QA commented on HBASE-4927:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12506510/0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/464//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/464//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/464//console

This message is automatically generated.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 Consider these split points (an empty end key is the last key, an empty 
 start key is the first key):
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, the comparison falls through to the end keys, which yields this 
 incorrect order:
 [A,B), [A,), [B,) 
 What we wanted:
 [A,), [A,B), [B,)





[jira] [Updated] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to des

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4946:
--

Attachment: 4946-v4.txt

Patch v4 removes eager instantiation.

 HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
 dynamically loaded coprocessors (from hdfs or local system), because the RPC 
 system tries to deserialize an unknown class. 
 -

 Key: HBASE-4946
 URL: https://issues.apache.org/jira/browse/HBASE-4946
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
 HBASE-4946.patch


 Loading coprocessor jars from HDFS works fine. I load the jar from the shell, 
 after setting the attribute, and it gets loaded:
 {noformat}
 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
 config now ...
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
 com.MyCoprocessorClass needs to be loaded from a file - 
 hdfs://localhost:9000/coproc/rt-  0.0.1-SNAPSHOT.jar.
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
 com.MyCoprocessorClass
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
 RegionEnvironment createEnvironment
 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
 handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
 protocol=com.MyCoprocessorClassProtocol
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
 coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
 {noformat}
 The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
 with a dynamic method. When I call this method from the client with 
 HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
 cannot be deserialized from writables. 
 The cause is that Exec tries to resolve the coprocessor class early. The 
 coprocessor class is loaded, but only in the context of the 
 HRegionServer / HRegion. So, the call fails:
 {noformat}
 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
 Error in readFields
 java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
   at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:247)
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
   ... 10 more
 {noformat}
 Probably the correct way to fix this is to make Exec really smart, so that it 
 knows all the class definitions loaded in CoprocessorHost(s).
 I created a small patch that simply doesn't resolve the class definition in 
 Exec, instead passing it as a string down to the HRegion layer. This layer 
 knows all the definitions, and simply loads the class by name. 
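
 The idea of deferring resolution can be sketched as follows. This is a
 hypothetical illustration, not the attached patch: Call, RegionSide,
 register and resolve are invented names, and the handler is just a
 placeholder object. The point is only that the call carries the protocol
 name as a string, and the lookup happens at the layer that registered the
 handler.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of late, name-based resolution; not HBase code.
class DeferredProtocolLookupSketch {
  // The call carries only the protocol's class name, never a Class object,
  // so reading it off the wire needs no class loading at all.
  static final class Call {
    final String protocolName;
    final String methodName;

    Call(String protocolName, String methodName) {
      this.protocolName = protocolName;
      this.methodName = methodName;
    }
  }

  // Stand-in for the layer that knows which protocols were registered.
  static final class RegionSide {
    private final Map<String, Object> handlersByProtocol = new HashMap<>();

    void register(String protocolName, Object handler) {
      handlersByProtocol.put(protocolName, handler);
    }

    // Resolution happens here, by name, where the handler is actually known.
    Object resolve(Call call) {
      Object handler = handlersByProtocol.get(call.protocolName);
      if (handler == null) {
        throw new IllegalArgumentException(
            "No handler registered for protocol " + call.protocolName);
      }
      return handler;
    }
  }

  public static void main(String[] args) {
    RegionSide region = new RegionSide();
    region.register("com.MyCoprocessorClassProtocol", "placeholder handler");
    Call call = new Call("com.MyCoprocessorClassProtocol", "someMethod");
    System.out.println(region.resolve(call));
  }
}
```

 The trade-off is that an unknown protocol is only detected when the call
 reaches the region, not while the request is being deserialized.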


[jira] [Updated] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to des

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4946:
--

Status: Open  (was: Patch Available)

 HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
 dynamically loaded coprocessors (from hdfs or local system), because the RPC 
 system tries to deserialize an unknown class. 
 -

 Key: HBASE-4946
 URL: https://issues.apache.org/jira/browse/HBASE-4946
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
 HBASE-4946.patch


 Loading coprocessor jars from HDFS works fine. I load the jar from the shell, 
 after setting the attribute, and it gets loaded:
 {noformat}
 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
 config now ...
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
 com.MyCoprocessorClass needs to be loaded from a file - 
 hdfs://localhost:9000/coproc/rt-  0.0.1-SNAPSHOT.jar.
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
 com.MyCoprocessorClass
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
 RegionEnvironment createEnvironment
 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
 handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
 protocol=com.MyCoprocessorClassProtocol
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
 coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
 {noformat}
 The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
 with a dynamic method. When I call this method from the client with 
 HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
 cannot be deserialized from writables. 
 The cause is that Exec tries to resolve the coprocessor class early. The 
 coprocessor class is loaded, but only in the context of the 
 HRegionServer / HRegion. So, the call fails:
 {noformat}
 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
 Error in readFields
 java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
   at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:247)
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
   ... 10 more
 {noformat}
 Probably the correct way to fix this is to make Exec really smart, so that it 
 knows all the class definitions loaded in CoprocessorHost(s).
 I created a small patch that simply doesn't resolve the class definition in 
 Exec, instead passing it as a string down to the HRegion layer. This layer 
 knows all the definitions, and simply loads the class by name. 





[jira] [Updated] (HBASE-4712) Document rules for writing tests

2011-12-07 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4712:
-

Attachment: test-doc-cleanup.txt

Corrections from Jesse Yates

 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4712.txt, test-doc-cleanup.txt


 We saw that some tests could be improved. Documenting the general rules could 
 help.
 Proposal:
 HBase tests are divided into three categories: small, medium and large, with 
 corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
 Small tests are executed in parallel in a shared JVM. They must last less 
 than 15 seconds. They must NOT use a cluster.
 Medium tests are executed in a separate JVM. They must last less than 50 
 seconds. They can use a cluster. They must not fail occasionally.
 Small and medium tests must not need more than 30 minutes to run altogether.
 Small and medium tests should be executed by the developers before submitting 
 a patch.
 Large tests are everything else. They are typically integration tests, 
 non-regression tests for specific bugs, timeout tests, performance tests.
 Test rules and hints are:
 - As much as possible, tests should be written as small tests.
 - All tests should be written to support parallel execution on the same 
 machine, hence should not use shared resources such as fixed ports or fixed 
 file names.
 - All tests should be written to be as fast as possible.
 - Tests should not overlog. More than 100 lines/second makes the logs hard 
 to read and uses i/o that is then not available to the other tests.
 - Tests can be written with HBaseTestingUtility. This class offers helper 
 functions to create a temp directory and do the cleanup, or to start a cluster.
 - Sleeps:
   - Tests should not do a 'Thread.sleep' without testing an ending 
   condition. This makes clear what the test is waiting for. Moreover, 
   the test will then work whatever the machine's performance.
   - Sleeps should be minimal, to be as fast as possible. Waiting for a 
   variable should be done in a 40 ms sleep loop. Waiting for a socket 
   operation should be done in a 200 ms sleep loop.
 - Tests using a cluster:
   - Tests using an HRegion do not have to start a cluster: a region can use 
   the local file system.
   - Starting/stopping a cluster costs around 10 seconds. Clusters should not 
   be started per test method but per class.
   - A started cluster must be shut down using 
   HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
   - As much as possible, tests should use the default settings for the 
   cluster. When they don't, they should document it. This will make it 
   possible to share the cluster later.
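
 The rule about sleeping only inside a loop that tests an ending condition
 could look like the sketch below. It is an illustration under assumptions:
 WaitForSketch and waitFor are hypothetical names, not an HBaseTestingUtility
 API, and the timeout parameter is an addition that the proposal does not
 mention. The 40 ms interval is the one suggested above for waiting on a
 variable.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating a condition-driven sleep loop.
class WaitForSketch {
  // Polls the condition every intervalMs until it holds or timeoutMs elapses.
  // Returns true if the condition was met, false on timeout or interruption.
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out without the condition becoming true
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    AtomicBoolean ready = new AtomicBoolean(false);
    Thread worker = new Thread(() -> ready.set(true));
    worker.start();
    // Wait on the variable in a 40 ms loop, giving up after 2 seconds.
    boolean met = waitFor(2000, 40, ready::get);
    System.out.println("condition met: " + met);
  }
}
```

 Because the loop exits as soon as the condition holds, the test takes only
 as long as the machine needs, and the condition itself documents what the
 test is waiting for.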





[jira] [Commented] (HBASE-4712) Document rules for writing tests

2011-12-07 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164748#comment-13164748
 ] 

stack commented on HBASE-4712:
--

I committed Jesse's addendum to TRUNK.

 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4712.txt, test-doc-cleanup.txt


 We saw that some tests could be improved. Documenting the general rules could 
 help.
 Proposal:
 HBase tests are divided in three categories: small, medium and large, with 
 corresponding JUnit categories: SmallTest, MediumTest, LargeTest
 Small tests are executed in parallel in a shared JVM. They must last less 
 than 15 seconds. They must NOT use a cluster.
 Medium tests are executed in separate JVM. They must last less than 50 
 seconds. They can use a cluster. They must not fail occasionally.
 Small and medium tests must not need more than 30 minutes to run altogether.
 Small and medium tests should be executed by the developers before submitting 
 a patch.
 Large tests are everything else. They are typically integration tests, 
 non-regression tests for specific bugs, timeout tests, performance tests.
 Tests rules & hints are:
 - As much as possible, tests should be written as small tests.
 - All tests should be written to support parallel execution on the same 
 machine, hence should not use shared resources such as fixed ports or fixed 
 file names.
 - All tests should be written to be as fast as possible.
 - Tests should not overlog. More than 100 lines/second makes the logs complex 
 to read and uses I/O that is hence not available for the other tests.
 - Tests can be written with HBaseTestingUtility. This class offers helper 
 functions to create a temp directory and do the cleanup, or to start a cluster.
 - Sleeps:
   - Tests should not do a 'Thread.sleep' without testing an ending 
   condition. This allows understanding what the test is waiting for. Moreover, 
   the test will work whatever the machine performance.
   - Sleeps should be minimal to be as fast as possible. Waiting for a 
   variable should be done in a 40 ms sleep loop. Waiting for a socket operation 
   should be done in a 200 ms sleep loop.
 - Tests using a cluster:
   - Tests using an HRegion do not have to start a cluster: a region can use 
   the local file system.
   - Starting/stopping a cluster costs around 10 seconds. Clusters should not be 
   started per test method but per class.
   - A started cluster must be shut down using 
   HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
   - As much as possible, tests should use the default settings for the 
   cluster. When they don't, they should document it. This will allow sharing 
   the cluster later.
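The sleep rules above (no bare Thread.sleep, always test an ending condition, poll at a short interval) could be followed with a small polling helper. What follows is only a minimal sketch under stated assumptions: the class WaitFor and its waitFor method are hypothetical names that are not part of HBase or HBaseTestingUtility, and the code assumes Java 8 or later for the lambda syntax.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not part of HBase. */
final class WaitFor {
    private WaitFor() {}

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true if the condition became true, false on timeout or interrupt.
     */
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out: the caller can fail with a clear message
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { return; }
            ready.set(true);
        });
        worker.start();
        // 40 ms loop for a variable, as suggested in the proposal.
        boolean ok = waitFor(2000, 40, ready::get);
        System.out.println("condition reached: " + ok);
    }
}
```

Because the condition is explicit, a reader can see what the test is waiting for, and the loop ends as soon as the condition holds instead of sleeping for a fixed worst-case duration.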

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4976) Add compaction/flush queue size metrics mistakenly removed by HFile v2

2011-12-07 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4976:
-

  Description: Upping priority, and putting it against 0.92 since J-D 
fingered it as blocker.  Which metrics in particular are missing?  Hard to 
patch?
 Priority: Blocker  (was: Major)
Fix Version/s: 0.92.0

 Add compaction/flush queue size metrics mistakenly removed by HFile v2
 --

 Key: HBASE-4976
 URL: https://issues.apache.org/jira/browse/HBASE-4976
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Blocker
 Fix For: 0.92.0


 Upping priority, and putting it against 0.92 since J-D fingered it as 
 blocker.  Which metrics in particular are missing?  Hard to patch?





[jira] [Commented] (HBASE-4605) Constraints

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164765#comment-13164765
 ] 

Hadoop QA commented on HBASE-4605:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505280/java_HBASE-4605_v3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 74 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/465//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/465//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/465//console

This message is automatically generated.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
 java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
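The idea in the quote (a small interface checked before each put, with a rejection propagated back to the caller) might look roughly like the sketch below. Everything in it is an assumption made for illustration: SimplePut, Table, Constraint and ConstraintViolationException are hypothetical stand-ins, not the API proposed in the attached patches and not HBase classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not HBase API. */
final class ConstraintSketch {
    private ConstraintSketch() {}

    /** Stand-in for a put: one row key and one value. */
    static final class SimplePut {
        final String row;
        final String value;
        SimplePut(String row, String value) {
            this.row = row;
            this.value = value;
        }
    }

    /** Thrown when a put is rejected; the message explains why. */
    static final class ConstraintViolationException extends RuntimeException {
        ConstraintViolationException(String message) {
            super(message);
        }
    }

    /** The minimal interface a user would implement. */
    interface Constraint {
        void check(SimplePut put);
    }

    /** Stand-in for a table with per-table constraints. */
    static final class Table {
        private final List<Constraint> constraints = new ArrayList<>();
        private final List<SimplePut> rows = new ArrayList<>();

        void addConstraint(Constraint c) {
            constraints.add(c);
        }

        /** Runs every constraint first; the put is stored only if all pass. */
        void put(SimplePut p) {
            for (Constraint c : constraints) {
                c.check(p);
            }
            rows.add(p);
        }

        int size() {
            return rows.size();
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.addConstraint(p -> {
            if (p.value.isEmpty()) {
                throw new ConstraintViolationException("empty value for row " + p.row);
            }
        });
        table.put(new SimplePut("r1", "v1"));
        try {
            table.put(new SimplePut("r2", ""));
        } catch (ConstraintViolationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch uses an unchecked exception only to keep the example short; whether a real implementation would use a checked exception is not stated in the quote.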





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164772#comment-13164772
 ] 

Hadoop QA commented on HBASE-4610:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506517/4610.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/467//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/467//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/467//console

This message is automatically generated.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164773#comment-13164773
 ] 

Hadoop QA commented on HBASE-4880:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506518/4880.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/466//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/466//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/466//console

This message is automatically generated.

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, 
 hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 We can see that the region is in service before step 4.
 It means a client could put data to this region after step 3.
 What will happen if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen will be executed, which closes the 
 region, and the master assigns this region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. Therefore, the data may not be recovered through 
 replayRecoveredEdit() because the edit's LogSeqId is smaller than the current 
 region SeqId.
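The last point of the description (an edit is not replayed when its LogSeqId is below the region's current SeqId) can be pictured with a toy filter. This is a simplified sketch, not HBase code: ReplaySketch and replayable are hypothetical names, sequence ids are plain longs, and the sketch assumes that only edits with an id strictly greater than the region's current id are applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of sequence-id filtering during replay; hypothetical, not HBase code. */
final class ReplaySketch {
    private ReplaySketch() {}

    /**
     * Returns the ids of the edits a replay would apply, assuming edits whose
     * id is not greater than the region's current id are skipped.
     */
    static List<Long> replayable(List<Long> editSeqIds, long regionSeqId) {
        List<Long> applied = new ArrayList<>();
        for (long id : editSeqIds) {
            if (id > regionSeqId) {
                applied.add(id);
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // Edits 5 and 6 were written on the first regionserver between step 3 and step 4.
        // The region has since been opened elsewhere and advanced to id 10.
        List<Long> edits = Arrays.asList(5L, 6L, 12L);
        System.out.println(replayable(edits, 10L));
    }
}
```

In this toy model, the edits written on the first regionserver carry ids below the id the region reached on the second regionserver, so a later replay would skip them.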





[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164777#comment-13164777
 ] 

Zhihong Yu commented on HBASE-4880:
---

Latest patch passes all tests.

+1.

 Region is on service before openRegionHandler completes, may cause data loss
 

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, 
 hbase-4880v3.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 We can see that the region is in service before step 4.
 It means a client could put data to this region after step 3.
 What will happen if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen will be executed, which closes the 
 region, and the master assigns this region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. Therefore, the data may not be recovered through 
 replayRecoveredEdit() because the edit's LogSeqId is smaller than the current 
 region SeqId.





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164785#comment-13164785
 ] 

Zhihong Yu commented on HBASE-4610:
---

Test suite passes.

Will commit later today if no objections.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164797#comment-13164797
 ] 

Hudson commented on HBASE-4927:
---

Integrated in HBase-TRUNK #2524 (See 
[https://builds.apache.org/job/HBase-TRUNK/2524/])
HBASE-4927 Addendum fixes case where start key is empty and end key is empty

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)
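The wanted ordering can be reproduced with a comparator that treats an empty end key as "greater than every other end key". The sketch below is a simplification for illustration, not the code in the attached patches: Region and PARENT_FIRST are hypothetical names, and keys are Strings rather than byte arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified illustration with String keys; hypothetical, not the HBase comparator. */
final class SplitParentFirstSketch {
    private SplitParentFirstSketch() {}

    /** A region [start, end); an empty end key means "up to the end of the table". */
    static final class Region {
        final String start;
        final String end;
        Region(String start, String end) {
            this.start = start;
            this.end = end;
        }
        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Start key ascending, then the wider region (the split parent) first. */
    static final Comparator<Region> PARENT_FIRST = (a, b) -> {
        int byStart = a.start.compareTo(b.start); // empty start key sorts first naturally
        if (byStart != 0) {
            return byStart;
        }
        boolean aOpen = a.end.isEmpty();
        boolean bOpen = b.end.isEmpty();
        if (aOpen && bOpen) {
            return 0;
        }
        if (aOpen) {
            return -1; // a has no upper bound, so it is the wider region
        }
        if (bOpen) {
            return 1;
        }
        return b.end.compareTo(a.end); // larger end key first
    };

    static List<Region> sorted(Region... regions) {
        List<Region> list = new ArrayList<>(Arrays.asList(regions));
        list.sort(PARENT_FIRST);
        return list;
    }

    public static void main(String[] args) {
        // prints [[A,), [A,B), [B,)]
        System.out.println(sorted(new Region("A", "B"), new Region("B", ""), new Region("A", "")));
    }
}
```

Comparing the end keys as plain strings, without the empty-key special case, would sort "" before "B" and yield the incorrect order described above.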





[jira] [Commented] (HBASE-4956) Control direct memory buffer consumption by HBaseClient

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164798#comment-13164798
 ] 

Zhihong Yu commented on HBASE-4956:
---

Since the proposal involves asynchronous communication, we should devise new 
API which can be used to validate the reduction in use of direct memory buffer.

 Control direct memory buffer consumption by HBaseClient
 ---

 Key: HBASE-4956
 URL: https://issues.apache.org/jira/browse/HBASE-4956
 Project: HBase
  Issue Type: New Feature
Reporter: Ted Yu

 As Jonathan explained here 
 https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357?pli=1
  , standard hbase client inadvertently consumes large amount of direct memory.
 We should consider using netty for NIO-related tasks.





[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4927:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

TRUNK build is back to normal.

Resolving again.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing HBASE-4238 backporting, Jon found this issue.
 What happens if the split points are  (empty end key is the last key, empty 
 start key is the first key)
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 we wanted:
 [A,), [A,B), [B,)





[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-07 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164805#comment-13164805
 ] 

Hadoop QA commented on HBASE-4946:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506521/4946-v4.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 73 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/468//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/468//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/468//console

This message is automatically generated.

 HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
 dynamically loaded coprocessors (from hdfs or local system), because the RPC 
 system tries to deserialize an unknown class. 
 -

 Key: HBASE-4946
 URL: https://issues.apache.org/jira/browse/HBASE-4946
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
 HBASE-4946.patch


 Loading coprocessors jars from hdfs works fine. I load it from the shell, 
 after setting the attribute, and it gets loaded:
 {noformat}
 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
 config now ...
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
 com.MyCoprocessorClass needs to be loaded from a file - 
 hdfs://localhost:9000/coproc/rt-  0.0.1-SNAPSHOT.jar.
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
 com.MyCoprocessorClass
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
 RegionEnvironment createEnvironment
 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
 handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
 protocol=com.MyCoprocessorClassProtocol
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
 coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
 {noformat}
 The problem is that this coprocessors simply extends BaseEndpointCoprocessor, 
 with a dynamic method. When calling this method from the client with 
 HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
 cannot be deserialized from writables. 
 The problem is that Exec tries to do an early resolve of the coprocessor 
 class. The coprocessor class is loaded, but it is in the context of the 
 HRegionServer / HRegion. So, the call fails:
 {noformat}
 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
 Error in readFields
 java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
   at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at 

[jira] [Commented] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164827#comment-13164827
 ] 

Jonathan Hsieh commented on HBASE-4974:
---

The test failures are related to a problem in HBASE-4927.  An addendum was 
added and those 3 tests should pass now.

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch


 Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164830#comment-13164830
 ] 

Jonathan Hsieh commented on HBASE-4972:
---

Ted has filed HBASE-4977 and closed HBASE-3848.   I will be resolving this issue 
as Not a bug.

 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backport this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Resolved] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread Jonathan Hsieh (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh resolved HBASE-4972.
---

Resolution: Not A Problem

 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but were 
 not in trunk/0.92 branch.   These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to backport this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164835#comment-13164835
 ] 

Jonathan Hsieh commented on HBASE-4610:
---

I had started doing this also -- are you sure you want to keep the 'if (count 
== oldcount && count > 0) break' line?  It was removed on the 0.90 version.

{code}
+long slept = 0;
 for (int oldcount = countOfRegionServers(); !this.master.isStopped();) {
   Thread.sleep(interval);
+  slept += interval;
   count = countOfRegionServers();
   if (count == oldcount && count > 0) break;
 
   String msg;
+  if (count == oldcount && count >= minToStart && slept >= timeout) {
+    LOG.info("Finished waiting for regionserver count to settle; " +
+        "count=" + count + ", sleptFor=" + slept);
+    break;
{code}

Before and after test, TestMasterFailover seemed flaky for me on the 0.92 
branch.  

Is the plan for this 0.92.0 or 0.92.1?
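For readers following the loop discussed above, here is a self-contained sketch of the "wait until the regionserver count settles" logic with the first break removed. It is an illustration under assumptions, not the patch itself: SettleWaitSketch and waitForServers are hypothetical names, the server count and the stop flag are passed in as suppliers instead of being read from the master, and the count carried over between iterations is an assumed detail not visible in the snippet.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

/** Standalone sketch of the settle loop; hypothetical, not the HBase ServerManager. */
final class SettleWaitSketch {
    private SettleWaitSketch() {}

    /**
     * Sleeps in intervalMs steps until the count is unchanged since the last
     * check, at least minToStart, and at least timeoutMs has been slept.
     * Returns the last observed count (also when stopped or interrupted).
     */
    static int waitForServers(IntSupplier countOfRegionServers, BooleanSupplier stopped,
                              int minToStart, long timeoutMs, long intervalMs) {
        int count = countOfRegionServers.getAsInt();
        int oldCount = count;
        long slept = 0;
        while (!stopped.getAsBoolean()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return count;
            }
            slept += intervalMs;
            count = countOfRegionServers.getAsInt();
            if (count == oldCount && count >= minToStart && slept >= timeoutMs) {
                break; // count has settled and both thresholds are met
            }
            oldCount = count;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated cluster: servers check in one by one until there are two.
        int settled = waitForServers(() -> Math.min(calls[0]++, 2), () -> false, 2, 30, 10);
        System.out.println("settled at count=" + settled);
    }
}
```

With the early 'count == oldcount && count > 0' break removed, a single server that checks in quickly no longer ends the wait before minToStart and the timeout are satisfied.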

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164839#comment-13164839
 ] 

Zhihong Yu commented on HBASE-4610:
---

Thanks for the review Jonathan.
The first break statement should be removed.

I ran TestMasterFailover on MacBook and didn't see failure.

I think this should go to 0.92.0

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.1

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression, and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4974:
---

Status: Open  (was: Patch Available)

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch, 4974_all.v2.patch


 Cf. title and HBASE-4965





[jira] [Updated] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4610:
--

Fix Version/s: 0.94.0, 0.92.0  (was: 0.92.1)

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0, 0.94.0

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression, and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.





[jira] [Commented] (HBASE-4972) Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.

2011-12-07 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164841#comment-13164841
 ] 

stack commented on HBASE-4972:
--

Nice work Jon 'Auditor' Hsieh.  Thanks. 

 Investigate and port patches on 0.90 branch that are not on 0.92/trunk branch.
 --

 Key: HBASE-4972
 URL: https://issues.apache.org/jira/browse/HBASE-4972
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.92.0


 There are several issues that have been committed in the 0.90 branch but are 
 not in the trunk/0.92 branches.  These regressions should be forward ported.
 HBASE-3320  ! 
 HBASE-3380  ! - HBASE-4610 is a jira to port this, but it is not done.
 HBASE-3410  ! 
 HBASE-3501  !
 HBASE-3714  ! 
 HBASE-3729  !! Marked in 0.92 but not committed there, committed in 0.90 
 branch.
 HBASE-3848  !
 HBASE-3892  ! * Comments say trunk does not need.
 HBASE-3906  !
 HBASE-3989  !
 HBASE-4109  !
 HBASE-4160  !! Marked resolved 0.90.5, but no corresponding commit in either 
 0.90 or 0.92
 HBASE-4423  ! 





[jira] [Updated] (HBASE-4974) Remove some resources leaks on the tests

2011-12-07 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4974:
---

Status: Patch Available  (was: Open)

Thanks for the info, Jon. Let's retry then.

 Remove some resources leaks on the tests
 

 Key: HBASE-4974
 URL: https://issues.apache.org/jira/browse/HBASE-4974
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4974_all.patch, 4974_all.v2.patch


 Cf. title and HBASE-4965





[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)

2011-12-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164844#comment-13164844
 ] 

Jonathan Hsieh commented on HBASE-4610:
---

I think if the tests are no worse than before, 0.92.0 sounds reasonable to me.

 Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk 
 (definitely bring in config params, decide if we need to do more to fix the 
 bug)
 -

 Key: HBASE-4610
 URL: https://issues.apache.org/jira/browse/HBASE-4610
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0, 0.94.0

 Attachments: 4610.txt


 Over in HBASE-3380 we were having some TestMasterFailover flakiness.  We 
 added some more config parameters to better control the master startup loop 
 where it waits for RS to heartbeat in.  We had thought at the time that 92 
 would have a different solution but it is still relying on heartbeats to 
 learn about RSs.
 For now, we should definitely bring these config params into 92/trunk.  
 Otherwise this is an incompatible regression and adding these will also make 
 things like what was just reported over in HBASE-4603 trivial to fix in an 
 optimal way.




