[jira] [Commented] (HBASE-4964) Add builddate, make less sections in toc, and add header and footer customizations

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163463#comment-13163463
 ] 

Hudson commented on HBASE-4964:
---

Integrated in HBase-TRUNK #2521 (See 
[https://builds.apache.org/job/HBase-TRUNK/2521/])
HBASE-4964 Add builddate, make less sections in toc, and add header and 
footer customizations

stack : 
Files : 
* /hbase/trunk/pom.xml
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/customization.xsl


 Add builddate, make less sections in toc, and add header and footer 
 customizations
 --

 Key: HBASE-4964
 URL: https://issues.apache.org/jira/browse/HBASE-4964
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.94.0

 Attachments: 4964.txt


 The customizations are for adding Facebook comments.  I tried it, but it is 
 not working for me yet; it needs some XSL jujitsu so I can get the name of 
 the current page into the footer.
 Added a buildDate definition in ISO 8601 format to the pom, used in the 
 'reference guide'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4908:
--

Status: Patch Available  (was: Open)

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression GZIP
 I will be using this code to load-test the delta encoding patch and will be 
 making fixes, but I am submitting the patch for early feedback. I will 
 probably try out its other functionality and comment on how it works.





[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4908:
--

Status: Open  (was: Patch Available)

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch







[jira] [Created] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread nkeywal (Created) (JIRA)
Monitor the open file descriptors and the threads counters during the unit tests


 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


We're seeing a lot of issues with hadoop-qa related to threads or file 
descriptors.
Monitoring these counters would ease the analysis.

Note as well that
 - if we want to execute the tests in the same jvm (because the test is small 
or because we want to share the cluster) we can't afford to leak too many 
resources
 - if the tests leak, it's more difficult to detect a leak in the software 
itself.


I attach the piece of code that I used. It requires two lines in a unit test 
class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous values.
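
The before/after comparison described above can be sketched roughly as follows. This is not the code attached to the issue: the ResourceSnapshot class and its methods are hypothetical names, and the file descriptor counter is assumed to be available only where the JVM exposes com.sun.management.UnixOperatingSystemMXBean (otherwise the sketch reports -1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper for illustration; not the attachment on HBASE-4965. */
class ResourceSnapshot {
    final int threads;
    final long openFds; // -1 when the platform does not expose the counter

    ResourceSnapshot(int threads, long openFds) {
        this.threads = threads;
        this.openFds = openFds;
    }

    /** Captures the current live thread count and open file descriptor count. */
    static ResourceSnapshot take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Formats this snapshot against an earlier one, in the style used in this issue. */
    String diff(ResourceSnapshot before) {
        return threads + " threads (was " + before.threads + "), "
                + openFds + " file descriptors (was " + before.openFds + ")";
    }
}
```

In a JUnit test class, take() could be called from a setup method and diff() from a teardown method, which would account for the "two lines" mentioned above.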

I ran it on some tests; for example, we have:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads 
(was 231), 390 file descriptors (was 390). => TestMultiParallel uses 232 
threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 
283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 
294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 
310 file descriptors (was 307).

These are not always leaks; we can expect some pooling effects. But still...






[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4927:
---

Status: Open  (was: Patch Available)

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch


 When reviewing the HBASE-4238 backport, Jon found this issue.
 What happens if the split points are as follows (an empty end key is the 
 last key, an empty start key is the first key)?
 Parent [A,)
 L daughter [A,B),
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this 
 incorrect order:
 [A,B), [A,), [B,)
 We wanted:
 [A,), [A,B), [B,)
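
A minimal sketch of a comparator that yields the wanted order is shown below. It is an illustration under stated assumptions, not the attached patch: KeyRange, ParentFirstComparator and ParentFirstDemo are hypothetical names, keys are modelled as Strings rather than byte arrays, and the empty end key is treated explicitly as greater than every other end key (a plain lexicographic comparison would rank it below B, which matches the incorrect order reported above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a region's key range; not HBase's HRegionInfo. */
class KeyRange {
    final String start; // "" means the first key in the table
    final String end;   // "" means the last key in the table

    KeyRange(String start, String end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public String toString() {
        return "[" + start + "," + end + ")";
    }
}

/** Sorts by start key, then puts the wider range (the split parent) first. */
class ParentFirstComparator implements Comparator<KeyRange> {
    @Override
    public int compare(KeyRange left, KeyRange right) {
        int byStart = left.start.compareTo(right.start);
        if (byStart != 0) {
            return byStart;
        }
        // Same start key: an empty end key reaches the end of the table,
        // so it is the widest range and must sort first.
        boolean leftOpen = left.end.isEmpty();
        boolean rightOpen = right.end.isEmpty();
        if (leftOpen && rightOpen) {
            return 0;
        }
        if (leftOpen) {
            return -1;
        }
        if (rightOpen) {
            return 1;
        }
        // Both end keys are set: the larger end key is the wider range.
        return right.end.compareTo(left.end);
    }
}

class ParentFirstDemo {
    /** Returns the given ranges sorted parent-first. */
    static List<KeyRange> sorted(KeyRange... ranges) {
        List<KeyRange> list = new ArrayList<>(Arrays.asList(ranges));
        list.sort(new ParentFirstComparator());
        return list;
    }
}
```

Sorting the three ranges from the description with this sketch places the parent [A,) ahead of both daughters.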





[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4927:
---

Status: Patch Available  (was: Open)

0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch is the 
latest patch.  It replaces tabs with spaces.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch







[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4927:
---

Attachment: 
0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch







[jira] [Created] (HBASE-4966) Put/Delete values cannot be tested with MRUnit

2011-12-06 Thread Nicholas Telford (Created) (JIRA)
Put/Delete values cannot be tested with MRUnit
--

 Key: HBASE-4966
 URL: https://issues.apache.org/jira/browse/HBASE-4966
 Project: HBase
  Issue Type: Bug
  Components: client, mapreduce
Affects Versions: 0.90.4
Reporter: Nicholas Telford
Priority: Minor


When using the IdentityTableReducer, which expects input values of either a Put 
or Delete object, testing the Mapper with MRUnit is not possible because 
neither Put nor Delete implements equals().

We should implement equals() on both such that equality means:
* Both objects are of the same class (in this case, Put or Delete)
* Both objects are for the same key.
* Both objects contain an equal set of KeyValues (applicable only to Put)

KeyValue.equals() appears to already be implemented, but only checks for 
equality of row key, column family and column qualifier - two KeyValues can be 
considered equal even if they contain different values. This won't work for 
testing.

Instead, the Put.equals() and Delete.equals() implementations should do a 
deep equality check on their KeyValues, like this:

myKv.equals(theirKv) && Bytes.equals(myKv.getValue(), theirKv.getValue());

NOTE: This would impact any code that relies on the existing identity 
implementation of Put.equals() and Delete.equals(), and therefore cannot be 
guaranteed to be backwards-compatible.
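
To make the difference between the two checks concrete, here is a small self-contained sketch. The Cell class is a hypothetical stand-in for KeyValue, written so that its equals() ignores the value as described above; java.util.Arrays.equals is used in place of HBase's Bytes.equals so that the example has no HBase dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for KeyValue; equals() deliberately ignores the value. */
class Cell {
    final byte[] row;
    final byte[] family;
    final byte[] qualifier;
    final byte[] value;

    Cell(String row, String family, String qualifier, String value) {
        this.row = row.getBytes(StandardCharsets.UTF_8);
        this.family = family.getBytes(StandardCharsets.UTF_8);
        this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
        this.value = value.getBytes(StandardCharsets.UTF_8);
    }

    /** Shallow check: compares the coordinates only, as the issue describes. */
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Cell)) {
            return false;
        }
        Cell that = (Cell) other;
        return Arrays.equals(row, that.row)
                && Arrays.equals(family, that.family)
                && Arrays.equals(qualifier, that.qualifier);
    }

    @Override
    public int hashCode() {
        int result = Arrays.hashCode(row);
        result = 31 * result + Arrays.hashCode(family);
        result = 31 * result + Arrays.hashCode(qualifier);
        return result;
    }

    /** Deep check in the spirit of the proposal: coordinates and value must match. */
    static boolean deepEquals(Cell mine, Cell theirs) {
        return mine.equals(theirs) && Arrays.equals(mine.value, theirs.value);
    }
}
```

Two cells with the same row, family and qualifier but different values compare equal under equals() and unequal under deepEquals(), which is the distinction a test assertion needs.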





[jira] [Updated] (HBASE-4966) Put/Delete values cannot be tested with MRUnit

2011-12-06 Thread Nicholas Telford (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Telford updated HBASE-4966:


Description: 
When using the IdentityTableReducer, which expects input values of either a Put 
or Delete object, testing the Mapper with MRUnit is not possible because 
neither Put nor Delete implements equals().

We should implement equals() on both such that equality means:
* Both objects are of the same class (in this case, Put or Delete)
* Both objects are for the same key.
* Both objects contain an equal set of KeyValues (applicable only to Put)

KeyValue.equals() appears to already be implemented, but only checks for 
equality of row key, column family and column qualifier - two KeyValues can be 
considered equal even if they contain different values. This won't work for 
testing.

Instead, the Put.equals() and Delete.equals() implementations should do a 
deep equality check on their KeyValues, like this:

{code:java}
myKv.equals(theirKv) && Bytes.equals(myKv.getValue(), theirKv.getValue());
{code}

NOTE: This would impact any code that relies on the existing identity 
implementation of Put.equals() and Delete.equals(), and therefore cannot be 
guaranteed to be backwards-compatible.

  was:
When using the IdentityTableReducer, which expects input values of either a Put 
or Delete object, testing the Mapper with MRUnit is not possible because 
neither Put nor Delete implements equals().

We should implement equals() on both such that equality means:
* Both objects are of the same class (in this case, Put or Delete)
* Both objects are for the same key.
* Both objects contain an equal set of KeyValues (applicable only to Put)

KeyValue.equals() appears to already be implemented, but only checks for 
equality of row key, column family and column qualifier - two KeyValues can be 
considered equal even if they contain different values. This won't work for 
testing.

Instead, the Put.equals() and Delete.equals() implementations should do a 
deep equality check on their KeyValues, like this:

myKv.equals(theirKv) && Bytes.equals(myKv.getValue(), theirKv.getValue());

NOTE: This would impact any code that relies on the existing identity 
implementation of Put.equals() and Delete.equals(), and therefore cannot be 
guaranteed to be backwards-compatible.


 Put/Delete values cannot be tested with MRUnit
 --

 Key: HBASE-4966
 URL: https://issues.apache.org/jira/browse/HBASE-4966
 Project: HBase
  Issue Type: Bug
  Components: client, mapreduce
Affects Versions: 0.90.4
Reporter: Nicholas Telford
Priority: Minor






[jira] [Created] (HBASE-4967) connected client thrift sockets should have a server side read timeout

2011-12-06 Thread Prakash Khemani (Created) (JIRA)
connected client thrift sockets should have a server side read timeout
--

 Key: HBASE-4967
 URL: https://issues.apache.org/jira/browse/HBASE-4967
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani


If there is no socket read timeout and the Thrift server is a 
ThreadPoolServer, then server-side threads will be used up waiting on dead or 
unresponsive clients.
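
The effect of a server-side read timeout can be illustrated with plain java.net sockets. This sketch does not use the Thrift API and is not the proposed fix; ReadTimeoutDemo and readTimesOut are hypothetical names. It only shows that Socket.setSoTimeout turns an otherwise indefinite blocking read on a silent client into a SocketTimeoutException, so the handling thread can be released.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {
    /**
     * Accepts one loopback connection from a client that never sends data
     * and tries to read a byte from it. Returns true if the read timed out,
     * false if the read returned (data or end of stream).
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call, read() below would block until the client acts.
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read();
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }
}
```

In a thread-per-connection server, catching the timeout and closing the socket would return the worker thread to the pool instead of leaving it parked on a dead client.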





[jira] [Assigned] (HBASE-4967) connected client thrift sockets should have a server side read timeout

2011-12-06 Thread Prakash Khemani (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Khemani reassigned HBASE-4967:
--

Assignee: Prakash Khemani

 connected client thrift sockets should have a server side read timeout
 --

 Key: HBASE-4967
 URL: https://issues.apache.org/jira/browse/HBASE-4967
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani






[jira] [Updated] (HBASE-4966) Put/Delete values cannot be tested with MRUnit

2011-12-06 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4966:
---

Hadoop Flags: Incompatible change

 Put/Delete values cannot be tested with MRUnit
 --

 Key: HBASE-4966
 URL: https://issues.apache.org/jira/browse/HBASE-4966
 Project: HBase
  Issue Type: Bug
  Components: client, mapreduce
Affects Versions: 0.90.4
Reporter: Nicholas Telford
Priority: Minor






[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4927:
---

Attachment: hbase-4927-fix-ws.txt

There were a few more hard tabs in the tests, and a couple of places with 
4-space indentation instead of 2. Here's a fixed-up version.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt







[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4927:
---

Status: Open  (was: Patch Available)

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt







[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4927:
---

Status: Patch Available  (was: Open)

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt







[jira] [Assigned] (HBASE-4109) Hostname returned via reverse dns lookup contains trailing period if configured interface is not default

2011-12-06 Thread Jean-Daniel Cryans (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans reassigned HBASE-4109:
-

Assignee: Shrijeet Paliwal

 Hostname returned via reverse dns lookup contains trailing period if 
 configured interface is not default
 --

 Key: HBASE-4109
 URL: https://issues.apache.org/jira/browse/HBASE-4109
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.90.3
Reporter: Shrijeet Paliwal
Assignee: Shrijeet Paliwal
 Fix For: 0.90.4

 Attachments: 
 0001-HBASE-4109-Sanitize-hostname-returned-from-DNS-class.patch


 If you are using any interface other than 'default' (literally that 
 keyword), getDefaultHost in DNS.java will return a string with a trailing 
 period. It seems the javadoc of reverseDns in DNS.java (see below) conflicts 
 with what that function actually does: it returns a PTR record while 
 claiming to return a hostname. The PTR record always has a period at the 
 end; RFC: 
 http://irbs.net/bog-4.9.5/bog47.html 
 We call DNS.getDefaultHost in more than one place and treat the result as 
 the actual hostname.
 Quoting HRegionServer, for example:
 {code}
 String machineName = DNS.getDefaultHost(conf.get(
 "hbase.regionserver.dns.interface", "default"), conf.get(
 "hbase.regionserver.dns.nameserver", "default"));
 {code}
 This causes inconsistencies. One such inconsistency was observed while 
 debugging the issue "Regions not getting reassigned if RS is brought down". 
 More here: 
 http://search-hadoop.com/m/CANUA1qRCkQ1 
 We may want to sanitize the string returned from the DNS class. Or, better, 
 we could overhaul the way we do DNS name matching all over.
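
The first option, sanitizing the returned string, could look roughly like the sketch below. It is an assumption about the shape of such a fix, not the attached patch; HostnameSanitizer and stripTrailingDot are hypothetical names.

```java
/** Hypothetical helper for illustration; not the patch attached to HBASE-4109. */
class HostnameSanitizer {
    /** Removes a single trailing period, as found on names taken from PTR records. */
    static String stripTrailingDot(String name) {
        if (name != null && name.endsWith(".")) {
            return name.substring(0, name.length() - 1);
        }
        return name;
    }
}
```

Applying such a helper to the value returned by DNS.getDefaultHost before it is stored or compared would make names with and without the trailing period match.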





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163765#comment-13163765
 ] 

Hadoop QA commented on HBASE-4927:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12506284/0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 71 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.TestFSUtils
  org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/453//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/453//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/453//console

This message is automatically generated.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt







[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163785#comment-13163785
 ] 

Hadoop QA commented on HBASE-4927:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506287/hbase-4927-fix-ws.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 71 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/454//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/454//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/454//console

This message is automatically generated.





[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163819#comment-13163819
 ] 

Ted Yu commented on HBASE-4946:
---

The failed tests were due to 'unable to create new native thread'

 HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
 dynamically loaded coprocessors (from hdfs or local system), because the RPC 
 system tries to deserialize an unknown class. 
 -

 Key: HBASE-4946
 URL: https://issues.apache.org/jira/browse/HBASE-4946
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
 HBASE-4946.patch


 Loading coprocessor jars from hdfs works fine. I load one from the shell, 
 after setting the attribute, and it gets loaded:
 {noformat}
 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
 config now ...
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
 com.MyCoprocessorClass needs to be loaded from a file - 
 hdfs://localhost:9000/coproc/rt-  0.0.1-SNAPSHOT.jar.
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
 com.MyCoprocessorClass
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
 RegionEnvironment createEnvironment
 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
 handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
 protocol=com.MyCoprocessorClassProtocol
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
 coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
 {noformat}
 The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
 with a dynamic method. When calling this method from the client with 
 HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
 cannot be deserialized from writables. 
 The problem is that Exec tries to do an early resolve of the coprocessor 
 class. The coprocessor class is loaded, but it is in the context of the 
 HRegionServer / HRegion. So, the call fails:
 {noformat}
 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Error in readFields
 java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
   at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:247)
   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
   ... 10 more
 {noformat}
 Probably the correct way to fix this is to make Exec really smart, so that it 
 knows all the class definitions loaded in CoprocessorHost(s).
 I created a small patch that simply doesn't resolve the class definition in 
 the Exec, instead passing it as a string down to the HRegion layer. This layer 
 knows all the definitions, and simply loads it by name. 
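
 The difference between resolving the protocol class eagerly and looking it up later by name can be sketched as follows. This is a hypothetical, self-contained illustration, not HBase code and not the attached patch: DeferredProtocolLookup, Call, and RegionLayer are made-up names, and a plain map stands in for whatever the HRegion layer actually keeps about its registered protocol handlers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of resolving a protocol late, by name. Not HBase code. */
class DeferredProtocolLookup {

  /** Stand-in for the wire-level call: it carries only the protocol's class name. */
  static final class Call {
    final String protocolName;

    Call(String protocolName) {
      this.protocolName = protocolName;
    }
  }

  /** Stand-in for the region layer, which knows every protocol registered with it. */
  static final class RegionLayer {
    private final Map<String, Object> handlers = new ConcurrentHashMap<String, Object>();

    void register(String protocolName, Object handler) {
      handlers.put(protocolName, handler);
    }

    /** Resolution happens here, by name, instead of during deserialization. */
    Object resolve(Call call) {
      Object handler = handlers.get(call.protocolName);
      if (handler == null) {
        throw new IllegalArgumentException(
            "No handler registered for protocol " + call.protocolName);
      }
      return handler;
    }
  }

  /** Eager resolution: fails when the calling class loader cannot see the class. */
  static boolean eagerlyResolvable(String protocolName) {
    try {
      Class.forName(protocolName);
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String name = "com.MyCoprocessorClassProtocol"; // class name taken from the log above
    RegionLayer region = new RegionLayer();
    region.register(name, new Object());
    System.out.println("eager: " + eagerlyResolvable(name));
    System.out.println("deferred: " + (region.resolve(new Call(name)) != null));
  }
}
```

 The eager path depends on which class loader performs the lookup, while the deferred path only needs the name to match something the region layer already holds.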


[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163840#comment-13163840
 ] 

Ted Yu commented on HBASE-4946:
---

For coprocessor/Exec.java, javadoc doesn't match code:
{code}
 try {
   protocol = (Class<CoprocessorProtocol>)conf.getClassByName(protocolName);
 }
 catch (ClassNotFoundException cnfe) {
-  throw new IOException("Protocol class "+protocolName+" not found", cnfe);
+  // can't do eager instantiation. pass it as a string and try to deserialize later.
+  //throw new IOException("Protocol class "+protocolName+" not found", cnfe);
{code}
I think the above try block should be commented out.
TestCoprocessorEndpoint passes without the above assignment to protocol.


[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163853#comment-13163853
 ] 

Ted Yu commented on HBASE-4120:
---

From Andrew Purtell (see 'trip report for Hadoop In China' up on dev@):

After 0.92 is out I intend to champion / mentor / co-develop 4120 and the 
follow on table allocation work and target 0.94 for it. I think the RPC QoS 
aspect is not too controversial. The allocation/reservation aspects I'd like to 
aim for a coprocessor or at least master plugin based integration so they won't 
impact stability for users who don't enable it. Unlike RPC QoS I suspect the 
changes needed to core can be minimized to coprocessor framework additions. 
Follow up in new JIRAs soon.

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
 TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When a large HBase cluster has many applications running on it, many 
 problems arise. At Taobao there is a cluster that many departments use to 
 test the performance of their HBase-based applications. With one cluster 
 of 12 servers, only one application can run exclusively on it at a time, 
 and many other applications must wait until the previous test has finished.
 After we add the allocation management function to the cluster, applications can 
 share the cluster and run concurrently. Also, if the test engineer wants to 
 make sure there is no interference, he/she can move other tables out of 
 this group.
 Within a group we use table priority to allocate resources; when the system is busy, we 
 can make sure high-priority tables are not affected by lower-priority tables.
 Different groups can have different region server configurations: some groups 
 optimized for reading can have a large block cache, and others optimized 
 for writing can have a large memstore. 
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry : https://github.com/ICT-Ope/HBase_allocation .
 We hope our work is helpful.





[jira] [Updated] (HBASE-4847) Activate single jvm for small tests on jenkins

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4847:
-

   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to TRUNK yesterday.

 Activate single jvm for small tests on jenkins
 --

 Key: HBASE-4847
 URL: https://issues.apache.org/jira/browse/HBASE-4847
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.94.0
 Environment: build
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4847_all.v10.patch, 4847_all.v10.patch, 
 4847_all.v10.patch, 4847_all.v11.patch, 4847_all.v11.patch, 
 4847_all.v12.patch, 4847_all.v4.patch, 4847_all.v5.patch, 4847_all.v6.patch, 
 4847_all.v6.patch, 4847_all.v7.patch, 4847_all.v7.patch, 4847_all.v7.patch, 
 4847_all.v7.patch, 4847_all.v8.patch, 4847_all.v8.patch, 4847_all.v9.patch, 
 4847_pom.patch, 4847_pom.v2.patch, 4847_pom.v2.patch, 4847_pom.v2.patch, 
 4847_pom.v3.patch


 This alone will not revolutionize performance. We will gain between 1 and 4 
 minutes.
 But we gain in other ways as well:
  - it's a step toward parallelizing the tests
  - new tests are less expensive as they do not create a new jvm: it's a 
 continuous win
  - it will allow us to push it to the dev env while having the same env on dev and on 
 build, and 3 minutes are 10% of small + medium tests execution time.
 I will submit a few patches to see if it works well before asking for the 
 real commit.





[jira] [Updated] (HBASE-4712) Document rules for writing tests

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4712:
-

Attachment: 4712.txt

Here is the nkeywal doc hacked into docbook.  I also moved stuff around, 
breaking out a 'Tests' subsection under the developer chapter, and stuck Jesse's 
integration stuff in here too (with some attempt at lead-in text distinguishing 
the two).

 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 4712.txt


 We saw that some tests could be improved. Documenting the general rules could 
 help.
 Proposal:
 HBase tests are divided into three categories: small, medium and large, with 
 corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
 Small tests are executed in parallel in a shared JVM. They must last less 
 than 15 seconds. They must NOT use a cluster.
 Medium tests are executed in separate JVMs. They must last less than 50 
 seconds. They can use a cluster. They must not fail occasionally.
 Small and medium tests must not need more than 30 minutes to run altogether.
 Small and medium tests should be executed by the developers before submitting 
 a patch.
 Large tests are everything else. They are typically integration tests, 
 non-regression tests for specific bugs, timeout tests, performance tests.
 Test rules and hints are:
 - As much as possible, tests should be written as small tests.
 - All tests should be written to support parallel execution on the same 
 machine, hence should not use shared resources such as fixed ports or fixed file 
 names.
 - All tests should be written to be as fast as possible.
 - Tests should not overlog. More than 100 lines/second makes the logs complex 
 to read and uses i/o that is hence not available for the other tests.
 - Tests can be written with HBaseTestingUtility. This class offers helper 
 functions to create a temp directory and do the cleanup, or to start a cluster.
 - Sleeps:
   - Tests should not do a 'Thread.sleep' without testing an ending 
 condition. This allows understanding what the test is waiting for. Moreover, 
 the test will work whatever the machine performance.
   - Sleeps should be minimal to be as fast as possible. Waiting for a 
 variable should be done in a 40 ms sleep loop. Waiting for a socket operation 
 should be done in a 200 ms sleep loop.
 - Tests using a cluster:
   - Tests using an HRegion do not have to start a cluster: a region can use 
 the local file system.
   - Starting/stopping a cluster costs around 10 seconds. Clusters should not be 
 started per test method but per class.
   - A started cluster must be shut down using 
 HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
   - As much as possible, tests should use the default settings for the 
 cluster. When they don't, they should document it. This will allow sharing 
 the cluster later.
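
 The "Sleeps" rules above can be sketched as a small polling helper. This is a minimal sketch under assumptions: WaitForSketch and waitFor are hypothetical names and are not part of HBaseTestingUtility. It shows a sleep loop that always tests an ending condition, with the 40 ms interval the rules suggest for waiting on a variable and a timeout so a broken test fails instead of hanging.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of HBaseTestingUtility. */
class WaitForSketch {

  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition became true in time, false on timeout or interrupt.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out: the caller can report what it was waiting for
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    final AtomicBoolean ready = new AtomicBoolean(false);
    Thread worker = new Thread(new Runnable() {
      @Override
      public void run() {
        ready.set(true); // the "variable" the test is waiting on
      }
    });
    worker.start();
    // 40 ms loop for a variable; a socket operation would use 200 ms instead.
    System.out.println("ready in time: " + waitFor(5000, 40, ready::get));
  }
}
```

 Because the loop exits as soon as the condition holds, the test takes only as long as the machine needs, whatever its performance.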





[jira] [Resolved] (HBASE-4712) Document rules for writing tests

2011-12-06 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4712.
--

   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed

Committed to trunk.  Thanks for the doc. N.

 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4712.txt






[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163896#comment-13163896
 ] 

stack commented on HBASE-4729:
--

Actually committed the patches (above I said I had, but I just noticed that I had not 
-- my schizophrenia is showing through again... pardon me).

 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
   at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
   at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
   at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
   at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
 region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: 
 regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
 f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Attempting to transition node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
 master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.
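
 The race in the two logs above can be sketched with an in-memory stand-in for the unassigned znodes. This is a hypothetical illustration only: it is not ZooKeeper or HBase code, it is not the fix in the attached patches, and UnassignRaceSketch, UnassignedNodes, and tryUnassign are made-up names. It shows the master's create of a CLOSING node colliding with the SPLITTING node the region server created first, and one possible reaction that treats "node exists" as "region already in transition" rather than as a fatal error.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical, in-memory illustration of the race; not ZooKeeper or HBase code. */
class UnassignRaceSketch {

  enum State { SPLITTING, CLOSING }

  /** Stand-in for the znodes under /hbase/unassigned, keyed by encoded region name. */
  static final class UnassignedNodes {
    private final ConcurrentMap<String, State> nodes =
        new ConcurrentHashMap<String, State>();

    /** Atomic create: returns null on success, or the state already stored. */
    State createIfAbsent(String region, State state) {
      return nodes.putIfAbsent(region, state);
    }
  }

  /**
   * Tries to mark a region CLOSING. Returns false, instead of aborting,
   * when another actor already holds a node for that region.
   */
  static boolean tryUnassign(UnassignedNodes nodes, String region) {
    State existing = nodes.createIfAbsent(region, State.CLOSING);
    if (existing != null) {
      System.out.println("Skipping unassign of " + region + ", already " + existing);
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    UnassignedNodes nodes = new UnassignedNodes();
    String region = "f7e1783e65ea8d621a4bc96ad310f101";
    nodes.createIfAbsent(region, State.SPLITTING);  // region server starts the split
    System.out.println(tryUnassign(nodes, region)); // master arrives a few seconds later
  }
}
```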





[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163902#comment-13163902
 ] 

Ted Yu commented on HBASE-4120:
---

Andy suggested placing the PriorityFunction.initRegionPriority(region) call in 
RegionObserver.postOpen()





[jira] [Updated] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4927:
-

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   0.92.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk and 0.92 branch.  Thanks for the patch, Jimmy.

 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 While reviewing the HBASE-4238 backport, Jon found this issue.
 What happens if the split points are as follows (an empty end key is the last 
 key, an empty start key is the first key):
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 We wanted:
 [A,), [A,B), [B,)
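
The intended ordering can be sketched in a few lines of Java. This is only an illustration of the rule described above, not the code from the attached patch: KeyRange, ParentFirstComparator and ParentFirstDemo are hypothetical names, and keys are modelled as strings instead of byte arrays. The point is that an empty end key must be treated as larger than any real end key, so the parent sorts before its daughters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a region's key range; not the HBase HRegionInfo class. */
final class KeyRange {
  final String start; // "" means "before the first key"
  final String end;   // "" means "after the last key"

  KeyRange(String start, String end) {
    this.start = start;
    this.end = end;
  }

  @Override
  public String toString() {
    return "[" + start + "," + end + ")";
  }
}

/** Sorts by start key; for equal start keys the widest range (the split parent) comes first. */
final class ParentFirstComparator implements Comparator<KeyRange> {
  @Override
  public int compare(KeyRange left, KeyRange right) {
    int byStart = left.start.compareTo(right.start);
    if (byStart != 0) {
      return byStart;
    }
    boolean leftOpen = left.end.isEmpty();
    boolean rightOpen = right.end.isEmpty();
    if (leftOpen && rightOpen) {
      return 0;
    }
    if (leftOpen) {
      return -1; // left reaches the end of the table, so it is the wider range
    }
    if (rightOpen) {
      return 1;
    }
    // Both end keys are real keys: the larger end key is the wider range and goes first.
    return right.end.compareTo(left.end);
  }
}

final class ParentFirstDemo {
  /** Sorts the three ranges from the example above. */
  static List<KeyRange> sorted() {
    List<KeyRange> ranges = new ArrayList<>(Arrays.asList(
        new KeyRange("A", "B"), new KeyRange("B", ""), new KeyRange("A", "")));
    ranges.sort(new ParentFirstComparator());
    return ranges;
  }
}
```

Calling ParentFirstDemo.sorted() on the three ranges from the example yields the parent first, then the left daughter, then the right daughter.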





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163916#comment-13163916
 ] 

stack commented on HBASE-4965:
--

Where is the code N?  Should we run this in all tests for a while till we nail 
some of the file descriptor issues up in hadoopqa patch build?

 Monitor the open file descriptors and the threads counters during the unit 
 tests
 

 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 We're seeing a lot of issues with hadoop-qa related to threads or file 
 descriptors.
 Monitoring these counters would ease the analysis.
 Note as well that
  - if we want to execute the tests in the same jvm (because the test is small 
 or because we want to share the cluster) we can't afford to leak too many 
 resources
  - if the tests leak, it's more difficult to detect a leak in the software 
 itself.
 I attach the piece of code that I used. It requires two lines in a unit test 
 class to:
 - before every test, count the threads and the open file descriptors
 - after every test, compare with the previous values.
 I ran it on some tests; we have for example:
 - client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 
 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 
 232 threads!
 - client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 
 283 file descriptors (was 282).
 - client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 
 294), 815 file descriptors (was 461)
 - client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 
 148), 310 file descriptors (was 307).
 It's not always leaks, we can expect some pooling effects. But still...
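
As a rough illustration of the before/after counting described above, here is a minimal sketch. It is not the attached ResourceChecker.java: ResourceSnapshot and its methods are hypothetical names, and the file descriptor count relies on com.sun.management.UnixOperatingSystemMXBean, which is assumed to be available only on Unix-like HotSpot/OpenJDK JVMs (the sketch reports -1 elsewhere).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical snapshot of JVM resource counters, taken before and after a test. */
final class ResourceSnapshot {
  final int threads;
  final long openFileDescriptors; // -1 when the platform does not expose the counter

  private ResourceSnapshot(int threads, long openFileDescriptors) {
    this.threads = threads;
    this.openFileDescriptors = openFileDescriptors;
  }

  /** Reads the current live thread count and, where supported, the open file descriptor count. */
  static ResourceSnapshot take() {
    int threads = ManagementFactory.getThreadMXBean().getThreadCount();
    long fds = -1;
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      fds = ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
    }
    return new ResourceSnapshot(threads, fds);
  }

  /** Formats the counters in the same style as the figures quoted above. */
  String describeChangeSince(ResourceSnapshot before, String testName) {
    return testName + ": " + threads + " threads (was " + before.threads + "), "
        + openFileDescriptors + " file descriptors (was " + before.openFileDescriptors + ")";
  }
}
```

A test harness would call ResourceSnapshot.take() before a test method, call it again afterwards, and log describeChangeSince(...) so that growth in either counter shows up next to the test name.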





[jira] [Commented] (HBASE-4605) Constraints

2011-12-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163923#comment-13163923
 ] 

jirapos...@reviews.apache.org commented on HBASE-4605:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2579/#review3653
---


Answering most of the big issues on HBASE-4605 since there is a clean summary 
there by Ted. 


src/main/java/org/apache/hadoop/hbase/constraint/BaseConstraint.java
https://reviews.apache.org/r/2579/#comment8150

Done



src/main/java/org/apache/hadoop/hbase/constraint/Constraint.java
https://reviews.apache.org/r/2579/#comment8151

Done



src/main/java/org/apache/hadoop/hbase/constraint/Constraint.java
https://reviews.apache.org/r/2579/#comment8153

this could definitely be an extension. +1 on future considerations.



src/main/java/org/apache/hadoop/hbase/constraint/ConstraintException.java
https://reviews.apache.org/r/2579/#comment8152

Done



src/main/java/org/apache/hadoop/hbase/constraint/ConstraintProcessor.java
https://reviews.apache.org/r/2579/#comment8154

done



src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java
https://reviews.apache.org/r/2579/#comment8147

I think we pulled getValueAsBytes() from HTD since this was the only use 
case. Agreed that it is a bit awkward, but we figured that since this is really 
the only use case for it, it didn't make sense to keep that function around. 

Adding this back in makes it easier to remove guava, since that ser/deser 
can be kept entirely under the covers. 



src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java
https://reviews.apache.org/r/2579/#comment8148

+1 on using the configuration.

Originally, this was done to ensure that the configuration could be 
completely open to the user, but constraining (no pun intended) the conf by a 
couple values is not a big deal.

However, again we are at the question of keeping it human readable or not. 
See top comment.



src/main/java/org/apache/hadoop/hbase/constraint/IntegerConstraint.java
https://reviews.apache.org/r/2579/#comment8149

Moved to documentation.


- Jesse


On 2011-11-29 20:19:41, Jesse Yates wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2579/
bq.  ---
bq.  
bq.  (Updated 2011-11-29 20:19:41)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Most of the implementation for adding constraints as a coprocessor. 
bq.  
bq.  Looking for general comments on style/structure, though nitpicks are ok too. 
bq.  
bq.  Currently missing implementation for disableConstraints() since that will require adding removeCoprocessor() to HTD (also comments on if this is worth it would be good). 
bq.  
bq.  
bq.  This addresses bug HBASE-4605.
bq.  https://issues.apache.org/jira/browse/HBASE-4605
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/docbkx/book.xml 3c12169 
bq.    src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 84a0d1a 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/BaseConstraint.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/Constraint.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/ConstraintException.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/ConstraintProcessor.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/IntegerConstraint.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/constraint/package-info.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/TestHTableDescriptor.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/AllFailConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/AllPassConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/CheckConfigurationConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/IntegrationTestConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/RuntimeFailConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/TestConstraints.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/TestIntegerConstraint.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/constraint/WorksConstraint.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2579/diff
bq.  
bq.  

[jira] [Commented] (HBASE-4605) Constraints

2011-12-06 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163926#comment-13163926
 ] 

Jesse Yates commented on HBASE-4605:


@Gary:
Guava - I didn't know about the effort to get rid of guava client side; it can 
be pulled if it is really a big issue. I just read through 3264 - is it the 
case that only the client side dependencies are used for the MR stuff? For this 
it works out really nicely for doing the auto-serializing (see changes to the 
CP stuff). We could say that the set bytes for key is at your own risk, and 
make strings the default.

getValueAsBytes() - this was -1'ed in a previous iteration as constraints was 
the only use case for it - wrapping it with the explicit functions seemed a 
decent solution.

Configuration - As far as human readable, how often are people going to need to 
check those values? The HTD already just keeps around bytes, so people don't 
check those over the wire - constraints would remain in the same style.
How would you feel about making that configurable? I'm thinking setting a debug 
flag in Constraints/configuration for that value.

check(Put) - this could definitely be an extension. +1 on future considerations.

IntegerConstraint - In the end, I can let this go. You're right that having it in 
the docs should be enough. Also, combined with the fact that we don't have any 
others by default (and that removing is far harder than adding), I'll drop it 
into just the docs.

@Ted:

IntegrationTestConstraint - based on the recent discussion on dev@ about 
testing, this should be renamed to just TestConstraint and labeled as 
@MediumTest. The rest should be moved to @SmallTest

Atomicity - since per-row modifications are done essentially serially on the 
region, is this really that big of an issue? If not, then another jira isn't 
necessary, but maybe something to consider going forward. Taking the row lock 
is going to be a big slowdown - it just depends on what the common use case is 
going to be.

(This reproduces some of what was commented on RB, but it is nice to have it 
stated succinctly.) 

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
 java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
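
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the HBase API or the classes in the attached patches: PutCheck, RejectedPutException, CheckedTable and IntegerOnlyCheck are hypothetical names, and rows are modelled as plain strings. It only shows the flow described in the quote: every check runs before the write, and a failed check surfaces to the caller as an exception.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical exception carrying the reason a put was rejected. */
class RejectedPutException extends Exception {
  RejectedPutException(String message) {
    super(message);
  }
}

/** Hypothetical minimal constraint interface: throw to reject, return to accept. */
interface PutCheck {
  void check(String rowKey, String value) throws RejectedPutException;
}

/** Hypothetical in-memory table that runs its checks before every write. */
final class CheckedTable {
  private final List<PutCheck> checks = new ArrayList<>();
  private final Map<String, String> rows = new LinkedHashMap<>();

  void addCheck(PutCheck check) {
    checks.add(check);
  }

  /** Writes the row only if every registered check accepts it. */
  void put(String rowKey, String value) throws RejectedPutException {
    for (PutCheck check : checks) {
      check.check(rowKey, value); // an exception here aborts the write
    }
    rows.put(rowKey, value);
  }

  int size() {
    return rows.size();
  }
}

/** Example check that only accepts values that parse as integers. */
final class IntegerOnlyCheck implements PutCheck {
  @Override
  public void check(String rowKey, String value) throws RejectedPutException {
    try {
      Integer.parseInt(value);
    } catch (NumberFormatException e) {
      throw new RejectedPutException("row " + rowKey + ": '" + value + "' is not an integer");
    }
  }
}
```

With an IntegerOnlyCheck registered, put("r1", "42") is stored, while put("r2", "abc") throws RejectedPutException and leaves the table unchanged.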





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163925#comment-13163925
 ] 

stack commented on HBASE-4880:
--

Excellent detective work here lads.

I too would lean toward Rams' original suggestion, that we only online the 
region after the zk update and the edit to .meta. have gone in.



 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch


 OpenRegionHandler in the regionserver is processed in the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 We can see that the region is in service before step 4.
 It means a client could put data to this region after step 3.
 What will happen if step 4 fails?
 It will execute OpenRegionHandler#cleanupFailedOpen, which closes the 
 region, and the master assigns this region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and has 
 received new data. Therefore, it may not be recovered through replayRecoveredEdit() 
 because the edit's LogSeqId is smaller than current region SeqId.
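
One way to picture the reordering suggested in the comments (put the region online only after the .meta. and ZK updates succeed) is the sketch below. It is an illustration under assumptions, not the attached patch: OpenRegionSketch is a hypothetical class, and the .meta. and ZK updates are stubbed as boolean callbacks so that a failure can be simulated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical model of the open sequence with the "online" step moved to the end. */
final class OpenRegionSketch {
  private final List<String> log = new ArrayList<>();
  private boolean online = false;

  /** Runs the steps in order; the region only goes online if every earlier step succeeded. */
  boolean open(BooleanSupplier updateMeta, BooleanSupplier updateZk) {
    log.add("openRegion");
    if (!updateMeta.getAsBoolean()) {
      return cleanupFailedOpen();
    }
    log.add("updateMeta");
    if (!updateZk.getAsBoolean()) {
      return cleanupFailedOpen();
    }
    log.add("updateZk");
    online = true; // clients can reach the region only from this point on
    log.add("addToOnlineRegions");
    return true;
  }

  /** Records the cleanup; the region was never online, so no client writes can be stranded. */
  private boolean cleanupFailedOpen() {
    log.add("cleanupFailedOpen");
    return false;
  }

  boolean isOnline() {
    return online;
  }

  List<String> log() {
    return log;
  }
}
```

In this ordering a failed ZK update leads straight to cleanup without the region ever having accepted writes, which is the window the description is concerned about. Whether other code paths depend on the region being online earlier is not covered by this sketch.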





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163928#comment-13163928
 ] 

stack commented on HBASE-4880:
--

I took a look at the patch.  Yeah, it's a bit messy but nice first cut at the 
problem.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch






[jira] [Updated] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4965:
---

Attachment: ResourceCheckerJUnitRule.java
ResourceChecker.java

 Monitor the open file descriptors and the threads counters during the unit 
 tests
 

 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: ResourceChecker.java, ResourceCheckerJUnitRule.java






[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163937#comment-13163937
 ] 

nkeywal commented on HBASE-4965:


Oops. Sorry.
Here they are.

To use them, we need to add the following lines in the test code:
{noformat}
  @Rule
  public ResourceCheckerJUnitRule cu = new ResourceCheckerJUnitRule();
{noformat}

It's the least intrusive way I found.

Before and after each test method, we count the number of threads and number of 
open file handles, and log them.

Unfortunately, I found a bug in surefire: these lines are not stored when the 
redirect-to-file option is activated. I fixed it locally.

Despite this, I think it makes sense to track these data. It will ease the 
analysis when something goes wrong, even if fixing all the current leaks would 
take quite a lot of time.

 Monitor the open file descriptors and the threads counters during the unit 
 tests
 

 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: ResourceChecker.java, ResourceCheckerJUnitRule.java






[jira] [Commented] (HBASE-4605) Constraints

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163939#comment-13163939
 ] 

Ted Yu commented on HBASE-4605:
---

Renaming IntegrationTestConstraint as described above makes sense.

Thanks Jesse.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
 java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163943#comment-13163943
 ] 

Zhihong Yu commented on HBASE-4880:
---

TestAdmin#testForceSplit would fail if I comment out the following line in 
HRegionServer.postOpenDeployTasks():
{code}
-addToOnlineRegions(r);
+// addToOnlineRegions(r);
{code}
This is the error:
{code}
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: lm-sjn-xx:65442, 
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
  at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
  at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
  at org.apache.hadoop.hbase.client.TestAdmin.splitTest(TestAdmin.java:805)
  at org.apache.hadoop.hbase.client.TestAdmin.testForceSplit(TestAdmin.java:108)
{code}

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch






[jira] [Commented] (HBASE-4605) Constraints

2011-12-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163944#comment-13163944
 ] 

Zhihong Yu commented on HBASE-4605:
---

w.r.t. atomicity, it is outside the initial scope of this JIRA.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
 java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch






[jira] [Commented] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163946#comment-13163946
 ] 

Phabricator commented on HBASE-4908:


stack has commented on the revision [jira] [HBASE-4908] HBase cluster test 
tool (port from 0.89-fb).

  This is looking great Mikhail.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java:28 When would I want 
one of these?
  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java:176 Is 
this from another issue Mikhail?  (No matter if it is... we can take care of it 
on commit).
  src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java:37 Thanks for doing this.
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java:111 Thanks for fixing this.
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:1798 Good
  src/test/java/org/apache/hadoop/hbase/util/IntegrationTestTool.java:39 Will 
this be run as an IntegrationTest because it has the IntegrationTest prefix 
(see the Integration Test section on this page, 
http://hbase.apache.org/book/hbase.tests.html, if you don't have a clue what I'm 
on about)?

  I like the idea of this class.  We need it.  Nice how you subclass tool.
  src/test/java/org/apache/hadoop/hbase/util/LoadTest.java:40 Should this class 
have the IntegrationTest prefix, or do you think this is just a tool and not 
part of IntegrationTests?

  Since it's sitting beside the PerformanceEvaluation tool, should you say 
something on how it differs from it?
  src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java:37 Should 
this be an IntegrationTest?  (We can do the conversion in another issue).

REVISION DETAIL
  https://reviews.facebook.net/D549


 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn 
 load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression 
 GZIP
 I will be using this code to load-test the delta encoding patch and making 
 fixes, but I am submitting the patch for early feedback. I will probably try 
 out its other functionality and comment on how it works.





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163947#comment-13163947
 ] 

stack commented on HBASE-4965:
--

So, you want to add it to the runs of all current tests?  Sounds good to me 
(the classes are missing the license and class comments explaining what they are for).

Do we have to add the @Rule to every test method or just to each Test class?

 Monitor the open file descriptors and the threads counters during the unit 
 tests
 

 Key: HBASE-4965
 URL: https://issues.apache.org/jira/browse/HBASE-4965
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: ResourceChecker.java, ResourceCheckerJUnitRule.java


 We're seeing a lot of issues with hadoop-qa related to threads or file 
 descriptors.
 Monitoring these counters would ease the analysis.
 Note as well that
  - if we want to execute the tests in the same jvm (because the test is small 
 or because we want to share the cluster) we can't afford to leak too many 
 resources
  - if the tests leak, it's more difficult to detect a leak in the software 
 itself.
 I attach the piece of code that I used. It requires two lines in a unit test 
 class to:
 - before every test, count the threads and the open file descriptors
 - after every test, compare with the previous values.
 I ran it on some tests; we have for example:
 - client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 
 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 
 232 threads!
 - client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 
 283 file descriptors (was 282).
 - client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 
 294), 815 file descriptors (was 461)
 - client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 
 148), 310 file descriptors (was 307).
 It's not always leaks, we can expect some pooling effects. But still...
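
The before/after comparison described above could look roughly like the sketch below. It is not the attached ResourceChecker.java; the class name ResourceSnapshot and its methods are hypothetical, and the file descriptor count assumes a Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean (otherwise -1 is recorded).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper (not the attached ResourceChecker): a snapshot of JVM resource counters. */
class ResourceSnapshot {
    final int threads;
    final long openFileDescriptors; // -1 when the platform does not expose the counter

    ResourceSnapshot(int threads, long openFileDescriptors) {
        this.threads = threads;
        this.openFileDescriptors = openFileDescriptors;
    }

    /** Reads the live thread count and, on Unix JVMs, the open file descriptor count. */
    static ResourceSnapshot take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Formats this snapshot against an earlier one and flags counters that grew. */
    String describeAgainst(ResourceSnapshot before) {
        StringBuilder sb = new StringBuilder();
        sb.append(threads).append(" threads (was ").append(before.threads).append("), ")
          .append(openFileDescriptors).append(" file descriptors (was ")
          .append(before.openFileDescriptors).append(").");
        if (threads > before.threads) {
            sb.append(" -thread leak?-");
        }
        if (openFileDescriptors > before.openFileDescriptors) {
            sb.append(" -file handle leak?-");
        }
        return sb.toString();
    }
}
```

A test class would call ResourceSnapshot.take() before and after each test method and log after.describeAgainst(before). A growing counter is only a hint, since pooling can legitimately raise both numbers.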





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163951#comment-13163951
 ] 

stack commented on HBASE-4965:
--

Where does the output show?  In the test output?  Good stuff.





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163953#comment-13163953
 ] 

stack commented on HBASE-4880:
--

@Zhihong Do we still need to fix the above split issue if we refactor so that 
we only put the region online as the last step?

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openRegion() (through it, closed = false, closing = false)
 2. addToOnlineRegions(region)
 3. update the .META. table
 4. update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 So the region is in service before step 4 completes,
 which means a client can put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen runs and closes the region, and the 
 master assigns the region to another regionserver.
 If closing the region also fails, the data put between step 3 and step 4 
 may be lost: the region has been opened on another regionserver and has 
 received new data, so the earlier edits may not be recovered through 
 replayRecoveredEdit() because their LogSeqId is smaller than the current region SeqId.
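
To make the ordering argument concrete, here is a toy model of the two step orders: the one listed above, and the "put region online as last step" variant raised in the comment. The class OpenRegionOrderSketch and its step names are illustrative assumptions, not HBase code; it only records which steps completed before a failing one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model (hypothetical, not HBase code) of the open-region step order. */
class OpenRegionOrderSketch {
    /** Step order as described in the issue. */
    static final List<String> CURRENT_ORDER = Collections.unmodifiableList(Arrays.asList(
            "openRegion", "addToOnlineRegions", "updateMeta", "updateZkOpened"));

    /** Assumed alternative: the region goes online only after every other step succeeded. */
    static final List<String> ONLINE_LAST_ORDER = Collections.unmodifiableList(Arrays.asList(
            "openRegion", "updateMeta", "updateZkOpened", "addToOnlineRegions"));

    /** Returns the steps that completed before the failing step was reached. */
    static List<String> completedBefore(List<String> steps, String failingStep) {
        List<String> done = new ArrayList<>();
        for (String step : steps) {
            if (step.equals(failingStep)) {
                break; // this step fails, nothing after it runs
            }
            done.add(step);
        }
        return done;
    }

    /** True when the region was already online (writable) at the time of the failure. */
    static boolean servedBeforeFailure(List<String> steps, String failingStep) {
        return completedBefore(steps, failingStep).contains("addToOnlineRegions");
    }
}
```

With CURRENT_ORDER, a failure in updateZkOpened happens while the region is already online, which is the window described above. With ONLINE_LAST_ORDER the same failure leaves the region offline, so no client edits can be stranded.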





[jira] [Commented] (HBASE-4960) Document mutual authentication between HBase and Zookeeper using SASL

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163961#comment-13163961
 ] 

stack commented on HBASE-4960:
--

That's some nice doc you made there, Eugene. Committing.

 Document mutual authentication between HBase and Zookeeper using SASL
 -

 Key: HBASE-4960
 URL: https://issues.apache.org/jira/browse/HBASE-4960
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, security
Reporter: Eugene Koontz
Assignee: Eugene Koontz
  Labels: documentation, security
 Attachments: HBASE-4960.patch, HBASE-4960.patch


 Provide documentation for the work done in HBASE-2418 (add support for 
 ZooKeeper authentication).





[jira] [Updated] (HBASE-4960) Document mutual authentication between HBase and Zookeeper using SASL

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4960:
-

   Resolution: Fixed
Fix Version/s: 0.92.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to TRUNK though marked as 0.92.0.  Will copy the book over to the 
branch just before cutting the next 0.92.0 RC.  Thanks Eugene.  Nice doc.

 Document mutual authentication between HBase and Zookeeper using SASL
 -

 Key: HBASE-4960
 URL: https://issues.apache.org/jira/browse/HBASE-4960
 Project: HBase
  Issue Type: Sub-task
  Components: documentation, security
Reporter: Eugene Koontz
Assignee: Eugene Koontz
  Labels: documentation, security
 Fix For: 0.92.0

 Attachments: HBASE-4960.patch, HBASE-4960.patch


 Provide documentation for the work done in HBASE-2418 (add support for 
 ZooKeeper authentication).





[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests

2011-12-06 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163966#comment-13163966
 ] 

nkeywal commented on HBASE-4965:


It's one rule for each test class.
With a fixed surefire, it shows as a standard log in the output. For example:

{noformat}
2011-12-06 15:03:32,982 INFO  [main] hbase.HBaseTestingUtility(518): Starting 
up minicluster with 1 master(s) and 3 regionserver(s) and 3 datanode(s)
2011-12-06 15:03:34,052 WARN  [main] impl.MetricsSystemImpl(137): Metrics 
system not started: Cannot locate configuration: tried 
hadoop-metrics2-namenode.properties, hadoop-metrics2.properties
[...]
2011-12-06 15:03:41,587 DEBUG [main] client.HTable$ClientScanner(1183): 
Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', 
ENCODED => 1028785192,}
2011-12-06 15:03:41,588 INFO  [main] hbase.HBaseTestingUtility(561): 
Minicluster is up
2011-12-06 15:03:41,588 INFO  [main] 
client.HConnectionManager$HConnectionImplementation(1805): Closed zookeeper 
sessionid=0x134159e31930008
2011-12-06 15:03:41,661 INFO  [main] hbase.ResourceChecker(117): before 
org.apache.hadoop.hbase.client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable:
 247 threads, 417 file descriptors 
[...]
2011-12-06 15:03:43,282 INFO  [main] 
client.HConnectionManager$HConnectionImplementation(1805): Closed zookeeper 
sessionid=0x134159e31930009
2011-12-06 15:03:43,313 INFO  [main] hbase.ResourceChecker(117): after 
org.apache.hadoop.hbase.client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable:
 265 threads (was 247), 450 file descriptors (was 417).  -thread leak?-  -file 
handle leak?- 
[...]
{noformat}

If you're ok with the idea, I will professionalize the code a little and 
propose it as a patch.








[jira] [Updated] (HBASE-4937) Error in Quick Start Shell Exercises

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4937:
-

Attachment: 4937.txt

Patch to fix Ryan's observation.

 Error in Quick Start Shell Exercises
 

 Key: HBASE-4937
 URL: https://issues.apache.org/jira/browse/HBASE-4937
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Ryan Berdeen
 Attachments: 4937.txt


 The shell exercises in the Quick Start 
 (http://hbase.apache.org/book/quickstart.html) start
 {code}
 hbase(main):003:0> create 'test', 'cf'
 0 row(s) in 1.2200 seconds
 hbase(main):003:0> list 'table'
 test
 1 row(s) in 0.0550 seconds
 {code}
 It looks like the second command is wrong. Running it, the actual output is
 {code}
 hbase(main):001:0> create 'test', 'cf'
 0 row(s) in 0.3630 seconds
 hbase(main):002:0> list 'table'
 TABLE
 0 row(s) in 0.0100 seconds
 {code}
 The argument to list should be 'test', not 'table', and the output in the 
 example is missing the {{TABLE}} line.





[jira] [Resolved] (HBASE-4937) Error in Quick Start Shell Exercises

2011-12-06 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4937.
--

   Resolution: Fixed
Fix Version/s: 0.94.0
 Assignee: stack

Committed to trunk.  Thanks for the doc bug report Ryan.

 Error in Quick Start Shell Exercises
 

 Key: HBASE-4937
 URL: https://issues.apache.org/jira/browse/HBASE-4937
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Ryan Berdeen
Assignee: stack
 Fix For: 0.94.0

 Attachments: 4937.txt






[jira] [Updated] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4936:
-

   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks Andrei.

 Cached HRegionInterface connections crash when getting UnknownHost exceptions
 -

 Key: HBASE-4936
 URL: https://issues.apache.org/jira/browse/HBASE-4936
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.94.0

 Attachments: HBASE-4936-v2.patch, HBASE-4936.patch


 This issue is unlikely to come up in a cluster test case. However, during 
 development, the following happens: 
 1. Start the HBase cluster locally, on network A (DNS A, etc)
 2. The region locations are cached using the hostname 
 (mycomputer.company.com, 211.x.y.z - real ip)
 3. Change network location (go home)
 4. Start the HBase cluster locally. My hostname / IPs are now different 
 (mycomputer, 192.168.0.130 - new ip)
 If the region locations have been cached using the hostname, there is an 
 UnknownHostException in CatalogTracker.getCachedConnection(ServerName sn) 
 that none of the catch statements handles. The server will crash constantly. 
 The error should be caught and not rethrown, so that the cached connection 
 expires normally.
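
The proposed behaviour (catch the exception, report "no connection", let the cache entry expire) can be sketched as follows. This is not the committed patch: CachedConnectionSketch and its Connector interface are hypothetical stand-ins for whatever CatalogTracker uses to open a connection, and returning null as the "no connection" signal is an assumption.

```java
import java.net.UnknownHostException;

/** Hypothetical sketch of swallowing UnknownHostException for a stale cached location. */
class CachedConnectionSketch {
    /** Stand-in for the code that opens a connection to a cached server location. */
    interface Connector<T> {
        T connect(String hostname) throws UnknownHostException;
    }

    /** Returns the connection, or null when the cached hostname no longer resolves. */
    static <T> T getCachedConnection(Connector<T> connector, String hostname) {
        try {
            return connector.connect(hostname);
        } catch (UnknownHostException e) {
            // Stale cached location: do not rethrow, so the caller can treat the
            // entry as unavailable and let it expire instead of crashing.
            return null;
        }
    }
}
```

A caller that receives null would treat the location as unavailable and look it up again, rather than propagating the exception up to the server's main loop.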





[jira] [Commented] (HBASE-4951) master process can not be stopped when it is initializing

2011-12-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163980#comment-13163980
 ] 

stack commented on HBASE-4951:
--

I believe this is fixed in 0.92.  Won't close till I prove it.

 master process can not be stopped when it is initializing
 -

 Key: HBASE-4951
 URL: https://issues.apache.org/jira/browse/HBASE-4951
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: xufeng
Priority: Critical
 Fix For: 0.92.0, 0.90.5


 It is easy to reproduce with the following steps:
 step 1: start the master process (do not start any regionserver process in the cluster).
 The master will wait for the regionservers to check in:
 org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to 
 checkin
 step 2: stop the master with the shell command bin/hbase master stop
 result: the master process never dies, because the catalogTracker.waitForRoot() 
 method blocks until the root region is assigned.





[jira] [Resolved] (HBASE-4954) IllegalArgumentException in hfile2 blockseek

2011-12-06 Thread Roman Shaposhnik (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik resolved HBASE-4954.
-

Resolution: Won't Fix

 IllegalArgumentException in hfile2 blockseek
 

 Key: HBASE-4954
 URL: https://issues.apache.org/jira/browse/HBASE-4954
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical
 Fix For: 0.92.0


 On Tue, Nov 29, 2011 at 10:20 PM, Stack st...@duboce.net wrote:
  The first hbase 0.92.0 release candidate is available for download:
 
   http://people.apache.org/~stack/hbase-0.92.0-candidate-0/
 Here's another persistent issue that I'd appreciate somebody taking
 a quick look at:
 
 http://bigtop01.cloudera.org:8080/view/Hadoop%200.22/job/Bigtop-hadoop22-smoketest/28/testReport/org.apache.bigtop.itest.hbase.smoke/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/
 Caused by: java.lang.IllegalArgumentException
     at java.nio.Buffer.position(Buffer.java:218)
     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:632)
     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:545)
     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:503)
     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:511)
     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:475)
     at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:157)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.copyHFileHalf(LoadIncrementalHFiles.java:544)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:516)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:377)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:441)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:325)
     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)





[jira] [Commented] (HBASE-4712) Document rules for writing tests

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164013#comment-13164013
 ] 

Hudson commented on HBASE-4712:
---

Integrated in HBase-TRUNK #2522 (See 
[https://builds.apache.org/job/HBase-TRUNK/2522/])
HBASE-4712 Document rules for writing tests

stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml


 Document rules for writing tests
 

 Key: HBASE-4712
 URL: https://issues.apache.org/jira/browse/HBASE-4712
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.92.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4712.txt


 We saw that some tests could be improved. Documenting the general rules could 
 help.
 Proposal:
 HBase tests are divided into three categories: small, medium and large, with 
 corresponding JUnit categories: SmallTest, MediumTest, LargeTest.
 Small tests are executed in parallel in a shared JVM. They must last less 
 than 15 seconds. They must NOT use a cluster.
 Medium tests are executed in a separate JVM. They must last less than 50 
 seconds. They can use a cluster. They must not fail occasionally.
 Small and medium tests must not need more than 30 minutes to run altogether.
 Small and medium tests should be executed by the developers before submitting 
 a patch.
 Large tests are everything else. They are typically integration tests, 
 non-regression tests for specific bugs, timeout tests, performance tests.
 Test rules & hints are:
 - As much as possible, tests should be written as small tests.
 - All tests should be written to support parallel execution on the same 
 machine, hence should not use shared resources such as fixed ports or fixed 
 file names.
 - All tests should be written to be as fast as possible.
 - Tests should not overlog. More than 100 lines/second makes the logs complex 
 to read and uses I/O that is then not available to the other tests.
 - Tests can be written with HBaseTestingUtility. This class offers helper 
 functions to create a temp directory and do the cleanup, or to start a cluster.
 - Sleeps:
   - Tests should not do a 'Thread.sleep' without testing an ending 
   condition. This allows understanding what the test is waiting for. Moreover, 
   the test will work whatever the machine performance.
   - Sleeps should be minimal to be as fast as possible. Waiting for a 
   variable should be done in a 40 ms sleep loop. Waiting for a socket operation 
   should be done in a 200 ms sleep loop.
 - Tests using a cluster:
   - Tests using an HRegion do not have to start a cluster: a region can use 
   the local file system.
   - Starting/stopping a cluster costs around 10 seconds. Clusters should not be 
   started per test method but per class.
   - A started cluster must be shut down using 
   HBaseTestingUtility#shutdownMiniCluster, which cleans the directories.
   - As much as possible, tests should use the default settings for the 
   cluster. When they don't, they should document it. This will allow sharing 
   the cluster later.
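
The sleep rule above (never sleep without testing an ending condition, poll in a short loop) might be implemented with a helper along these lines. WaitSketch and waitFor are hypothetical names, not part of HBaseTestingUtility, and the timeout parameter is an added assumption so that a broken test fails instead of hanging.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: poll an ending condition instead of sleeping blindly. */
class WaitSketch {
    /**
     * Polls the condition every intervalMillis until it holds or timeoutMillis elapses.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean waitFor(long timeoutMillis, long intervalMillis, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the ending condition never became true
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A test waiting on a variable would call something like WaitSketch.waitFor(10000, 40, () -> flag.get()), and one waiting on a socket operation would pass 200 as the interval, matching the numbers given in the rules.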





[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164015#comment-13164015
 ] 

Hudson commented on HBASE-4927:
---

Integrated in HBase-TRUNK #2522 (See 
[https://builds.apache.org/job/HBase-TRUNK/2522/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as 
expected, for the last region when the endkey is empty

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 Jon found this issue when reviewing the HBASE-4238 backport.
 What happens if the split points are as follows (an empty end key is the last 
 key, an empty start key is the first key)?
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this incorrect 
 order:
 [A,B), [A,), [B,) 
 We wanted:
 [A,), [A,B), [B,)
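
The wanted order can be reproduced with a comparator that treats an empty end key as "greater than any other end key". The sketch below is an illustration using plain strings for keys; SplitParentFirstSketch and Range are hypothetical names, and this is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: sort split parents ahead of their daughters. */
class SplitParentFirstSketch {
    /** A key range [start, end); an empty end means "up to the end of the table". */
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Orders by start key, then puts the wider range (the parent) first. */
    static final Comparator<Range> PARENT_FIRST = (l, r) -> {
        int byStart = l.start.compareTo(r.start);
        if (byStart != 0) {
            return byStart;
        }
        boolean lOpen = l.end.isEmpty();
        boolean rOpen = r.end.isEmpty();
        if (lOpen || rOpen) {
            // An empty end key is the last key, so that range is the widest one.
            return lOpen == rOpen ? 0 : (lOpen ? -1 : 1);
        }
        return r.end.compareTo(l.end); // larger end key sorts first
    };

    /** Returns the ranges sorted with PARENT_FIRST, rendered as strings. */
    static List<String> sorted(Range... ranges) {
        List<Range> list = new ArrayList<>(Arrays.asList(ranges));
        list.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (Range range : list) {
            out.add(range.toString());
        }
        return out;
    }
}
```

Comparing the end keys as plain strings is what produces the incorrect order in the report, because the empty string sorts before "B". The special case for the empty end key is the whole point of the sketch.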





[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164014#comment-13164014
 ] 

Hudson commented on HBASE-4729:
---

Integrated in HBase-TRUNK #2522 (See 
[https://builds.apache.org/job/HBase-TRUNK/2522/])
HBASE-4729 Clash between region unassign and splitting kills the master

stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
     at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
     at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.





[jira] [Commented] (HBASE-4844) Coprocessor hooks for log rolling

2011-12-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164022#comment-13164022
 ] 

Lars Hofhansl commented on HBASE-4844:
--

bq. The logroller could signal new log to copy?

Right, and it could trigger a coprocessor hook to do the actual work of 
archiving. The coprocessor would get the path to the old file and then copy it 
somewhere else.
Looking at the code, there are races, though. Until HLog.writer is set to 
the new writer, all writes still go to the old file. So if the 
coprocessor post hook runs before that and makes a copy of the file, some edits 
might be missed (those that go into the old file after it was copied, but before 
the writer was switched over).
Wouldn't it be nice if we had hardlinks in HDFS? :)

So I think the coprocessor post hook should be called after the HLog.writer 
assignment. If it did the copy synchronously it only needs to finish before the 
next log for the same regionserver is rolled (still a race, though).

I'll attach a very simple patch tonight or tomorrow morning and then folks can 
poke holes in it.
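
The ordering argued for above can be sketched in isolation. This is only an illustration, not the patch: LogRollObserver, RollingLogSketch and the in-memory "files" are hypothetical stand-ins, and the real coprocessor hook signature may look quite different. The point it shows is that the post hook fires only after the writer has been switched, so the old file can no longer receive edits while it is being copied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical observer interface; not the actual coprocessor API. */
interface LogRollObserver {
    void postLogRoll(String oldPath, String newPath);
}

/** In-memory stand-in for a write-ahead log with a swappable writer. */
class RollingLogSketch {
    private final Object updateLock = new Object();
    private final List<LogRollObserver> observers = new ArrayList<LogRollObserver>();
    private final Map<String, String> closedFiles = new HashMap<String, String>();
    private StringBuilder writer = new StringBuilder(); // plays the role of HLog.writer
    private String currentPath = "log.0";
    private int rollCount = 0;

    void register(LogRollObserver observer) {
        observers.add(observer);
    }

    void append(String edit) {
        synchronized (updateLock) {
            writer.append(edit).append('\n');
        }
    }

    void rollWriter() {
        String oldPath;
        String newPath;
        synchronized (updateLock) {
            oldPath = currentPath;
            closedFiles.put(oldPath, writer.toString());
            rollCount++;
            newPath = "log." + rollCount;
            currentPath = newPath;
            writer = new StringBuilder(); // the switch: later edits go to the new file
        }
        // Fire the post hook only after the switch, so the old file is final.
        for (LogRollObserver observer : observers) {
            observer.postLogRoll(oldPath, newPath);
        }
    }

    String contentsOf(String closedPath) {
        synchronized (updateLock) {
            return closedFiles.get(closedPath);
        }
    }
}
```

An observer registered here sees the complete contents of the old file; an edit appended after rollWriter() returns lands in the new file only.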


 Coprocessor hooks for log rolling
 -

 Key: HBASE-4844
 URL: https://issues.apache.org/jira/browse/HBASE-4844
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Priority: Minor

 In order to eventually do point in time recovery we need a way to reliably 
 back up the logs. Rather than adding some hard coded changes, we can provide 
 coprocessor hooks and folks can implement their own policies.





[jira] [Updated] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.

2011-12-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4968:
-

Attachment: client.oome.txt

 Add to troubleshooting workaround for direct buffer oome's.
 ---

 Key: HBASE-4968
 URL: https://issues.apache.org/jira/browse/HBASE-4968
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: client.oome.txt


 Put into the book the workaround arrived at on the list while discussing the client OOME.





[jira] [Created] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.

2011-12-06 Thread stack (Created) (JIRA)
Add to troubleshooting workaround for direct buffer oome's.
---

 Key: HBASE-4968
 URL: https://issues.apache.org/jira/browse/HBASE-4968
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: client.oome.txt

Put into the book the workaround arrived at on the list while discussing the client OOME.





[jira] [Resolved] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.

2011-12-06 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4968.
--

   Resolution: Fixed
Fix Version/s: 0.94.0
 Assignee: stack

Committed to TRUNK.

 Add to troubleshooting workaround for direct buffer oome's.
 ---

 Key: HBASE-4968
 URL: https://issues.apache.org/jira/browse/HBASE-4968
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: client.oome.txt


 Put into the book the workaround arrived at on the list while discussing the client OOME.





[jira] [Created] (HBASE-4969) tautology in HRegionInfo.readFields

2011-12-06 Thread Prakash Khemani (Created) (JIRA)
tautology in HRegionInfo.readFields
---

 Key: HBASE-4969
 URL: https://issues.apache.org/jira/browse/HBASE-4969
 Project: HBase
  Issue Type: Bug
Reporter: Prakash Khemani
Assignee: Prakash Khemani


In HRegionInfo.readFields(), the following looks wrong to me:

} else if (getVersion() == VERSION) {

It is always true.

Should it have been

} else if (getVersion() == version) {

where version is the local variable that stores the deserialized version?
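
The effect of the comparison can be reproduced in isolation. The class below is a hypothetical stand-in, not HRegionInfo itself: it assumes getVersion() simply returns the VERSION constant and that version 0 is a legacy format, and it returns a label for whichever branch handled the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a versioned, deserializable record. */
class VersionCheckSketch {
    static final byte VERSION = 1; // the version this class writes

    byte getVersion() {
        return VERSION; // assumption: always reports the current version
    }

    /** The check as quoted in the report. */
    String readFieldsAsWritten(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version == 0) {
            return "legacy";
        } else if (getVersion() == VERSION) { // compares the constant with itself
            return "current";
        } else {
            return "unknown"; // never reached
        }
    }

    /** The check as proposed in the report. */
    String readFieldsAsProposed(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version == 0) {
            return "legacy";
        } else if (getVersion() == version) { // compares with the deserialized value
            return "current";
        } else {
            return "unknown";
        }
    }

    /** Feeds a single version byte through one of the two variants. */
    static String branchFor(int wireVersion, boolean proposed) {
        DataInput in = new DataInputStream(
            new ByteArrayInputStream(new byte[] {(byte) wireVersion}));
        VersionCheckSketch record = new VersionCheckSketch();
        try {
            return proposed ? record.readFieldsAsProposed(in) : record.readFieldsAsWritten(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With an unexpected version byte such as 7, the check as written still takes the "current" branch, while the proposed check falls through to "unknown".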

(I am struggling with another issue: after applying some patches, including 
HBASE-4388 "Second start after migration from 90 to trunk crashes", my 
version of hbase-92 HRegionInfo.readFields() tries to find HTD in HRegionInfo 
and fails.)








[jira] [Commented] (HBASE-4940) hadoop-metrics.properties can include configuration of the rest context for ganglia

2011-12-06 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164047#comment-13164047
 ] 

Mubarak Seyed commented on HBASE-4940:
--

Waiting for corporate approval to contribute this patch. Thanks.

 hadoop-metrics.properties can include configuration of the rest context for 
 ganglia
 -

 Key: HBASE-4940
 URL: https://issues.apache.org/jira/browse/HBASE-4940
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.90.5
 Environment: HBase-0.90.1
Reporter: Mubarak Seyed
Priority: Minor
  Labels: hbase-rest
 Fix For: 0.90.5


 The hadoop-metrics.properties file appears to be missing configuration for 
 the rest context. It would be good to add the rest context, commented out, 
 so that anyone who runs rest-server and wants to monitor it through ganglia 
 can uncomment it.
 {code}
 # Configuration of the rest context for ganglia
 #rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
 #rest.period=10
 #rest.servers=ganglia-metad-hostname:port
 {code}
 Working on the patch, will submit it.





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2011-12-06 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164046#comment-13164046
 ] 

Mubarak Seyed commented on HBASE-4720:
--

Waiting for corporate approval to contribute this patch. Thanks.

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord

 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular hbase client but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.





[jira] [Commented] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2011-12-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164050#comment-13164050
 ] 

ramkrishna.s.vasudevan commented on HBASE-4957:
---

+1 on patch

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0

 Attachments: hbase-4957.txt, hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.





[jira] [Commented] (HBASE-4951) master process can not be stopped when it is initializing

2011-12-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164072#comment-13164072
 ] 

ramkrishna.s.vasudevan commented on HBASE-4951:
---

The master does stop in 0.92. I tried it. 
Correct me if I am wrong.

If that is OK, can we close this issue?

 master process can not be stopped when it is initializing
 -

 Key: HBASE-4951
 URL: https://issues.apache.org/jira/browse/HBASE-4951
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: xufeng
Priority: Critical
 Fix For: 0.92.0, 0.90.5


 It is easy to reproduce with the following steps:
 step1: start the master process (do not start any regionserver process in the cluster).
 The master will wait for the regionservers to check in:
 org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to 
 checkin
 step2: stop the master with the sh command bin/hbase master stop
 result: the master process never dies, because the catalogTracker.waitForRoot() 
 method blocks until the root region is assigned.
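
One common way to keep such a wait from blocking shutdown is to poll and check a stop flag on every pass. The sketch below only illustrates that idea and is not the fix for this issue: StoppableWaitSketch, its flags, and this waitForRoot(long) signature are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a wait that gives up once a stop is requested. */
class StoppableWaitSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final AtomicBoolean rootAssigned = new AtomicBoolean(false);

    void stop() {
        stopped.set(true);
    }

    void markRootAssigned() {
        rootAssigned.set(true);
    }

    /**
     * Returns true once the root region is assigned, or false if a stop
     * was requested (or the thread was interrupted) before that happened.
     */
    boolean waitForRoot(long pollMillis) {
        while (!rootAssigned.get()) {
            if (stopped.get()) {
                return false; // stop requested: do not block shutdown
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }
        return true;
    }
}
```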





[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4880:


Attachment: hbase-4880v2.patch

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes, 
 which means a client can put data into this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the 
 region, and the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. The earlier edits may then not be recovered through 
 replayRecoveredEdit(), because their LogSeqId is smaller than the region's 
 current SeqId.
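
The window described above closes if the region is published to clients only after the ZK transition has succeeded. The sketch below shows one such ordering under stated assumptions; it is not necessarily what the attached patches do, and OpenRegionOrderSketch, its step names, and the boolean that simulates the ZK outcome are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the open sequence with the publish step moved last. */
class OpenRegionOrderSketch {
    final Set<String> onlineRegions = new HashSet<String>();
    final List<String> steps = new ArrayList<String>();

    /** Returns true if the region ended up online on this server. */
    boolean open(String region, boolean zkTransitionSucceeds) {
        steps.add("openRegion");
        steps.add("updateMeta");
        if (!zkTransitionSucceeds) {
            // The region was never visible to clients, so no puts were accepted.
            steps.add("cleanupFailedOpen");
            return false;
        }
        steps.add("transitionZkToOpened");
        onlineRegions.add(region); // publish only after ZK reflects the open
        steps.add("addToOnlineRegions");
        return true;
    }
}
```

With this ordering a failed transition leaves onlineRegions untouched, so there are no client edits to lose when the master reassigns the region.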





[jira] [Commented] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164079#comment-13164079
 ] 

Phabricator commented on HBASE-4908:


mbautin has commented on the revision [jira] [HBASE-4908] HBase cluster test 
tool (port from 0.89-fb).

  Thanks for reviews, Nicolas and Stack! See responses below. A new version of 
the code will follow. Still have to re-run the unit tests (I've been having 
some trouble with those recently -- http://pastebin.com/1G1ZcPeV) and 
sanity-check the command-line load tester. On a side note, here are some stats 
from my recent load test run on a 5-node cluster:

  11/12/06 18:30:25 INFO util.MultiThreadedAction: [W:21] Keys=17180542, 
cols=819.4m, time=27:49:20 Overall: [keys/s= 171, latency=116 ms] Current: 
[keys/s=219, latency=90 ms], insertedUpTo=17180495, insertedQSize=26
  11/12/06 18:30:25 INFO util.MultiThreadedAction: [R:10] Keys=250690067, 
cols=11.8g, time=27:49:20 Overall: [keys/s= 2502, latency=3 ms] Current: 
[keys/s=261, latency=38 ms], verified=250690067

  (The number of writer threads is reported as 21 because there is an 
inserted-keys tracker thread that keeps track of the most recent contiguous 
key written by all writers.)
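
As a sanity check, the "Overall" rates in the two log lines above follow from the key counts and the elapsed time. The helper below is a hypothetical illustration, not part of the load tester, and it assumes the reported rate is keys divided by elapsed seconds, truncated to an integer.

```java
/** Hypothetical helper that recomputes overall throughput from the log line. */
class LoadTestRates {
    /** Converts an elapsed time of the form H:MM:SS into seconds. */
    static long toSeconds(String hms) {
        String[] parts = hms.split(":");
        long hours = Long.parseLong(parts[0]);
        long minutes = Long.parseLong(parts[1]);
        long seconds = Long.parseLong(parts[2]);
        return hours * 3600 + minutes * 60 + seconds;
    }

    /** Keys per second, truncated (integer division). */
    static long keysPerSecond(long keys, String elapsed) {
        return keys / toSeconds(elapsed);
    }
}
```

27:49:20 is 100160 seconds, so 17180542 written keys give 171 keys/s and 250690067 read keys give 2502 keys/s, matching the reported figures.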

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/util/IntegrationTestTool.java:129 
Changed this to 80. There is no standard way to get terminal width in Java. 
http://stackoverflow.com/questions/1286461/can-i-find-the-console-width-with-java

  src/test/java/org/apache/hadoop/hbase/util/LoadTest.java:40 This is the 
command-line part of LoadTest, so it should not run as part of the test suite. 
The same multithreaded writer/verifier code is reused in a couple of large 
unit tests, namely TestMiniClusterLoad{Parallel,Sequential}.

  I added a line about the differences between this load tester and 
PerformanceEvaluation.

  Renamed this to LoadTestTool.
  src/test/java/org/apache/hadoop/hbase/util/LoadTest.java:318 We first wait 
for all writers to finish and then wait for all readers to finish, so we should 
exit when all of those threads stop.
  src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java:37 Yes, this 
can be converted to an IntegrationTest. I think making this a unit test was 
proposed in the past but the controversy was that it spawns a bunch of child 
processes and extra care should be taken to shut down all of them. We can leave 
this as a command-line tool for now and convert to a unit test when it is more 
stable.
  src/main/java/org/apache/hadoop/hbase/util/Bytes.java:1658-1680 Removed these 
methods and reused the existing RegionSplitter.HexStringSplit class.
  src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java:28 I moved this 
from the test jar to the main jar because of the waitForBaseZNode method that I 
added to ZKUtil. We need this kind of watcher for quick ZK checks when we don't 
want to use a callback. Passing a null instead of a watcher does not work.
  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java:176 
Not sure what changes you are talking about. I changed a few conf options that 
I needed so that they use constants in HConstants.
  src/main/java/org/apache/hadoop/hbase/util/Keying.java:41-43 Removed this.
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:1820-1821 
Done. Thanks!
  src/test/java/org/apache/hadoop/hbase/util/IntegrationTestTool.java:39 
Renamed this to AbstractHBaseTool and moved to the main jar, since this is 
general-purpose command line processing functionality. This also avoids the 
confusion with the IntegrationTest notation (thanks for the link!)  I think we 
will be creating a lot of integration tests very soon.

REVISION DETAIL
  https://reviews.facebook.net/D549


 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn 
 load_test -read 

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164083#comment-13164083
 ] 

chunhui shen commented on HBASE-4880:
-

@all
What about patch v2?
I removed {code}
addToOnlineRegions(r);
{code} from HRegionServer#postOpenDeployTasks.
This means that step 2 and step 3 are now independent. 
After calling postOpenDeployTasks, we should add the region to the online 
regions promptly, within the transaction.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1.openregion()(Through it, closed = false, closing = false)
 2.addToOnlineRegions(region)
 3.update .meta. table 
 4.update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4 completes, 
 which means a client can put data into this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the 
 region, and the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 
 may be lost, because the region has been opened on another regionserver and 
 has received new data. The earlier edits may then not be recovered through 
 replayRecoveredEdit(), because their LogSeqId is smaller than the region's 
 current SeqId.





[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4908:
---

Attachment: D549.8.patch

mbautin updated the revision [jira] [HBASE-4908] HBase cluster test tool (port 
from 0.89-fb).
Reviewers: stack, Karthik, Kannan, nspiegelberg, JIRA

  Addressing Stack's and Nicolas's comments. Still need to re-run the unit 
tests.

REVISION DETAIL
  https://reviews.facebook.net/D549

AFFECTED FILES
  pom.xml
  src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java
  src/test/java/org/apache/hadoop/hbase/EmptyWatcher.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
  src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
  src/main/java/org/apache/hadoop/hbase/util/AbstractHBaseTool.java
  src/main/java/org/apache/hadoop/hbase/util/Bytes.java
  src/main/java/org/apache/hadoop/hbase/util/HMerge.java
  src/main/java/org/apache/hadoop/hbase/util/Keying.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionSplitPolicy.java
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
  src/test/java/org/apache/hadoop/hbase/util/LoadTestKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
  src/test/java/org/apache/hadoop/hbase/util/MultiThreadedAction.java
  src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
  src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
  src/test/java/org/apache/hadoop/hbase/util/ProcessBasedLocalHBaseCluster.java
  src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
  src/test/java/org/apache/hadoop/hbase/util/TestBytes.java
  src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
  src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
  src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java


 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch, D549.8.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn 
 load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression 
 GZIP
 I will be using this code to load-test the delta encoding patch and making 
 fixes, but I am submitting the patch for early feedback. I will probably try 
 out its other functionality and comment on how it works.





[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164085#comment-13164085
 ] 

Hudson commented on HBASE-4729:
---

Integrated in HBase-0.92 #173 (See 
[https://builds.apache.org/job/HBase-0.92/173/])
HBASE-4729 Clash between region unassign and splitting kills the master

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java


 Clash between region unassign and splitting kills the master
 

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
 Fix For: 0.92.0, 0.94.0

 Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt


 I was running an online alter while regions were splitting, and suddenly the 
 master died and left my table half-altered (haven't restarted the master yet).
 What killed the master:
 {quote}
 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unexpected ZK exception creating node CLOSING
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
 at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 {quote}
 A znode was created because the region server was splitting the region 4 
 seconds before:
 {quote}
 2011-11-02 17:06:40,704 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
 region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
 2011-11-02 17:06:40,704 DEBUG 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: 
 regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
 f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Attempting to transition node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLITTING
 ...
 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
 f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
 RS_ZK_REGION_SPLIT
 2011-11-02 17:06:44,061 INFO 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
 master to process the split for f7e1783e65ea8d621a4bc96ad310f101
 {quote}
 Now that the master is dead the region server is spewing those last two lines 
 like mad.





[jira] [Commented] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164087#comment-13164087
 ] 

Hadoop QA commented on HBASE-4908:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506381/D549.8.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 52 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/455//console

This message is automatically generated.

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, D549.1.patch, 
 D549.2.patch, D549.3.patch, D549.4.patch, D549.5.patch, D549.6.patch, 
 D549.7.patch, D549.8.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn 
 load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression 
 GZIP
 I will be using this code to load-test the delta encoding patch and making 
 fixes, but I am submitting the patch for early feedback. I will probably try 
 out its other functionality and comment on how it works.





[jira] [Created] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-06 Thread gaojinchao (Created) (JIRA)
Add a parameter  to change keepAliveTime of Htable thread pool.
---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial


In my cluster, I changed keepAliveTime from 60 s to 3600 s.  Increasing RES is 
slowed down.

Why increasing keepAliveTime of HBase thread pool is slowing down our problem 
occurance [RES value increase]?

You can go through the source of sun.nio.ch.Util. Every thread hold 3 
softreference of direct buffer(mustangsrc) for reusage. The code names the 3 
softreferences buffercache. If the buffer was all occupied or none was suitable 
in size, and new request comes, new direct buffer is allocated. After the 
service, the bigger one replaces the smaller one in buffercache. The replaced 
buffer is released.

So I think we can add a parameter to change keepAliveTime of Htable thread pool.
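
A minimal sketch of what such a parameter could look like, using only java.util.concurrent. The property name hbase.htable.threads.keepalivetime and the use of java.util.Properties in place of the HBase Configuration are assumptions made for illustration, not existing HBase settings. Presumably a longer keep-alive lets pool threads (and their per-thread buffer caches) live longer instead of being recreated, which would be consistent with the slower RES growth reported above.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAlivePoolSketch {
  // Hypothetical property name, chosen for this sketch only.
  static final String KEEP_ALIVE_KEY = "hbase.htable.threads.keepalivetime";
  // Falls back to the 60 s value mentioned in the issue.
  static final long DEFAULT_KEEP_ALIVE_SECONDS = 60L;

  /** Builds a pool whose idle-thread timeout is read from the given properties. */
  static ThreadPoolExecutor createPool(Properties conf, int maxThreads) {
    long keepAliveSeconds = Long.parseLong(
        conf.getProperty(KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
    // Let the core thread time out as well, so an idle pool holds no threads.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(KEEP_ALIVE_KEY, "3600");
    ThreadPoolExecutor pool = createPool(conf, 10);
    System.out.println(pool.getKeepAliveTime(TimeUnit.SECONDS));
    pool.shutdown();
  }
}
```

Note that allowCoreThreadTimeOut(true) rejects a keep-alive of zero, so a real implementation would need to validate the configured value.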





[jira] [Updated] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-06 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4970:
--

Affects Version/s: 0.90.4
Fix Version/s: 0.90.5

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5






[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4908:
--

Status: Open  (was: Patch Available)

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, 
 0002-HBase-cluster-test-tool.patch, D549.1.patch, D549.2.patch, D549.3.patch, 
 D549.4.patch, D549.5.patch, D549.6.patch, D549.7.patch, D549.8.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression GZIP
 I will be using this code to load-test the delta encoding patch and making 
 fixes, but I am submitting the patch for early feedback. I will probably try 
 out its other functionality and comment on how it works.





[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4908:
--

Status: Patch Available  (was: Open)

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, 
 0002-HBase-cluster-test-tool.patch, D549.1.patch, D549.2.patch, D549.3.patch, 
 D549.4.patch, D549.5.patch, D549.6.patch, D549.7.patch, D549.8.patch






[jira] [Updated] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4908:
--

Attachment: 0002-HBase-cluster-test-tool.patch

Uploading a patch for Jenkins testing.

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, 
 0002-HBase-cluster-test-tool.patch, D549.1.patch, D549.2.patch, D549.3.patch, 
 D549.4.patch, D549.5.patch, D549.6.patch, D549.7.patch, D549.8.patch






[jira] [Commented] (HBASE-4893) HConnectionImplementation closed-but-not-deleted, need a way to find the state of connection

2011-12-06 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164090#comment-13164090
 ] 

Mubarak Seyed commented on HBASE-4893:
--

HConnectionManager.deleteStaleConnection() is part of 0.92, but I would like to 
fix this in 0.90.5. Can we back-port deleteStaleConnection() to 0.90.5?
If I use HConnectionManager.deleteConnection(connection, true) in 0.90.5, it 
will not clean up the stale connection unless connection.isZeroReference() 
== true, which is not the case because getConnection(Conf) increments the count:

{code}
  public static HConnection getConnection(Configuration conf)
      throws ZooKeeperConnectionException {
    HConnectionKey connectionKey = new HConnectionKey(conf);
    synchronized (HBASE_INSTANCES) {
      ..
      connection.incCount();
      return connection;
    }
  }
{code}

We need the following methods in 0.90.5 

{code}
  public static void deleteStaleConnection(HConnection connection) {
    deleteConnection(connection, true, true);
  }

  private static void deleteConnection(HConnection connection, boolean stopProxy,
      boolean staleConnection) {
    synchronized (HBASE_INSTANCES) {
      for (Entry<HConnectionKey, HConnectionImplementation> connectionEntry :
          HBASE_INSTANCES.entrySet()) {
        if (connectionEntry.getValue() == connection) {
          deleteConnection(connectionEntry.getKey(), stopProxy, staleConnection);
          break;
        }
      }
    }
  }

  private static void deleteConnection(HConnectionKey connectionKey,
      boolean stopProxy, boolean staleConnection) {
    synchronized (HBASE_INSTANCES) {
      HConnectionImplementation connection = HBASE_INSTANCES.get(connectionKey);
      if (connection != null) {
        connection.decCount();
        // A stale connection is removed even if other references remain.
        if (connection.isZeroReference() || staleConnection) {
          HBASE_INSTANCES.remove(connectionKey);
          connection.close(stopProxy);
        } else if (stopProxy) {
          connection.stopProxyOnClose(stopProxy);
        }
      }
    }
  }
{code}

and deleteConnection(conf, stopProxy) needs to call deleteConnection(new 
HConnectionKey(conf), stopProxy, false)

{code}
  public static void deleteConnection(Configuration conf, boolean stopProxy) {
    deleteConnection(new HConnectionKey(conf), stopProxy, false);
  }
{code}
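
The reference-counting behaviour described in this comment can be illustrated with a small self-contained sketch. FakeConnection, the String keys and the INSTANCES map are hypothetical stand-ins for HConnectionImplementation, HConnectionKey and HBASE_INSTANCES; none of this is HBase code. It only shows why a shared connection survives deleteConnection() until the count reaches zero, and how a staleConnection flag bypasses the count.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleConnectionSketch {
  /** Hypothetical stand-in for a cached connection: a reference count and a closed flag. */
  static final class FakeConnection {
    int refCount;
    boolean closed;
    boolean isZeroReference() { return refCount == 0; }
  }

  // Stand-in for the shared instance cache, keyed by a plain string.
  static final Map<String, FakeConnection> INSTANCES = new HashMap<String, FakeConnection>();

  /** Returns the cached connection for the key, creating it if needed, and counts the caller. */
  static synchronized FakeConnection getConnection(String key) {
    FakeConnection c = INSTANCES.get(key);
    if (c == null) {
      c = new FakeConnection();
      INSTANCES.put(key, c);
    }
    c.refCount++;
    return c;
  }

  /** Drops one reference; removes the entry when unreferenced or when flagged stale. */
  static synchronized void deleteConnection(String key, boolean staleConnection) {
    FakeConnection c = INSTANCES.get(key);
    if (c == null) {
      return;
    }
    c.refCount--;
    if (c.isZeroReference() || staleConnection) {
      INSTANCES.remove(key);
      c.closed = true;
    }
  }

  public static void main(String[] args) {
    getConnection("cluster-a");            // count = 1
    getConnection("cluster-a");            // count = 2
    getConnection("cluster-a");            // count = 3
    deleteConnection("cluster-a", false);  // count = 2, entry stays cached
    System.out.println(INSTANCES.containsKey("cluster-a")); // true
    deleteConnection("cluster-a", true);   // stale: removed although count = 1
    System.out.println(INSTANCES.containsKey("cluster-a")); // false
  }
}
```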

 HConnectionImplementation closed-but-not-deleted, need a way to find the 
 state of connection
 

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
  Labels: hbase-client
 Fix For: 0.90.1, 0.90.5



[jira] [Updated] (HBASE-4893) HConnectionImplementation closed-but-not-deleted, need a way to find the state of connection

2011-12-06 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4893:
-

Labels: noob  (was: hbase-client)

 HConnectionImplementation closed-but-not-deleted, need a way to find the 
 state of connection
 

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
  Labels: noob
 Fix For: 0.90.1, 0.90.5


 In abort() of HConnectionManager$HConnectionImplementation, the instance of 
 HConnectionImplementation is marked as this.closed=true.
 There is no way for a client application to check whether the HBase client 
 connection is still open/good (this.closed=false) or not. We need a method 
 to validate the state of a connection, like isClosed().
 {code}
 public boolean isClosed() {
   return this.closed;
 }
 {code}
 Once the connection is closed, it should get deleted. Instead, the client 
 application still gets the closed connection from 
 HConnectionManager.getConnection(Configuration) and tries to make an RPC call 
 to the RS; since the connection is already closed, 
 HConnectionImplementation.getRegionServerWithRetries throws 
 RetriesExhaustedException with the error message
 {code}
 Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying 
 to contact region server null for region , row 
 '----xxx', but failed after 10 attempts.
 Exceptions:
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1008)
   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
 {code}
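
As a rough illustration of how a client might use the proposed isClosed() check, the sketch below validates a cached connection before use and asks for a fresh one if it has been closed. CheckedConnection, FakeConnection and openOrReplace are hypothetical names introduced here; this is not the HBase client API, only the validate-before-use pattern the description asks for.

```java
import java.util.function.Supplier;

final class ConnectionGuardSketch {
  /** Hypothetical view of a connection that exposes the proposed isClosed() check. */
  interface CheckedConnection {
    boolean isClosed();
  }

  /** Returns the cached connection if it is still open, otherwise a fresh one from the factory. */
  static <C extends CheckedConnection> C openOrReplace(C cached, Supplier<C> factory) {
    if (cached != null && !cached.isClosed()) {
      return cached;
    }
    return factory.get();
  }

  /** Hypothetical connection whose abort() marks it closed, as in the description. */
  static final class FakeConnection implements CheckedConnection {
    private volatile boolean closed;
    void abort() { closed = true; }
    public boolean isClosed() { return closed; }
  }

  public static void main(String[] args) {
    FakeConnection first = new FakeConnection();
    first.abort();
    FakeConnection usable = openOrReplace(first, FakeConnection::new);
    System.out.println(usable != first);   // true: the closed connection was replaced
    System.out.println(usable.isClosed()); // false
  }
}
```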





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164104#comment-13164104
 ] 

Ted Yu commented on HBASE-4880:
---

Did RestAdmin pass for patch v2?

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after it, closed = false, closing = false)
 2. addToOnlineRegions(region)
 3. update .meta. table
 4. update ZK's node state to RS_ZK_REGION_OPENED
 {code}
 We can see that the region is in service before step 4.
 It means a client could put data to this region after step 3.
 What will happen if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen will be executed, which closes the 
 region, and the master assigns this region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be 
 lost, because the region has been opened on another regionserver and has 
 received new data. Therefore, the data may not be recoverable through 
 replayRecoveredEdit(), because the edit's LogSeqId is smaller than the 
 current region SeqId.
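
The ordering hazard described above can be sketched as a toy model. The Region class and editsStranded method are hypothetical and model nothing of the real OpenRegionHandler; deferring the online step until after the ZK update is shown only as one possible reordering and is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

final class OpenRegionOrderSketch {
  /** Toy region: accepts writes only while marked online. */
  static final class Region {
    boolean online;
    final List<String> edits = new ArrayList<String>();
    boolean put(String row) {
      if (!online) {
        return false;
      }
      edits.add(row);
      return true;
    }
  }

  /**
   * Returns how many edits end up in a region whose open was rolled back.
   * onlineBeforeConfirm = true models the order in the description
   * (step 2 before step 4); false models going online only after step 4.
   */
  static int editsStranded(boolean onlineBeforeConfirm, boolean zkUpdateSucceeds) {
    Region region = new Region();
    if (onlineBeforeConfirm) {
      region.online = true;        // step 2: region is already serving
    }
    region.put("row-1");           // client write arriving before step 4 completes
    if (zkUpdateSucceeds) {
      region.online = true;        // open confirmed, nothing is stranded
      return 0;
    }
    region.online = false;         // cleanup after the failed open
    return region.edits.size();
  }

  public static void main(String[] args) {
    System.out.println(editsStranded(true, false));  // 1: a write was accepted, then the open failed
    System.out.println(editsStranded(false, false)); // 0: the write was rejected up front
  }
}
```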





[jira] [Issue Comment Edited] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Ted Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164104#comment-13164104
 ] 

Ted Yu edited comment on HBASE-4880 at 12/7/11 3:12 AM:


Did TestAdmin pass for patch v2?

  was (Author: yuzhih...@gmail.com):
Did RestAdmin pass for patch v2?
  
 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164110#comment-13164110
 ] 

chunhui shen commented on HBASE-4880:
-

@Ted
TestAdmin has passed for patch v2, and you could try again.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch






[jira] [Commented] (HBASE-4893) HConnectionImplementation closed-but-not-deleted, need a way to find the state of connection

2011-12-06 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164111#comment-13164111
 ] 

Mubarak Seyed commented on HBASE-4893:
--

Waiting for corporate approval to contribute this patch. Thanks.

 HConnectionImplementation closed-but-not-deleted, need a way to find the 
 state of connection
 

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
  Labels: noob
 Fix For: 0.90.1, 0.90.5






[jira] [Updated] (HBASE-4893) HConnectionImplementation closed-but-not-deleted, need a way to find the state of connection

2011-12-06 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4893:
-

Affects Version/s: (was: 0.90.4)
   (was: 0.90.3)
   (was: 0.90.2)
Fix Version/s: (was: 0.90.1)

 HConnectionImplementation closed-but-not-deleted, need a way to find the 
 state of connection
 

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
  Labels: noob
 Fix For: 0.90.5






[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164119#comment-13164119
 ] 

ramkrishna.s.vasudevan commented on HBASE-4970:
---

+1 on this change.

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5






[jira] [Commented] (HBASE-4908) HBase cluster test tool (port from 0.89-fb)

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164120#comment-13164120
 ] 

Hadoop QA commented on HBASE-4908:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12506382/0002-HBase-cluster-test-tool.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 74 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 74 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/456//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/456//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/456//console

This message is automatically generated.

 HBase cluster test tool (port from 0.89-fb)
 ---

 Key: HBASE-4908
 URL: https://issues.apache.org/jira/browse/HBASE-4908
 Project: HBase
  Issue Type: Sub-task
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 0001-HBase-cluster-test-tool.patch, 
 0002-HBase-cluster-test-tool.patch, D549.1.patch, D549.2.patch, D549.3.patch, 
 D549.4.patch, D549.5.patch, D549.6.patch, D549.7.patch, D549.8.patch


 Porting one of our HBase cluster test tools (a single-process multi-threaded 
 load generator and verifier) from 0.89-fb to trunk.
 I cleaned up the code a bit compared to what's in 0.89-fb, and discovered 
 that it has some features that I have not tried yet (some kind of a kill 
 test, and some way to run HBase as multiple processes on one machine).
 The main utility of this piece of code for us has been the HBaseClusterTest 
 command-line tool (called HBaseTest in 0.89-fb), which we usually invoke as a 
 load test in our five-node dev cluster testing, e.g.:
 hbase org.apache.hadoop.hbase.manual.HBaseTest -load 10:50:100:20 -tn load_test -read 1:10:50:20 -zk zk_quorum -bloom ROWCOL -compression GZIP
 I will be using this code to load-test the delta encoding patch and making 
 fixes, but I am submitting the patch for early feedback. I will probably try 
 out its other functionality and comment on how it works.





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164122#comment-13164122
 ] 

Zhihong Yu commented on HBASE-4880:
---

Patch v2 is cleaner.

Minor comment:
In RegionServerServices.java, the explanation of each parameter should not be 
on a separate line; the parameter name and its explanation belong on the same 
line.

Good job Chunhui.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch


 OpenRegionHandler in the regionserver proceeds through the following steps:
 {code}
 1. openregion() (after this, closed = false and closing = false)
 2. addToOnlineRegions(region)
 3. update the .meta. table
 4. update the ZK node state to RS_ZK_REGION_OPENED
 {code}
 The region is therefore in service before step 4.
 This means a client could put data to this region after step 3.
 What happens if step 4 fails?
 OpenRegionHandler#cleanupFailedOpen is executed, which closes the region, 
 and the master assigns the region to another regionserver.
 If closing the region fails, the data put between step 3 and step 4 may be 
 lost, because the region has been opened on another regionserver and has 
 received new data. That data may not be recoverable through 
 replayRecoveredEdit(), because the edits' LogSeqId is smaller than the 
 region's current SeqId.
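
The last point of the description can be illustrated with a minimal sketch. It is not the HBase replay code: the class and method names are hypothetical, and it only assumes what the description states, namely that replay skips any edit whose sequence id is not greater than the region's current sequence id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not HBase code: replay only applies edits that are
// newer than the sequence id the region has already reached.
class ReplaySketch {
    static List<Long> replay(List<Long> recoveredEditSeqIds, long regionSeqId) {
        List<Long> applied = new ArrayList<>();
        for (long seqId : recoveredEditSeqIds) {
            if (seqId <= regionSeqId) {
                continue; // treated as already persisted, so it is skipped
            }
            applied.add(seqId);
        }
        return applied;
    }

    public static void main(String[] args) {
        // Edits 101 and 102 stand for puts accepted on the old server between
        // step 3 and step 4. The region has since been opened elsewhere and
        // has advanced to sequence id 150.
        System.out.println(replay(Arrays.asList(101L, 102L, 151L), 150L)); // [151]
    }
}
```

In this model the two early edits are silently dropped, which is the data-loss window the issue describes.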





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164123#comment-13164123
 ] 

Zhihong Yu commented on HBASE-4970:
---

@Jinchao:
Can you upload a patch ?

Thanks

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5


 In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
 of RES (resident memory) slowed down.
 Why does increasing the keepAliveTime of the HBase thread pool slow down our 
 problem (the increase in RES)?
 You can go through the source of sun.nio.ch.Util (mustangsrc). Every thread 
 holds 3 soft references to direct buffers for reuse; the code calls these 3 
 soft references the buffer cache. If all cached buffers are occupied or none 
 is of a suitable size when a new request comes in, a new direct buffer is 
 allocated. After the request is served, the bigger buffer replaces the 
 smaller one in the buffer cache, and the replaced buffer is released.
 So I think we can add a parameter to change the keepAliveTime of the HTable 
 thread pool.
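
A rough sketch of what such a parameter could look like, using only java.util.concurrent. The property name and the use of java.util.Properties in place of the HBase configuration are assumptions made for illustration; this is not the patch for this issue.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAlivePoolSketch {
    // Hypothetical property name, used only for this illustration.
    static final String KEEP_ALIVE_KEY = "htable.threads.keepalivetime";

    static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
        // Fall back to 60 seconds, the value mentioned in the description.
        long keepAliveSeconds = Long.parseLong(conf.getProperty(KEEP_ALIVE_KEY, "60"));
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>());
        // Let idle core threads expire as well (requires a keep-alive above zero).
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEEP_ALIVE_KEY, "3600");
        ThreadPoolExecutor pool = newPool(conf, 8);
        System.out.println(pool.getKeepAliveTime(TimeUnit.SECONDS)); // 3600
        pool.shutdown();
    }
}
```

With a longer keep-alive, idle threads survive between bursts of requests instead of being discarded and recreated.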





[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-06 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164127#comment-13164127
 ] 

gaojinchao commented on HBASE-4970:
---

ok, No problem. 

 Add a parameter  to change keepAliveTime of Htable thread pool.
 ---

 Key: HBASE-4970
 URL: https://issues.apache.org/jira/browse/HBASE-4970
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Trivial
 Fix For: 0.90.5






[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-06 Thread Akash Ashok (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164130#comment-13164130
 ] 

Akash Ashok commented on HBASE-4224:


I am done testing this patch. However, when writing the test cases I could 
think of only one. Could someone please help me with which other test cases 
I should cover for this patch?



 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224.patch


 This evening I needed to clean out logs on the cluster.  Logs are per 
 regionserver.  To let go of logs, all edits must be flushed from memory, 
 but the only flush is by table or region.  We need to be able to flush a 
 regionserver.  Need to add this.





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164133#comment-13164133
 ] 

chunhui shen commented on HBASE-4880:
-

@Ted
I use the code style from HBASE-3678.
It puts the explanation for parameters on a separate line automatically.
Is there a problem with HBASE-3678?

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch






[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4880:


Attachment: hbase-4880v3.patch

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164148#comment-13164148
 ] 

Hudson commented on HBASE-4927:
---

Integrated in HBase-0.92 #174 (See 
[https://builds.apache.org/job/HBase-0.92/174/])
HBASE-4927 CatalogJanior:SplitParentFirstComparator doesn't sort as 
expected, for the last region when the endkey is empty

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java


 CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
 last region when the endkey is empty
 ---

 Key: HBASE-4927
 URL: https://issues.apache.org/jira/browse/HBASE-4927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
 hbase-4927-fix-ws.txt


 When reviewing the HBASE-4238 backport, Jon found this issue.
 What happens if the split points are as follows (an empty end key is the 
 last key, an empty start key is the first key)?
 Parent [A,)
 L daughter [A,B), 
 R daughter [B,)
 When sorted, we get to the end key comparison, which results in this 
 incorrect order:
 [A,B), [A,), [B,) 
 We wanted:
 [A,), [A,B), [B,)
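
The intended ordering can be sketched as a comparator that treats an empty end key as the largest possible key. This is a simplified illustration, not the committed fix: regions are modelled as plain string pairs rather than HRegionInfo objects, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SplitParentFirstSketch {
    // Regions are modelled as {startKey, endKey}; "" as an end key means
    // "no upper bound", i.e. the last region.
    static final Comparator<String[]> PARENT_FIRST = (a, b) -> {
        int byStart = a[0].compareTo(b[0]);
        if (byStart != 0) return byStart;
        boolean aOpen = a[1].isEmpty();
        boolean bOpen = b[1].isEmpty();
        if (aOpen || bOpen) {
            // An empty end key is the largest possible end key, so it sorts first.
            return aOpen == bOpen ? 0 : (aOpen ? -1 : 1);
        }
        // Same start key: the region with the larger end key (the parent) comes first.
        return b[1].compareTo(a[1]);
    };

    static List<String> sorted(List<String[]> regions) {
        List<String[]> copy = new ArrayList<>(regions);
        copy.sort(PARENT_FIRST);
        List<String> out = new ArrayList<>();
        for (String[] r : copy) out.add("[" + r[0] + "," + r[1] + ")");
        return out;
    }

    public static void main(String[] args) {
        List<String[]> regions = Arrays.asList(
            new String[] {"A", "B"}, new String[] {"B", ""}, new String[] {"A", ""});
        System.out.println(sorted(regions)); // [[A,), [A,B), [B,)]
    }
}
```

A plain lexicographic comparison of end keys would put the empty key before "B", which is how the incorrect order above arises.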





[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164149#comment-13164149
 ] 

chunhui shen commented on HBASE-4880:
-

@Ted
Patch v3 fixes the problem of putting the explanation for parameters on a 
separate line.
Please check.
Thanks

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164154#comment-13164154
 ] 

Hadoop QA commented on HBASE-4880:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506391/hbase-4880v3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestColumnSeeking

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/458//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/458//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/458//console

This message is automatically generated.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164155#comment-13164155
 ] 

Hadoop QA commented on HBASE-4880:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506380/hbase-4880v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -160 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 72 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
   org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
   org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/457//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/457//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/457//console

This message is automatically generated.

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164168#comment-13164168
 ] 

Ted Yu commented on HBASE-4880:
---

Please check failed tests. 
It seems trunk is broken now. 

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

2011-12-06 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164169#comment-13164169
 ] 

chunhui shen commented on HBASE-4880:
-

I think it is not related to this patch.
The same failed tests appear in 
https://builds.apache.org/job/PreCommit-HBASE-Build/456/testReport/

 Region is on service before completing openRegionHanlder, may cause data loss
 -

 Key: HBASE-4880
 URL: https://issues.apache.org/jira/browse/HBASE-4880
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch






[jira] [Commented] (HBASE-4966) Put/Delete values cannot be tested with MRUnit

2011-12-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164190#comment-13164190
 ] 

Lars Hofhansl commented on HBASE-4966:
--

Do you think you can work out a patch Nicholas?

 Put/Delete values cannot be tested with MRUnit
 --

 Key: HBASE-4966
 URL: https://issues.apache.org/jira/browse/HBASE-4966
 Project: HBase
  Issue Type: Bug
  Components: client, mapreduce
Affects Versions: 0.90.4
Reporter: Nicholas Telford
Priority: Minor

 When using the IdentityTableReducer, which expects input values that are 
 either Put or Delete objects, testing the Mapper with MRUnit is not 
 possible because neither Put nor Delete implements equals().
 We should implement equals() on both such that equality means:
 * Both objects are of the same class (in this case, Put or Delete)
 * Both objects are for the same key.
 * Both objects contain an equal set of KeyValues (applicable only to Put)
 KeyValue.equals() appears to already be implemented, but only checks for 
 equality of row key, column family and column qualifier - two KeyValues can 
 be considered equal even if they contain different values. This won't work 
 for testing.
 Instead, the Put.equals() and Delete.equals() implementations should do a 
 deep equality check on their KeyValues, like this:
 {code:java}
 myKv.equals(theirKv) && Bytes.equals(myKv.getValue(), theirKv.getValue());
 {code}
 NOTE: This would impact any code that relies on the existing identity 
 implementation of Put.equals() and Delete.equals(), therefore cannot be 
 guaranteed to be backwards-compatible.
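
The proposed semantics can be sketched with simplified stand-ins. Kv and PutSketch below are hypothetical classes, not the HBase Put and KeyValue; java.util.Arrays.equals is used in place of Bytes.equals; and KeyValues are compared in insertion order for brevity, whereas the proposal speaks of an equal set.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class DeepEqualsSketch {
    // Simplified stand-in for KeyValue: equals() ignores the value,
    // mirroring the behaviour described in the issue.
    static final class Kv {
        final String row, family, qualifier;
        final byte[] value;

        Kv(String row, String family, String qualifier, byte[] value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Kv)) return false;
            Kv k = (Kv) o;
            return row.equals(k.row) && family.equals(k.family) && qualifier.equals(k.qualifier);
        }

        @Override public int hashCode() { return Objects.hash(row, family, qualifier); }
    }

    // Simplified stand-in for Put.
    static final class PutSketch {
        final String row;
        final List<Kv> kvs = new ArrayList<>();

        PutSketch(String row) { this.row = row; }

        PutSketch add(String family, String qualifier, String value) {
            kvs.add(new Kv(row, family, qualifier, value.getBytes(StandardCharsets.UTF_8)));
            return this;
        }

        @Override public boolean equals(Object o) {
            if (o == null || getClass() != o.getClass()) return false; // same class
            PutSketch other = (PutSketch) o;
            if (!row.equals(other.row) || kvs.size() != other.kvs.size()) return false; // same key
            for (int i = 0; i < kvs.size(); i++) {
                Kv mine = kvs.get(i), theirs = other.kvs.get(i);
                // Deep check: coordinates via equals(), then the value bytes.
                if (!mine.equals(theirs) || !Arrays.equals(mine.value, theirs.value)) return false;
            }
            return true;
        }

        @Override public int hashCode() { return row.hashCode(); }
    }

    public static void main(String[] args) {
        PutSketch a = new PutSketch("r1").add("cf", "q", "v1");
        PutSketch b = new PutSketch("r1").add("cf", "q", "v1");
        PutSketch c = new PutSketch("r1").add("cf", "q", "v2");
        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(c)); // false
    }
}
```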





[jira] [Commented] (HBASE-4961) Lots of precommit builds hanging for days

2011-12-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164191#comment-13164191
 ] 

Lars Hofhansl commented on HBASE-4961:
--

I was hoping that these would all be hanging because of SecureRandom issues 
(blocking on /dev/random on Linux), but that does not appear to be the case.

 Lots of precommit builds hanging for days
 -

 Key: HBASE-4961
 URL: https://issues.apache.org/jira/browse/HBASE-4961
 Project: HBase
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.92.0, 0.94.0
Reporter: Todd Lipcon
 Attachments: hbase-hung-builds.tar.gz


 I was logged into the ASF build machines and saw about 10-15 HBase precommit 
 builds that have been hung for weeks. I took a jstack of each, which I'll 
 attach here. I then kill -9ed them to free up the resources.





[jira] [Commented] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164203#comment-13164203
 ] 

Hudson commented on HBASE-4936:
---

Integrated in HBase-TRUNK #2523 (See 
[https://builds.apache.org/job/HBase-TRUNK/2523/])
HBASE-4936 Cached HRegionInterface connections crash when getting 
UnknownHost exceptions

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


 Cached HRegionInterface connections crash when getting UnknownHost exceptions
 -

 Key: HBASE-4936
 URL: https://issues.apache.org/jira/browse/HBASE-4936
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.94.0

 Attachments: HBASE-4936-v2.patch, HBASE-4936.patch


 This issue is unlikely to come up in a cluster test case. However, during 
 development, the following can happen: 
 1. Start the HBase cluster locally, on network A (DNS A, etc)
 2. The region locations are cached using the hostname 
 (mycomputer.company.com, 211.x.y.z - real ip)
 3. Change network location (go home)
 4. Start the HBase cluster locally. My hostname / ips are now different 
 (mycomputer, 192.168.0.130 - new ip)
 If the region locations have been cached using the hostname, 
 CatalogTracker.getCachedConnection(ServerName sn) throws an 
 UnknownHostException that none of its catch statements handles. The server 
 will crash constantly. 
 The error should be caught and not rethrown, so that the cached connection 
 expires normally. 
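
The suggested handling might look roughly like the sketch below. It is not the CatalogTracker code: the Connector interface and the String-typed connection are hypothetical stand-ins, and returning null to signal "no usable cached connection" is an assumption.

```java
import java.net.UnknownHostException;

class CachedConnectionSketch {
    // Hypothetical stand-in for whatever opens a connection to a cached server name.
    interface Connector {
        String connect(String hostname) throws UnknownHostException;
    }

    // Returns null instead of propagating UnknownHostException, so the caller
    // can treat the cached location as stale rather than crash.
    static String getCachedConnection(Connector connector, String hostname) {
        try {
            return connector.connect(hostname);
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Connector reachable = host -> "connection to " + host;
        Connector unresolvable = host -> { throw new UnknownHostException(host); };
        System.out.println(getCachedConnection(reachable, "mycomputer"));
        System.out.println(getCachedConnection(unresolvable, "mycomputer.company.com")); // null
    }
}
```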





[jira] [Commented] (HBASE-4937) Error in Quick Start Shell Exercises

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164204#comment-13164204
 ] 

Hudson commented on HBASE-4937:
---

Integrated in HBase-TRUNK #2523 (See 
[https://builds.apache.org/job/HBase-TRUNK/2523/])
HBASE-4937 Error in Quick Start Shell Exercises

stack : 
Files : 
* /hbase/trunk/src/docbkx/getting_started.xml


 Error in Quick Start Shell Exercises
 

 Key: HBASE-4937
 URL: https://issues.apache.org/jira/browse/HBASE-4937
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Ryan Berdeen
Assignee: stack
 Fix For: 0.94.0

 Attachments: 4937.txt


 The shell exercises in the Quick Start 
 (http://hbase.apache.org/book/quickstart.html) start with:
 {code}
 hbase(main):003:0> create 'test', 'cf'
 0 row(s) in 1.2200 seconds
 hbase(main):003:0> list 'table'
 test
 1 row(s) in 0.0550 seconds
 {code}
 It looks like the second command is wrong. Running it, the actual output is
 {code}
 hbase(main):001:0> create 'test', 'cf'
 0 row(s) in 0.3630 seconds
 hbase(main):002:0> list 'table'
 TABLE 
   
   
 0 row(s) in 0.0100 seconds
 {code}
 The argument to list should be 'test', not 'table', and the output in the 
 example is missing the {{TABLE}} line.
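
The zero-row result is consistent with list treating its argument as a regular
expression that filters table names: the pattern 'table' does not match a
table named 'test'. A minimal Java sketch of that filtering, assuming
full-match semantics; ListFilterSketch and its list method are illustrative
names, not HBase code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ListFilterSketch {

    /** Returns the table names that fully match the given regular expression. */
    static List<String> list(List<String> tables, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        for (String table : tables) {
            if (pattern.matcher(table).matches()) {
                matches.add(table);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("test");
        // The documented (wrong) argument and the corrected one.
        System.out.println(list(tables, "table").size() + " row(s)");
        System.out.println(list(tables, "test").size() + " row(s)");
    }
}
```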





[jira] [Issue Comment Edited] (HBASE-4956) Control direct memory buffer consumption by HBaseClient

2011-12-06 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164199#comment-13164199
 ] 

Lars Hofhansl edited comment on HBASE-4956 at 12/7/11 7:18 AM:
---

We might not have to go all the way to use netty (although that would be nice).
If we find that it is possible to avoid calling 
HBaseClient.Connection.sendParam from the client thread, but have the call 
actually be made from the Connection thread (after the callable representing 
the operation was queued), we have limited the number of threads that will have 
cached DirectBuffer on their behalf.
Stack suggested the only reason for the direct call might be to pass errors 
back to the client. We could hand the client a Deferred or Future that will 
eventually hold any encountered exception; the client could (and by default 
would, to keep the current synchronous behavior) also wait on that object.
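
A rough Java sketch of this idea, under stated assumptions: DeferredSendSketch,
sendParam and sendParamAndWait are hypothetical names modelled on the comment,
not the HBaseClient API, and a single-threaded executor stands in for the
Connection thread. The write runs only on that thread, and any error travels
back to the caller through the Future.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DeferredSendSketch {

    // Stand-in for the Connection thread: the only thread that performs writes.
    private final ExecutorService connectionThread = Executors.newSingleThreadExecutor();

    /** Queues the write and returns a Future that will hold any failure. */
    Future<Void> sendParam(final byte[] param) {
        Callable<Void> write = () -> {
            // Placeholder for the real socket write; an empty call simulates an error.
            if (param.length == 0) {
                throw new IOException("empty call");
            }
            return null;
        };
        return connectionThread.submit(write);
    }

    /** Default synchronous behavior: wait on the Future and rethrow the error. */
    void sendParamAndWait(byte[] param) throws IOException, InterruptedException {
        try {
            sendParam(param).get();
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        }
    }

    void close() {
        connectionThread.shutdown();
    }

    public static void main(String[] args) throws Exception {
        DeferredSendSketch client = new DeferredSendSketch();
        try {
            client.sendParamAndWait(new byte[] {1});
            client.sendParamAndWait(new byte[0]);
        } catch (IOException e) {
            System.out.println("error passed back to caller: " + e.getMessage());
        } finally {
            client.close();
        }
    }
}
```

A caller that does not need the synchronous behavior could keep the Future
returned by sendParam and check it later.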

  was (Author: lhofhansl):
We might not have to go all the way to use netty (although that would be 
nice).
If we find that it is possible to avoid calling 
HBaseClient.Connection.sendParam from the client thread, but have the call 
actually be made from the Connection thread (after the callable representing 
the operation was queued), we have limited the number of threads that will have 
cached DirectBuffer on their behalf.
Stack suggested the only reason for the direct call might be to pass errors 
back to the client. We could hand the client a Deferred or Future that will 
eventually hold any encountered exception, the client could (and would be 
default to keep the current synchronous behavior) also wait on that object.
  
 Control direct memory buffer consumption by HBaseClient
 ---

 Key: HBASE-4956
 URL: https://issues.apache.org/jira/browse/HBASE-4956
 Project: HBase
  Issue Type: New Feature
Reporter: Ted Yu

 As Jonathan explained here: 
 https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357?pli=1
 the standard HBase client inadvertently consumes a large amount of direct 
 memory. We should consider using netty for NIO-related tasks.





[jira] [Commented] (HBASE-4968) Add to troubleshooting workaround for direct buffer oome's.

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164201#comment-13164201
 ] 

Hudson commented on HBASE-4968:
---

Integrated in HBase-TRUNK #2523 (See 
[https://builds.apache.org/job/HBase-TRUNK/2523/])
HBASE-4968 Add to troubleshooting workaround for direct buffer oome's.

stack : 
Files : 
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Add to troubleshooting workaround for direct buffer oome's.
 ---

 Key: HBASE-4968
 URL: https://issues.apache.org/jira/browse/HBASE-4968
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: client.oome.txt


 Add to the book the workaround arrived at on the mailing list in the 
 discussion of the client OOME.





[jira] [Commented] (HBASE-4376) Document login configuration when running on top of secure Hadoop with Kerberos auth enabled

2011-12-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164202#comment-13164202
 ] 

Hudson commented on HBASE-4376:
---

Integrated in HBase-TRUNK #2523 (See 
[https://builds.apache.org/job/HBase-TRUNK/2523/])
HBASE-4376 Document mutual authentication between HBase and Zookeeper using 
SASL

stack : 
Files : 
* /hbase/trunk/src/docbkx/configuration.xml


 Document login configuration when running on top of secure Hadoop with 
 Kerberos auth enabled
 

 Key: HBASE-4376
 URL: https://issues.apache.org/jira/browse/HBASE-4376
 Project: HBase
  Issue Type: Task
  Components: documentation, security
Affects Versions: 0.90.4
Reporter: Gary Helmling

 We provide basic support for HBase to run on top of Kerberos-authenticated 
 Hadoop, by providing configuration options to have HMaster and HRegionServer 
 log in from a keytab on startup.  But this isn't documented anywhere outside 
 of hbase-default.xml.  We need to provide some basic guidance on setup in the 
 HBase docs.




