[jira] [Updated] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5682:
-

Attachment: 5682-all-v3.txt

Patch that removes the log statement Stack mentioned (had it in there for 
earlier debugging, forgot to remove it).

Also adds a simple test with an HConnection that is created before the 
mini-cluster is started to prove that initialization is indeed lazy.
(Can't test by stopping and restarting the mini-cluster, as new random ports 
are used each time.)
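
For readers following along, the lazy-initialization-plus-recovery idea can be sketched roughly as below. This is only an illustration, not the code in the attached patch: the class LazyResettable and its methods are hypothetical names, and the real HConnectionImplementation has far more state to manage.

```java
import java.util.function.Supplier;

/**
 * Hypothetical illustration only; not the HConnectionImplementation code.
 * Holds a resource that is created on first use and can be discarded
 * after a failure so that the next caller gets a freshly created one.
 */
final class LazyResettable<T extends AutoCloseable> {
    private final Supplier<T> factory;
    private T instance; // null until first use, and again after reset()

    LazyResettable(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the current instance, creating it on first use. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Closes and forgets the current instance, e.g. after a connection loss. */
    synchronized void reset() {
        if (instance != null) {
            try {
                instance.close();
            } catch (Exception e) {
                // Best effort: the instance is being discarded anyway.
            }
            instance = null;
        }
    }

    /** True once get() has created an instance that has not been reset. */
    synchronized boolean isInitialized() {
        return instance != null;
    }
}
```

Nothing is created until get() is first called, which is the property the new test checks for the real connection; reset() closes the old instance before dropping the reference.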

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243992#comment-13243992
 ] 

Lars Hofhansl commented on HBASE-5682:
--

all-v3 is what I'd like to commit tomorrow if there are no objections.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.

2012-04-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243995#comment-13243995
 ] 

Hudson commented on HBASE-5436:
---

Integrated in HBase-0.92 #349 (See 
[https://builds.apache.org/job/HBase-0.92/349/])
HBASE-5436  Right-size the map when reading attributes (Benoit) (Revision 
1308232)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/OperationWithAttributes.java


 Right-size the map when reading attributes.
 ---

 Key: HBASE-5436
 URL: https://issues.apache.org/jira/browse/HBASE-5436
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Trivial
  Labels: performance
 Fix For: 0.94.0

 Attachments: 0001-Right-size-the-map-when-reading-attributes.patch








[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244006#comment-13244006
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   Some more questions.  Just being careful DD.

That's fine. Hope the answers below are okay. Please let me know your response 
soon so that I can submit another patch.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java,
 line 25
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97739#file97739line25
bq.  
bq.   We should just be using the hadoop DOOS... looks like no diff (when 
I diff them).  I'll make an issue to remove.

cool


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java,
 line 446
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line446
bq.  
bq.   Is this written up anywhere?  That it's hrpc, then version, then a 
length, then a protobuf?
bq.   
bq.   I see it in the proto definition.  That'll do.

cool
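
As a rough illustration of the layout described in the question above (hrpc, then version, then a length, then the message bytes), here is a minimal sketch. It is not taken from the patch under review: the class name PreambleSketch, the 4-byte big-endian length and the version value are all assumptions, and the payload is treated as opaque bytes standing in for a serialized protobuf.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative framing only; the constants and field widths are assumptions. */
final class PreambleSketch {
    static final byte[] MAGIC = "hrpc".getBytes(StandardCharsets.US_ASCII);
    static final byte VERSION = 5; // placeholder value, not taken from the patch
    static final int HEADER_LENGTH = MAGIC.length + 1 + 4;

    /** Writes magic, version, a 4-byte length, then the opaque payload. */
    static byte[] encode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + payload.length);
        buf.put(MAGIC).put(VERSION).putInt(payload.length).put(payload);
        return buf.array();
    }

    /** Reads a frame produced by encode() and returns the payload bytes. */
    static byte[] decode(byte[] frame) {
        if (frame.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("frame too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(frame);
        byte[] magic = new byte[MAGIC.length];
        buf.get(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IllegalArgumentException("bad magic");
        }
        byte version = buf.get();
        if (version != VERSION) {
            throw new IllegalArgumentException("unsupported version " + version);
        }
        int length = buf.getInt();
        if (length != buf.remaining()) {
            throw new IllegalArgumentException("length does not match frame size");
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }
}
```

A receiver that rejects a bad magic or version up front can tell an incompatible client to go away before trying to parse anything else.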


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java,
 line 548
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line548
bq.  
bq.   We have an issue for removing this Invocation stuff?

No not yet. But I'll create one to do with this issue once this patch is 
committed.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 25
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line25
bq.  
bq.   Should we just remove them in the next iteration on rpc since 0.96 
is to be a singularity?  Why even bother trying to keep compatibility w/ older 
clients?
bq.   
bq.   What is 'failure compatibility'?  We are telling the client to go 
away, nicely (smile).
bq.   
bq.   What you think we should replace hrpc0x0005 with?
bq.   
bq.   this - these

Yeah, valid points. We can remove this version string and all in a follow up 
patch.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 28
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line28
bq.  
bq.   How does RpcRequestWithHeaderProto relate to ConnectionHeaderProto?  
This text should say?
bq.   
bq.   Would be nice to have illustration on how the back and forth work.

The latter is used only while establishing connections and the former for 
exchanging RPC requests/responses over a channel that is connected. Okay, will 
add some text.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 55
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line55
bq.  
bq.   We'll send this String each time?

Actually, I could make this field 'optional' since this has a default value. 
Will do so.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 66
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.  
bq.   Which part in here is the 'header'?  How does it relate to 
ConnectionHeaderProto?
bq.   
bq.   request can be an Invocation/Writable?  Or a protobuf?  Do we need a 
length in here?

Today the only 'header' is the callId. There is no relation to 
ConnectionHeaderProto. If the 'header' is confusing, I can take it off the 
object name. Let me know.

'request' in this patch is only an Invocation/Writable. In theory, it could be a 
protobuf object as well (since it is just bytes), but, for protobuf, we could 
make things more explicit by defining a protobuf object rather than an opaque 
set of bytes. But that's another jira (ProtoBufRpcEngine implementation similar 
to Hadoop). Length is not needed - the protobuf serialization/deserialization 
will take care of it.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?

Length will be taken care of by the protobuf serialization/deserialization. The 
header is the combination of callId, error. If the 'header' is confusing, 

[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5693:
---

Attachment: 5593.v2.patch

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5693:
---

Status: Open  (was: Patch Available)

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5693:
---

Status: Patch Available  (was: Open)

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244021#comment-13244021
 ] 

Hadoop QA commented on HBASE-5693:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520876/5593.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1364//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1364//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1364//console

This message is automatically generated.

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244086#comment-13244086
 ] 

Hadoop QA commented on HBASE-5694:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520856/HBASE-5694.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1366//console

This message is automatically generated.

 getRowsWithColumnsTs function Thrift service incorrectly handles time range
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.92.2

 Attachments: HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside a wrong if statement.
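
The scoping problem described above can be illustrated with a small stand-alone sketch. It does not use the real Thrift handler or the HBase Scan class: ScanSpec and TimeRangeScoping are hypothetical stand-ins, and the timestamp is simply stored as an upper bound to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a scan request; not the HBase Scan class. */
final class ScanSpec {
    final List<String> columns = new ArrayList<>();
    long minStamp = 0L;
    long maxStamp = Long.MAX_VALUE;
}

final class TimeRangeScoping {
    /** Buggy shape: the time range is only applied when columns are given. */
    static ScanSpec buildBuggy(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            scan.columns.addAll(columns);
            scan.maxStamp = timestamp; // wrongly nested inside the column check
        }
        return scan;
    }

    /** Fixed shape: the time range is applied regardless of the column list. */
    static ScanSpec buildFixed(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            scan.columns.addAll(columns);
        }
        scan.maxStamp = timestamp; // moved out of the if statement
        return scan;
    }
}
```

With no columns specified, buildBuggy leaves the time range at its default while buildFixed applies the timestamp, which is the behaviour the report asks for.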





[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244087#comment-13244087
 ] 

Hadoop QA commented on HBASE-4393:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519998/Canary-v0.java
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1367//console

This message is automatically generated.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Attachments: Canary-v0.java, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.
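
The loop described above might look roughly like the sketch below. It is not the attached Canary-v0.java or HBaseCanary.java: CanarySketch, RegionProbe and Result are hypothetical names, and the probe is left abstract where a real canary would issue a short scan or get against each region.

```java
import java.util.ArrayList;
import java.util.List;

final class CanarySketch {
    /** Hypothetical probe, e.g. a short scan against one region. */
    interface RegionProbe {
        void probe(String regionName) throws Exception;
    }

    /** Outcome of probing a single region. */
    static final class Result {
        final String regionName;
        final long latencyNanos;
        final boolean available;

        Result(String regionName, long latencyNanos, boolean available) {
            this.regionName = regionName;
            this.latencyNanos = latencyNanos;
            this.available = available;
        }
    }

    /** Probes every region once, recording latency and availability. */
    static List<Result> run(List<String> regions, RegionProbe probe) {
        List<Result> results = new ArrayList<>();
        for (String region : regions) {
            long start = System.nanoTime();
            boolean ok = true;
            try {
                probe.probe(region);
            } catch (Exception e) {
                ok = false; // a real canary could raise an alert here
            }
            results.add(new Result(region, System.nanoTime() - start, ok));
        }
        return results;
    }
}
```

A failing probe is recorded rather than rethrown, so one unavailable region does not stop the sweep over the remaining ones.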





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244093#comment-13244093
 ] 

Jieshan Bean commented on HBASE-5682:
-

Everything seems good to me. Only a minor doubt: is it necessary to close 
zooKeeper before setting it to null?
If HConnectionImplementation#managed is true, HConnectionImplementation#abort 
doesn't set closed to true, it just calls the close method. That makes sense to 
me :). So the retry logic introduced in HBASE-5153 seems redundant.
If one wants to manage the connection oneself and the connection is aborted, 
we should suggest recreating the HConnection and HTable, right? 

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5644) [findbugs] Fix null pointer warnings.

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244095#comment-13244095
 ] 

Hadoop QA commented on HBASE-5644:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520867/HBASE-5644.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1365//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1365//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1365//console

This message is automatically generated.

 [findbugs] Fix null pointer warnings.
 -

 Key: HBASE-5644
 URL: https://issues.apache.org/jira/browse/HBASE-5644
 Project: HBase
  Issue Type: Sub-task
  Components: scripts
Reporter: Jonathan Hsieh
Assignee: Uma Maheswara Rao G
 Attachments: HBASE-5644.patch, NullPointerFindBugs_Analysis.xlsx


 See 
 https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
 Fix the NP category





[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5666:
---

Attachment: HBASE-5666-v3.patch

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, 
 hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, 
 hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}
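
One way to picture the missing retry is a bounded polling loop like the sketch below. This is only an assumption about the general shape of a fix, not the content of the attached patches: BaseNodeWait, its parameters and the stand-in check are hypothetical, and the real code would ask ZooKeeper whether the base znode exists.

```java
import java.util.function.BooleanSupplier;

final class BaseNodeWait {
    /**
     * Polls the given check until it succeeds or maxAttempts is reached,
     * sleeping between attempts. Returns true if the check succeeded.
     */
    static boolean waitFor(BooleanSupplier check, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false;
    }
}
```

Bounding the attempts keeps a genuinely misconfigured 'zookeeper.znode.parent' from hanging the region server forever, while still tolerating a master that is a few seconds late creating the node.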





[jira] [Created] (HBASE-5698) Add new coprocessor hooks in doMiniBatchPut

2012-04-02 Thread ramkrishna.s.vasudevan (Created) (JIRA)
Add new coprocessor hooks in doMiniBatchPut
---

 Key: HBASE-5698
 URL: https://issues.apache.org/jira/browse/HBASE-5698
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan


As discussed in the JIRA HBASE-5617, this JIRA has been raised to add new hooks 
to doMiniBatchPut.





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244120#comment-13244120
 ] 

Hadoop QA commented on HBASE-5666:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520963/HBASE-5666-v3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.TestHBaseFsck
  org.apache.hadoop.hbase.client.TestFromClientSide

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1368//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1368//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1368//console

This message is automatically generated.

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, 
 hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, 
 hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-5695) Use Hadoop's DataOutputOutputStream instead of have a copy local

2012-04-02 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244123#comment-13244123
 ] 

ramkrishna.s.vasudevan commented on HBASE-5695:
---

This is the same as HBASE-5696.  Did you intend something else as the topic of 
this JIRA?

 Use Hadoop's DataOutputOutputStream instead of have a copy local
 

 Key: HBASE-5695
 URL: https://issues.apache.org/jira/browse/HBASE-5695
 Project: HBase
  Issue Type: Improvement
Reporter: stack







[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244127#comment-13244127
 ] 

Uma Maheswara Rao G commented on HBASE-3909:


I think syncing the configuration across clusters is mostly the job of OM-type 
tools. Bringing that into Hadoop/HBase may not be correct.
I feel the scope of the current issue should be to allow some way to do 
in-memory config updates without restarting the node.

And I agree with Todd. OM tools are good at managing configs.
{quote}
 ...operations teams are very good at managing text-based configuration files 
with tools like puppet, cfengine, etc. It's also easy to version-control these 
kinds of configs, add <!-- comments -->, etc. Moving to ZK makes these tasks 
more difficult – we'd need lots of tooling, etc.
{quote}
The current limitation is that, even though OM tools are capable of updating 
the configuration in all the places, there is no way to make a node pick up 
the new configs without restarting that node.

I am thinking of proceeding with a HADOOP-7001 style implementation, if there 
are no objections.
Also, as a next step we can provide options such as updating configs from the 
shell and a command to reload the config from disk, etc.
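
The "reload from disk without a restart" idea can be sketched roughly as 
follows. This is only an illustration, not the HADOOP-7001 design and not any 
HBase or Hadoop API: the class ReloadableConfig, its methods, the key 
handler.count, and the use of a java.util.Properties file are all assumptions 
made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for settings that can be re-read from disk while the process keeps running. */
final class ReloadableConfig {
    private final Path file;
    // Readers always see a complete snapshot; reload() swaps in a new one atomically.
    private final AtomicReference<Properties> current =
        new AtomicReference<>(new Properties());

    ReloadableConfig(Path file) throws IOException {
        this.file = file;
        reload();
    }

    /** Re-reads the file from disk and publishes the new values without a restart. */
    void reload() throws IOException {
        Properties fresh = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            fresh.load(in);
        }
        current.set(fresh);
    }

    String get(String key, String defaultValue) {
        return current.get().getProperty(key, defaultValue);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("node-config", ".properties");
        Files.write(file, "handler.count=10\n".getBytes(StandardCharsets.UTF_8));
        ReloadableConfig conf = new ReloadableConfig(file);
        System.out.println(conf.get("handler.count", "?"));

        // An external tool (puppet, cfengine, ...) rewrites the file on disk.
        Files.write(file, "handler.count=30\n".getBytes(StandardCharsets.UTF_8));
        conf.reload(); // e.g. triggered by a shell command
        System.out.println(conf.get("handler.count", "?"));
        Files.delete(file);
    }
}
```

Swapping a whole snapshot keeps readers from ever seeing a half-updated 
configuration; whether individual components can actually apply a changed 
value at runtime is a separate question that this sketch does not address.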

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Updated] (HBASE-5636) TestTableMapReduce doesn't work properly.

2012-04-02 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5636:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Integrated to trunk and 0.94.

Thanks for the patch, Takuya.

 TestTableMapReduce doesn't work properly.
 -

 Key: HBASE-5636
 URL: https://issues.apache.org/jira/browse/HBASE-5636
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.1, 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
 Attachments: HBASE-5636-v2.patch, HBASE-5636.patch


 No map function is called because no test data is put before the test 
 starts.
 The following three tests are in the same situation:
 - org.apache.hadoop.hbase.mapred.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper





[jira] [Commented] (HBASE-5663) MultithreadedTableMapper doesn't work.

2012-04-02 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244189#comment-13244189
 ] 

Zhihong Yu commented on HBASE-5663:
---

Integrated to trunk and 0.94.

Thanks for the patch, Takuya.

 MultithreadedTableMapper doesn't work.
 --

 Key: HBASE-5663
 URL: https://issues.apache.org/jira/browse/HBASE-5663
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
 Fix For: 0.94.0, 0.96.0

 Attachments: 5663+5636.txt, HBASE-5663.patch


 MapReduce job using MultithreadedTableMapper goes down throwing the following 
 Exception:
 {noformat}
 java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:260)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper.run(MultithreadedTableMapper.java:133)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
   at java.lang.Class.getConstructor0(Class.java:2706)
   at java.lang.Class.getConstructor(Class.java:1657)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:241)
   ... 8 more
 {noformat}
 This occurs when the tasks are creating MapRunner threads.
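
 The trace shows the failure coming out of Class.getConstructor. The issue 
 text does not state the root cause, but one common way to get this exception 
 is a reflective lookup whose parameter list does not exactly match any public 
 constructor of the class that is on the classpath at runtime. A minimal 
 sketch of that behaviour, using a hypothetical Context stand-in rather than 
 the Hadoop class:

```java
import java.lang.reflect.Constructor;

final class ConstructorLookupSketch {
    /** Hypothetical stand-in for a class whose constructor signature can differ between library versions. */
    static final class Context {
        public Context(String conf, Integer taskId) { }
    }

    /** Returns true if a public constructor with exactly these parameter types exists. */
    static boolean hasConstructor(Class<?> type, Class<?>... parameterTypes) {
        try {
            Constructor<?> c = type.getConstructor(parameterTypes);
            return c != null;
        } catch (NoSuchMethodException e) {
            // The message names the class, "<init>" and the requested parameter types.
            System.out.println("lookup failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Exact match: the constructor is found.
        System.out.println(hasConstructor(Context.class, String.class, Integer.class));
        // One extra parameter, as if compiled against a different version: not found.
        System.out.println(hasConstructor(Context.class, String.class, Integer.class, Long.class));
    }
}
```

 Because getConstructor matches the formal parameter types exactly, code that 
 builds the parameter list by hand is sensitive to any signature change in the 
 target class.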





[jira] [Updated] (HBASE-5636) TestTableMapReduce doesn't work properly.

2012-04-02 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5636:
--

Fix Version/s: 0.96.0
   0.94.0

 TestTableMapReduce doesn't work properly.
 -

 Key: HBASE-5636
 URL: https://issues.apache.org/jira/browse/HBASE-5636
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.1, 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5636-v2.patch, HBASE-5636.patch


 No map function is called because no test data is put before the test 
 starts.
 The following three tests are in the same situation:
 - org.apache.hadoop.hbase.mapred.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper





[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range

2012-04-02 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244199#comment-13244199
 ] 

Zhihong Yu commented on HBASE-5694:
---

@Wouter:
Can you attach a patch which can be applied to trunk?

Thanks

 getRowsWithColumnsTs function Thrift service incorrectly handles time range
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.92.2

 Attachments: HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside the wrong if statement.
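
 The scoping problem described above can be illustrated with a small sketch. 
 It does not reproduce the Thrift server code: ScanSpec, buggy() and fixed() 
 are hypothetical names, and the time range is reduced to a single upper 
 bound for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class TimeRangeScopeSketch {
    /** Hypothetical stand-in for a scan request; not the HBase Scan class. */
    static final class ScanSpec {
        final List<String> columns = new ArrayList<>();
        long maxTimestamp = Long.MAX_VALUE; // "no upper bound" by default
    }

    /** Buggy shape: the timestamp is applied only inside the columns branch. */
    static ScanSpec buggy(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            scan.columns.addAll(columns);
            scan.maxTimestamp = timestamp; // skipped when no columns are given
        }
        return scan;
    }

    /** Fixed shape: the timestamp is applied regardless of the column list. */
    static ScanSpec fixed(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        scan.maxTimestamp = timestamp;
        if (columns != null && !columns.isEmpty()) {
            scan.columns.addAll(columns);
        }
        return scan;
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        System.out.println(buggy(none, 1000L).maxTimestamp == Long.MAX_VALUE); // bound lost
        System.out.println(fixed(none, 1000L).maxTimestamp == 1000L);          // bound kept
        System.out.println(buggy(Arrays.asList("cf:a"), 1000L).maxTimestamp == 1000L);
    }
}
```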





[jira] [Updated] (HBASE-5689) Skipping RecoveredEdits may cause data loss

2012-04-02 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5689:
--

Attachment: (was: HBASE-5689.patch)

 Skipping RecoveredEdits may cause data loss
 ---

 Key: HBASE-5689
 URL: https://issues.apache.org/jira/browse/HBASE-5689
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5689-simplified.txt, 5689-testcase.patch, 
 HBASE-5689.patch


 Consider the following scenario:
 1. The region is on server A.
 2. Put KV(r1-v1) to the region.
 3. Move the region from server A to server B.
 4. Put KV(r2-v2) to the region.
 5. Move the region from server B to server A.
 6. Put KV(r3-v3) to the region.
 7. kill -9 server B and start it.
 8. kill -9 server A and start it.
 9. Scan the region: we get only two KVs (r1-v1, r2-v2); the third, 
 KV(r3-v3), is lost.
 Let's analyse the above scenario from the code:
 1. The edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same 
 hlog file on server A.
 2. When we split server B's hlog file in the process of ServerShutdownHandler, 
 we create one RecoveredEdits file, f1, for the region.
 3. When we split server A's hlog file in the process of ServerShutdownHandler, 
 we create another RecoveredEdits file, f2, for the region.
 4. However, RecoveredEdits file f2 will be skipped when initializing the 
 region, in HRegion#replayRecoveredEditsIfAny:
 {code}
  for (Path edits: files) {
    if (edits == null || !this.fs.exists(edits)) {
      LOG.warn("Null or non-existent edits file: " + edits);
      continue;
    }
    if (isZeroLengthThenDelete(this.fs, edits)) continue;
    if (checkSafeToSkip) {
      // The name of the next-higher file is taken as an upper bound on the
      // sequence ids this file can contain.
      Path higher = files.higher(edits);
      long maxSeqId = Long.MAX_VALUE;
      if (higher != null) {
        // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
        String fileName = higher.getName();
        maxSeqId = Math.abs(Long.parseLong(fileName));
      }
      if (maxSeqId <= minSeqId) {
        String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
        LOG.debug(msg);
        continue;
      } else {
        checkSafeToSkip = false;
      }
    }
    // (excerpt ends here; the rest of the loop body is not shown)
 {code}
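
 To see why f2 is dropped, the skip check can be replayed on plain numbers. 
 The sketch below is a simplified model, not HBase code. It assumes that each 
 RecoveredEdits file is named after the first sequence id it contains, that 
 r1, r2 and r3 have the hypothetical sequence ids 1, 2 and 3, and that the 
 region has already persisted everything up to sequence id 2 (minSeqId = 2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class RecoveredEditsSkipSketch {
    /**
     * Returns the file names that the skip check would drop.
     * Each file is modelled only by its name, i.e. the first sequence id in it.
     */
    static List<Long> skippedFiles(TreeSet<Long> files, long minSeqId) {
        List<Long> skipped = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Long edits : files) {
            if (checkSafeToSkip) {
                // The next-higher file name is used as the upper bound for this file.
                Long higher = files.higher(edits);
                long maxSeqId = (higher != null) ? Math.abs(higher) : Long.MAX_VALUE;
                if (maxSeqId <= minSeqId) {
                    skipped.add(edits);
                    continue;
                } else {
                    checkSafeToSkip = false;
                }
            }
            // A real implementation would replay the file here.
        }
        return skipped;
    }

    public static void main(String[] args) {
        TreeSet<Long> files = new TreeSet<>();
        files.add(1L); // f2: split from server A's hlog, holds r1 (seq 1) and r3 (seq 3)
        files.add(2L); // f1: split from server B's hlog, holds r2 (seq 2)
        long minSeqId = 2L; // assumed: edits up to seq 2 are already persisted

        // f2 is judged by f1's name (2 <= 2) and skipped, although it also holds r3.
        System.out.println("skipped files: " + skippedFiles(files, minSeqId));
    }
}
```

 Under these assumptions the bound taken from the next file name is only valid 
 when all files come from a single server's log; once two servers' logs are 
 interleaved, a file with a low name can still contain a high sequence id.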
  





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244219#comment-13244219
 ] 

jirapos...@reviews.apache.org commented on HBASE-5625:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4607/
---

Review request for hbase.


Summary
---

When calling Result.getValue(), an extra dummy KeyValue and its associated 
underlying byte array are allocated, as well as a persistent buffer that will 
contain the returned value.

These can be avoided by reusing a static array for the dummy object and by 
passing a ByteBuffer object as a value destination buffer to the read method.
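
The general idea of passing a destination buffer, as opposed to returning a 
fresh array on every read, can be sketched as follows. This is not the patch 
under review: ValueCopySketch, getValue() and loadValue() are hypothetical 
names, and the flat backing array only stands in for the bytes that a result 
object wraps.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ValueCopySketch {
    /** Allocating style: every call creates a new array for the value. */
    static byte[] getValue(byte[] backing, int offset, int length) {
        return Arrays.copyOfRange(backing, offset, offset + length);
    }

    /**
     * Reuse style: the caller supplies the destination, so no array is allocated per call.
     * Returns false (leaving dst cleared) if the value does not fit.
     */
    static boolean loadValue(byte[] backing, int offset, int length, ByteBuffer dst) {
        dst.clear();
        if (length > dst.remaining()) {
            return false;
        }
        dst.put(backing, offset, length);
        dst.flip(); // make the copied bytes readable from position 0
        return true;
    }

    /** Decodes the readable part of the buffer without disturbing its position. */
    static String asString(ByteBuffer buf) {
        return StandardCharsets.US_ASCII.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        byte[] backing = "row1:value-1row2:value-2".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer dst = ByteBuffer.allocate(64); // allocated once, reused for every read

        if (loadValue(backing, 5, 7, dst)) {
            System.out.println(asString(dst));
        }
        if (loadValue(backing, 17, 7, dst)) {
            System.out.println(asString(dst));
        }
    }
}
```

The trade-off is that the caller owns the buffer's lifetime: the contents are 
overwritten by the next read, so a value that must outlive the call still has 
to be copied.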


This addresses bug HBASE-5625.
https://issues.apache.org/jira/browse/HBASE-5625


Diffs
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f 
  src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef 
  src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 

Diff: https://reviews.apache.org/r/4607/diff


Testing
---

Added value checks to TestResult#testBasic and TestResult#testMultiVersion.


Thanks,

Tudor



 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 In tests run under a profiler, read time appears to drop by up to 40%.





[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-02 Thread Tudor Scurtu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tudor Scurtu updated HBASE-5625:


Attachment: 5625v5.txt

@Zhihong:
Thanks for the review request. I actually had to make my own in order to upload 
the diff: https://reviews.apache.org/r/4607/

The performance actually depends on the system capabilities. It's hard to write 
a microbenchmark for an issue that manifests itself on large, I/O-intensive 
jobs that create a lot of GC pressure. I implemented a few of Cosmin's 
suggestions.

 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 In tests run under a profiler, read time appears to drop by up to 40%.





[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss

2012-04-02 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244233#comment-13244233
 ] 

Zhihong Yu commented on HBASE-5689:
---

@Chunhui:
Hadoop QA isn't picking up any patches from this JIRA.

Please run through test suite and let us know the result.

 Skipping RecoveredEdits may cause data loss
 ---

 Key: HBASE-5689
 URL: https://issues.apache.org/jira/browse/HBASE-5689
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5689-simplified.txt, 5689-testcase.patch, 
 HBASE-5689.patch


 Consider the following scenario:
 1. The region is on server A.
 2. Put KV(r1-v1) to the region.
 3. Move the region from server A to server B.
 4. Put KV(r2-v2) to the region.
 5. Move the region from server B to server A.
 6. Put KV(r3-v3) to the region.
 7. kill -9 server B and start it.
 8. kill -9 server A and start it.
 9. Scan the region: we get only two KVs (r1-v1, r2-v2); the third, 
 KV(r3-v3), is lost.
 Let's analyse the above scenario from the code:
 1. The edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same 
 hlog file on server A.
 2. When we split server B's hlog file in the process of ServerShutdownHandler, 
 we create one RecoveredEdits file, f1, for the region.
 3. When we split server A's hlog file in the process of ServerShutdownHandler, 
 we create another RecoveredEdits file, f2, for the region.
 4. However, RecoveredEdits file f2 will be skipped when initializing the 
 region, in HRegion#replayRecoveredEditsIfAny:
 {code}
  for (Path edits: files) {
    if (edits == null || !this.fs.exists(edits)) {
      LOG.warn("Null or non-existent edits file: " + edits);
      continue;
    }
    if (isZeroLengthThenDelete(this.fs, edits)) continue;
    if (checkSafeToSkip) {
      // The name of the next-higher file is taken as an upper bound on the
      // sequence ids this file can contain.
      Path higher = files.higher(edits);
      long maxSeqId = Long.MAX_VALUE;
      if (higher != null) {
        // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
        String fileName = higher.getName();
        maxSeqId = Math.abs(Long.parseLong(fileName));
      }
      if (maxSeqId <= minSeqId) {
        String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
        LOG.debug(msg);
        continue;
      } else {
        checkSafeToSkip = false;
      }
    }
    // (excerpt ends here; the rest of the loop body is not shown)
 {code}
  





[jira] [Commented] (HBASE-5663) MultithreadedTableMapper doesn't work.

2012-04-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244245#comment-13244245
 ] 

Hudson commented on HBASE-5663:
---

Integrated in HBase-0.94 #75 (See 
[https://builds.apache.org/job/HBase-0.94/75/])
HBASE-5663 HBASE-5636 MultithreadedTableMapper doesn't work (Takuya Ueshin) 
(Revision 1308354)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMulitthreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultithreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java


 MultithreadedTableMapper doesn't work.
 --

 Key: HBASE-5663
 URL: https://issues.apache.org/jira/browse/HBASE-5663
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
 Fix For: 0.94.0, 0.96.0

 Attachments: 5663+5636.txt, HBASE-5663.patch


 MapReduce job using MultithreadedTableMapper goes down throwing the following 
 Exception:
 {noformat}
 java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:260)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper.run(MultithreadedTableMapper.java:133)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
   at java.lang.Class.getConstructor0(Class.java:2706)
   at java.lang.Class.getConstructor(Class.java:1657)
   at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:241)
   ... 8 more
 {noformat}
 This occurs when the tasks are creating MapRunner threads.





[jira] [Commented] (HBASE-5636) TestTableMapReduce doesn't work properly.

2012-04-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244244#comment-13244244
 ] 

Hudson commented on HBASE-5636:
---

Integrated in HBase-0.94 #75 (See 
[https://builds.apache.org/job/HBase-0.94/75/])
HBASE-5663 HBASE-5636 MultithreadedTableMapper doesn't work (Takuya Ueshin) 
(Revision 1308354)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMulitthreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultithreadedTableMapper.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java


 TestTableMapReduce doesn't work properly.
 -

 Key: HBASE-5636
 URL: https://issues.apache.org/jira/browse/HBASE-5636
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.1, 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5636-v2.patch, HBASE-5636.patch


 No map function is called because no test data is put before the test 
 starts.
 The following three tests are in the same situation:
 - org.apache.hadoop.hbase.mapred.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
 - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper





[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1

2012-04-02 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244250#comment-13244250
 ] 

Jonathan Hsieh commented on HBASE-5680:
---

I feel that to resolve this we should give the user some sort of warning about 
needing to recompile against hadoop23 (or vice versa, if a version compiled 
against hadoop23 attempts to run against a hadoop 1.0.0/0.20.x based hdfs).  
Thoughts?
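
One possible shape for such a warning is a startup probe for a class that the 
build expects, so that a mismatch is reported before a NoClassDefFoundError 
surfaces deep in initialization. The sketch below is only an illustration: 
HadoopCompatCheckSketch and its methods are hypothetical, and the probed class 
name is simply the one from the reported stack trace.

```java
import java.util.Optional;

final class HadoopCompatCheckSketch {
    /** Returns true if the named class can be located, without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, HadoopCompatCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Builds a warning if the class this build depends on is missing at runtime. */
    static Optional<String> compatibilityWarning(String requiredClass) {
        if (isClassPresent(requiredClass)) {
            return Optional.empty();
        }
        return Optional.of("Class " + requiredClass + " was not found on the classpath. "
            + "This build may have been compiled against a different Hadoop version; "
            + "consider recompiling against the Hadoop version in use.");
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in the issue description.
        String required = "org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction";
        compatibilityWarning(required).ifPresent(System.err::println);
    }
}
```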

 Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1 
 --

 Key: HBASE-5680
 URL: https://issues.apache.org/jira/browse/HBASE-5680
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Kristam Subba Swathi

 HMaster is not able to start because of the following error:
 
 2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
   at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
   at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
   at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
   at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   ... 7 more
 There is a change in FSConstants.





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244253#comment-13244253
 ] 

Hadoop QA commented on HBASE-5625:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520978/5625v5.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1369//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1369//console

This message is automatically generated.

 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 In tests run under a profiler, read time appears to drop by up to 40%.





[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244255#comment-13244255
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java,
 line 548
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line548
bq.  
bq.   We have an issue for removing this Invocation stuff?
bq.  
bq.  Devaraj Das wrote:
bq.  No not yet. But I'll create one to do with this issue once this patch 
is committed.

Thanks


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 25
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line25
bq.  
bq.   Should we just remove them in the next iteration on rpc since 0.96 
is to be a singularity?  Why even bother trying to keep compatibility w/ older 
clients?
bq.   
bq.   What is 'failure compatibility'?  We are telling the client to go 
away, nicely (smile).
bq.   
bq.   What you think we should replace hrpc0x0005 with?
bq.   
bq.   this - these
bq.  
bq.  Devaraj Das wrote:
bq.  Yeah, valid points. We can remove this version string and all in a 
follow up patch.

Let's discuss in another jira: an hrpc version followed by something that 
says it's protobuf that follows, etc.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 28
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line28
bq.  
bq.   How does RpcRequestWithHeaderProto relate to ConnectionHeaderProto?  
This text should say?
bq.   
bq.   Would be nice to have illustration on how the back and forth work.
bq.  
bq.  Devaraj Das wrote:
bq.  The latter is used only while establishing connections and the former 
for exchanging RPC requests/responses over a channel that is connected. Okay, 
will add some text.

Thanks


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 55
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line55
bq.  
bq.   We'll send this String each time?
bq.  
bq.  Devaraj Das wrote:
bq.  Actually, I could make this field 'optional' since this has a default 
value. Will do so.

That'd be a good idea I think.  The other protocols are less used and can 
include the String.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 66
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.  
bq.   Which part in here is the 'header'?  How does it relate to 
ConnectionHeaderProto?
bq.   
bq.   request can be an Invocation/Writable?  Or a protobuf?  Do we need a 
length in here?
bq.  
bq.  Devaraj Das wrote:
bq.  Today the only 'header' is the callId.. There is no relation to 
ConnectionHeaderProto. If the 'header' is confusing, I can take it off the 
object name. Let me know.
bq.  
bq.  'request' in this patch is only a Invocation/Writable. In theory, it 
could be a protobuf object as well (since it is just bytes), but, for protobuf, 
we could make things more explicit by defining a protobuf object rather than a 
opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation 
similar to Hadoop). Length is not needed - the protobuf 
serialization/deserialization will take care of it..

I think taking the 'Header' off Request/Response would be best (Did I ask you 
to add it previously?  If so, sorry... I misunderstood.  Thanks for being 
accommodating).   Yes, on a new issue to make it pb rather than opaque bytes.  
Do you have to do something here -- make bytes optional? -- to allow for the 
later pb replacement?

On length, that's probably good to keep.  For us, we'll give the stream to a pb 
deserializer but other clients might want to know how many bytes are on the 
line, so keep it I'd say.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.

Yeah, take away 

[jira] [Resolved] (HBASE-5695) Use Hadoop's DataOutputOutputStream instead of have a copy local

2012-04-02 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5695.
--

Resolution: Duplicate

hbase-5696 (Thanks Ram)

 Use Hadoop's DataOutputOutputStream instead of have a copy local
 

 Key: HBASE-5695
 URL: https://issues.apache.org/jira/browse/HBASE-5695
 Project: HBase
  Issue Type: Improvement
Reporter: stack







[jira] [Commented] (HBASE-5435) TestForceCacheImportantBlocks fails with OutOfMemoryError

2012-04-02 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244276#comment-13244276
 ] 

Zhihong Yu commented on HBASE-5435:
---

The test error happened in 0.94 build #75 as well.
https://builds.apache.org/job/HBase-0.94/75/testReport/junit/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/

 TestForceCacheImportantBlocks fails with OutOfMemoryError
 -

 Key: HBASE-5435
 URL: https://issues.apache.org/jira/browse/HBASE-5435
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
 Fix For: 0.96.0


 Here is related stack trace (see 
 https://builds.apache.org/job/HBase-TRUNK/2665/testReport/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/):
 {code}
 Caused by: java.lang.OutOfMemoryError
   at java.util.zip.Deflater.init(Native Method)
   at java.util.zip.Deflater.<init>(Deflater.java:124)
   at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:46)
   at java.util.zip.GZIPOutputStream.<init>(GZIPOutputStream.java:58)
   at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream$ResetableGZIPOutputStream.<init>(ReusableStreamGzipCodec.java:79)
   at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream.<init>(ReusableStreamGzipCodec.java:90)
   at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec.createOutputStream(ReusableStreamGzipCodec.java:130)
   at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:101)
   at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createPlainCompressionStream(Compression.java:239)
   at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createCompressionStream(Compression.java:223)
   at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.getCompressingStream(HFileWriterV1.java:270)
   at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.close(HFileWriterV1.java:416)
   at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1115)
   at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:706)
   at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:633)
   at org.apache.hadoop.hbase.regionserver.Store.access$400(Store.java:106)
 {code}





[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244283#comment-13244283
 ] 

stack commented on HBASE-5680:
--



Yes.  Unless someone has a bit of reflection jujitsu they can apply here. It'd 
be a PITA shipping four tgzs.  Two is already too many.

 Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1 
 --

 Key: HBASE-5680
 URL: https://issues.apache.org/jira/browse/HBASE-5680
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Kristam Subba Swathi

 HMaster is not able to start because of the following error:
 
 2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
   at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
   at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
   at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
   at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   ... 7 more
 There is a change in the FSConstants





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244324#comment-13244324
 ] 

jirapos...@reviews.apache.org commented on HBASE-5625:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4607/#review6622
---


I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5.
They passed.

Some minor comments below.


src/main/java/org/apache/hadoop/hbase/KeyValue.java
https://reviews.apache.org/r/4607/#comment14303

Please include vlength in the exception message



src/main/java/org/apache/hadoop/hbase/KeyValue.java
https://reviews.apache.org/r/4607/#comment14304

Should read 'BufferOverflowException if there'



src/main/java/org/apache/hadoop/hbase/KeyValue.java
https://reviews.apache.org/r/4607/#comment14305

Add a space between comma and fl.



src/main/java/org/apache/hadoop/hbase/client/Result.java
https://reviews.apache.org/r/4607/#comment14306

Is this comment needed?



src/main/java/org/apache/hadoop/hbase/client/Result.java
https://reviews.apache.org/r/4607/#comment14307

This line can be removed.



src/test/java/org/apache/hadoop/hbase/client/TestResult.java
https://reviews.apache.org/r/4607/#comment14308

white space.



src/test/java/org/apache/hadoop/hbase/client/TestResult.java
https://reviews.apache.org/r/4607/#comment14309

Since benchmarking is hard to do, this test case can be dropped.


- Ted


On 2012-04-02 14:22:48, Tudor Scurtu wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4607/
bq.  ---
bq.  
bq.  (Updated 2012-04-02 14:22:48)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  When calling Result.getValue(), an extra dummy KeyValue and its associated 
underlying byte array are allocated, as well as a persistent buffer that will 
contain the returned value.
bq.  
bq.  These can be avoided by reusing a static array for the dummy object and by 
passing a ByteBuffer object as a value destination buffer to the read method.
bq.  
bq.  
bq.  This addresses bug HBASE-5625.
bq.  https://issues.apache.org/jira/browse/HBASE-5625
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f 
bq.src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef 
bq.src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 
bq.  
bq.  Diff: https://reviews.apache.org/r/4607/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added value check to TestResult#testBasic and TestResult.testMultiVersion.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Tudor
bq.  
bq.



 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 Running tests with a profiler, read time appears to be reduced by up to 40%.
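
 A minimal sketch of the destination-buffer idea described above, under stated
 assumptions: ValueCopySketch and copyValueTo are hypothetical names, not the
 methods added by the patch, and the backing array layout is invented for
 illustration. The caller owns one ByteBuffer and reuses it across reads, so
 no per-read value array is allocated.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical helper; not the KeyValue/Result API from the patch.
class ValueCopySketch {
  /**
   * Copies valueLength bytes starting at valueOffset in the backing array
   * into the caller-supplied destination buffer.
   *
   * @throws BufferOverflowException if dst has too little room left
   */
  static void copyValueTo(byte[] backing, int valueOffset, int valueLength,
                          ByteBuffer dst) {
    if (dst.remaining() < valueLength) {
      // Fail before writing anything, so dst is left untouched.
      throw new BufferOverflowException();
    }
    dst.put(backing, valueOffset, valueLength);
  }
}
```

 A caller would typically clear() the buffer before each read and flip() it
 afterwards to consume the value.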





[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-02 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5625:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Fix For: 0.96.0

 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 Running tests with a profiler, read time appears to be reduced by up to 40%.





[jira] [Commented] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244343#comment-13244343
 ] 

stack commented on HBASE-5672:
--

bq. I think Thread.isAlive returns true if we have called 
Thread.start(); however, Thread.run() may not have been executed at that time.

That may be so (I've not looked at source).  Do you want to have a flag in the 
Thread that gets set when you enter the run method and check that too?

The above would still be better than a timed wait.
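
A rough sketch of the flag idea, assuming a hypothetical EvictionWorkerSketch 
class (this is not the LruBlockCache eviction thread). The thread counts down a 
latch as its first action in run(), so a test can block until run() has really 
started. The evictRequested guard is my own addition so that a notify sent 
before the thread reaches wait() is not lost.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the eviction thread; not the LruBlockCache code.
class EvictionWorkerSketch extends Thread {
  private final Object lock = new Object();
  private final CountDownLatch enteredRun = new CountDownLatch(1);
  private final CountDownLatch evicted = new CountDownLatch(1);
  private boolean evictRequested = false;   // guarded by lock

  @Override
  public void run() {
    enteredRun.countDown();                 // the "flag": run() has started
    synchronized (lock) {
      while (!evictRequested) {             // loop guards against a lost notify
        try {
          lock.wait();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
    evicted.countDown();                    // stands in for the real eviction work
  }

  /** Requests one eviction pass. */
  void evict() {
    synchronized (lock) {
      evictRequested = true;
      lock.notifyAll();
    }
  }

  /** Returns true once run() has been entered, or false on timeout. */
  boolean awaitEnteredRun(long timeoutMs) throws InterruptedException {
    return enteredRun.await(timeoutMs, TimeUnit.MILLISECONDS);
  }

  /** Returns true once the requested eviction has been observed. */
  boolean awaitEvicted(long timeoutMs) throws InterruptedException {
    return evicted.await(timeoutMs, TimeUnit.MILLISECONDS);
  }
}
```

The timeouts here are only upper bounds: await() returns as soon as the latch 
is counted down, so the test does not sleep for a fixed interval.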

 TestLruBlockCache#testBackgroundEvictionThread fails occasionally
 -

 Key: HBASE-5672
 URL: https://issues.apache.org/jira/browse/HBASE-5672
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5672.patch


 We find that TestLruBlockCache#testBackgroundEvictionThread fails occasionally.
 I think it's a problem with the test case,
 because runEviction() only does evictionThread.evict():
 {code}
 public void evict() {
   synchronized(this) {
     this.notify(); // FindBugs NN_NAKED_NOTIFY
   }
 }
 {code}
 However, when we call evictionThread.evict(), the evictionThread may not yet 
 have entered run() in TestLruBlockCache#testBackgroundEvictionThread.
 If we run the test many times, we can easily reproduce the failure.





[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range

2012-04-02 Thread Wouter Bolsterlee (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wouter Bolsterlee updated HBASE-5694:
-

Status: Open  (was: Patch Available)

 getRowsWithColumnsTs function Thrift service incorrectly handles time range
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.92.2

 Attachments: HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside the wrong if statement.





[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2012-04-02 Thread Wouter Bolsterlee (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wouter Bolsterlee updated HBASE-5694:
-

Attachment: HBASE-5694-trunk-20120402.patch

Patch against SVN trunk as of today. It's a one-liner that moves the 
setTimeRange() call outside the if (columns != null) block.
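
To illustrate the scoping issue, here is a self-contained sketch. ScanStub is a 
hypothetical stand-in, not the real HBase client class or the Thrift handler, 
and the lower bound of 0 passed to setTimeRange() is an assumption made only 
for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the query object; not an HBase class.
class ScanStub {
  final List<String> columns = new ArrayList<>();
  long minStamp = 0L;
  long maxStamp = Long.MAX_VALUE;   // default: no upper bound on timestamps

  void addColumn(String column) { columns.add(column); }

  void setTimeRange(long min, long max) { minStamp = min; maxStamp = max; }
}

class TimeRangeScopingSketch {
  // Shape of the bug: the time range is applied only when columns are given.
  static ScanStub buggy(List<String> columns, long timestamp) {
    ScanStub scan = new ScanStub();
    if (columns != null) {
      for (String column : columns) {
        scan.addColumn(column);
      }
      scan.setTimeRange(0L, timestamp);
    }
    return scan;
  }

  // Shape of the fix: the call sits outside the block, so it always applies.
  static ScanStub fixed(List<String> columns, long timestamp) {
    ScanStub scan = new ScanStub();
    if (columns != null) {
      for (String column : columns) {
        scan.addColumn(column);
      }
    }
    scan.setTimeRange(0L, timestamp);
    return scan;
  }
}
```

With columns left unspecified (null), buggy() returns a scan with the default 
unbounded range, while fixed() honours the requested timestamp.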

 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.92.2

 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside the wrong if statement.





[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2012-04-02 Thread Wouter Bolsterlee (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wouter Bolsterlee updated HBASE-5694:
-

Summary: getRowsWithColumnsTs() in Thrift service handles timestamps 
incorrectly  (was: getRowsWithColumnsTs function Thrift service incorrectly 
handles time range)

 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.92.2

 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside the wrong if statement.





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244359#comment-13244359
 ] 

Lars Hofhansl commented on HBASE-5682:
--

Presumably close is not needed since the connection is known to be down in this 
case. To be safe, I'll add that, and make sure it doesn't cause another hang.

I think this is better than HBASE-5153, because it attempts to reconnect when 
the connection is needed and not when it was lost (in which case it is likely 
that the next retry will fail as well, leading to long hangs with no chance for 
the caller to notice).
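
The reconnect-on-demand idea can be sketched roughly as follows. LazyHandle is 
a hypothetical class written for illustration; it is not 
HConnectionImplementation and leaves out everything ZooKeeper-specific. The 
resource is created only when a caller asks for it, and a loss merely clears 
the reference so that the next caller triggers the reconnect.

```java
import java.util.function.Supplier;

// Hypothetical holder illustrating lazy (re)connection; not HBase code.
class LazyHandle<T> {
  private final Supplier<T> factory;
  private T current;   // null until first use, and again after a loss

  LazyHandle(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns the live resource, creating it on demand if needed. */
  synchronized T get() {
    if (current == null) {
      current = factory.get();   // connect only when a caller needs it
    }
    return current;
  }

  /** Called when the connection is known to be down; does not reconnect. */
  synchronized void markLost() {
    current = null;
  }
}
```

Because markLost() does not retry, a failure to reconnect surfaces in the 
caller's own get() call rather than in a background retry loop.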


 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244362#comment-13244362
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).

The argument above for 'length' applies here too... 


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 66
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.  
bq.   Which part in here is the 'header'?  How does it relate to 
ConnectionHeaderProto?
bq.   
bq.   request can be an Invocation/Writable?  Or a protobuf?  Do we need a 
length in here?
bq.  
bq.  Devaraj Das wrote:
bq.  Today the only 'header' is the callId.. There is no relation to 
ConnectionHeaderProto. If the 'header' is confusing, I can take it off the 
object name. Let me know.
bq.  
bq.  'request' in this patch is only a Invocation/Writable. In theory, it 
could be a protobuf object as well (since it is just bytes), but, for protobuf, 
we could make things more explicit by defining a protobuf object rather than a 
opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation 
similar to Hadoop). Length is not needed - the protobuf 
serialization/deserialization will take care of it..
bq.  
bq.  Michael Stack wrote:
bq.  I think taking the 'Header' off Request/Response would be best (Did I 
ask you to add it previously?  If so, sorry... I misunderstood.  Thanks for being 
accommodating).   Yes, on a new issue to make it pb rather than opaque bytes.  
Do you have to do something here -- make bytes optional? -- to allow for the 
later pb replacement?
bq.  
bq.  On length, that's probably good to keep.  For us, we'll give the stream 
to a pb deserializer but other clients might want to know how many bytes on the 
line so keep it I'd say.

Yes, I'll take off the 'header' from the message name. I could make the 'bytes' 
field optional.

Actually, on the length, I am not sure I understand why we need it in the PB 
model. Generally speaking, clients talking to servers have to be aware of the 
PB encoding in order for them to make any sense of the PB data.. The PB type 
'bytes' has the length taken care of in the implementation of 
serialization/deserialization internally. In that sense, I don't think having 
an explicit length field is required. Does this reasoning make sense?

(Also note that the top level RPC request envelope has the length preceding the 
request data)
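
As a rough illustration of the envelope-level length (a sketch only: 
LengthPrefixedFrame is a hypothetical helper and the 4-byte big-endian prefix 
is an assumption, not the actual HBase wire format), a reader that sees the 
length first knows exactly how many bytes to pull before handing the payload 
to a deserializer:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical framing helper; the 4-byte prefix is an assumption.
class LengthPrefixedFrame {
  /** Encodes one frame: a 4-byte length followed by the payload bytes. */
  static byte[] encode(byte[] payload) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    out.writeInt(payload.length);
    out.write(payload);
    out.flush();
    return buffer.toByteArray();
  }

  /** Reads exactly one frame and leaves the stream at the next frame. */
  static byte[] readFrame(DataInputStream in) throws IOException {
    int length = in.readInt();
    byte[] payload = new byte[length];
    in.readFully(payload);   // blocks until the whole payload has arrived
    return payload;
  }
}
```

Since the frame boundary is already known at this level, a protobuf message 
carried as the payload does not need a second length field of its own.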


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---


On 2012-03-30 23:29:32, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4096/
bq.  ---
bq.  
bq.  (Updated 2012-03-30 23:29:32)
bq.  
bq.  
bq.  Review request for Michael Stack and Benoit Sigoure.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Switch RPC call envelope/headers to PBs
bq.  
bq.  
bq.  This addresses bug HBASE-5451.
bq.  https://issues.apache.org/jira/browse/HBASE-5451
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java
 PRE-CREATION 
bq.

[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244363#comment-13244363
 ] 

Lars Hofhansl commented on HBASE-5682:
--

Oh, and thanks for taking a look Jieshan :)

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244365#comment-13244365
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 66
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.  
bq.   Which part in here is the 'header'?  How does it relate to 
ConnectionHeaderProto?
bq.   
bq.   request can be an Invocation/Writable?  Or a protobuf?  Do we need a 
length in here?
bq.  
bq.  Devaraj Das wrote:
bq.  Today the only 'header' is the callId.. There is no relation to 
ConnectionHeaderProto. If the 'header' is confusing, I can take it off the 
object name. Let me know.
bq.  
bq.  'request' in this patch is only a Invocation/Writable. In theory, it 
could be a protobuf object as well (since it is just bytes), but, for protobuf, 
we could make things more explicit by defining a protobuf object rather than a 
opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation 
similar to Hadoop). Length is not needed - the protobuf 
serialization/deserialization will take care of it..
bq.  
bq.  Michael Stack wrote:
bq.  I think taking the 'Header' off Request/Response would be best (Did I 
ask you to add it previously?  If so, sorry... I misunderstood.  Thanks for being 
accommodating).   Yes, on a new issue to make it pb rather than opaque bytes.  
Do you have to do something here -- make bytes optional? -- to allow for the 
later pb replacement?
bq.  
bq.  On length, that's probably good to keep.  For us, we'll give the stream 
to a pb deserializer but other clients might want to know how many bytes on the 
line so keep it I'd say.
bq.  
bq.  Devaraj Das wrote:
bq.  Yes, I'll take off the 'header' from the message name. I could make 
the 'bytes' field optional.
bq.  
bq.  Actually, on the length, I am not sure I understand why we need it in 
the PB model. Generally speaking, clients talking to servers have to be aware 
of the PB encoding in order for them to make any sense of the PB data.. The PB 
type 'bytes' has the length taken care of in the implementation of 
serialization/deserialization internally. In that sense, I don't think having 
an explicit length field is required. Does this reasoning make sense?
bq.  
bq.  (Also note that the top level RPC request envelope has the length 
preceding the request data)

If the top level rpc request envelope has the length, then I agree w/ you, it's 
not needed as a prefix on pb messages.


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...

Agreed.  So high level, the response and request have a length of the total 
message?  If so, don't need it down inside preceding pb messages.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---


On 2012-03-30 23:29:32, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4096/
bq.  ---
bq.  
bq.  (Updated 2012-03-30 23:29:32)
bq.  
bq.  
bq.  Review request for Michael Stack and Benoit Sigoure.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Switch RPC call envelope/headers to PBs
bq.  
bq.  
bq.  This addresses bug HBASE-5451.
bq.  https://issues.apache.org/jira/browse/HBASE-5451
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
 

[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244367#comment-13244367
 ] 

stack commented on HBASE-5682:
--

@Nkeywal  How's this relate to your TRUNK work (if at all)?

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244368#comment-13244368
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 66
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.  
bq.   Which part in here is the 'header'?  How does it relate to 
ConnectionHeaderProto?
bq.   
bq.   request can be an Invocation/Writable?  Or a protobuf?  Do we need a 
length in here?
bq.  
bq.  Devaraj Das wrote:
bq.  Today the only 'header' is the callId.. There is no relation to 
ConnectionHeaderProto. If the 'header' is confusing, I can take it off the 
object name. Let me know.
bq.  
bq.  'request' in this patch is only an Invocation/Writable. In theory, it 
could be a protobuf object as well (since it is just bytes), but, for protobuf, 
we could make things more explicit by defining a protobuf object rather than an 
opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation 
similar to Hadoop). Length is not needed - the protobuf 
serialization/deserialization will take care of it..
bq.  
bq.  Michael Stack wrote:
bq.  I think taking the 'Header' off Request/Response would be best (Did I 
ask you to add it previously?  If so, sorry... I misunderstood.  Thanks for being 
accommodating).   Yes, on a new issue to make it pb rather than opaque bytes.  
Do you have to do something here -- make bytes optional? -- to allow for the 
later pb replacement?
bq.  
bq.  On length, that's probably good to keep.  For us, we'll give the stream 
to a pb deserializer but other clients might want to know how many bytes are on the 
line, so keep it I'd say.
bq.  
bq.  Devaraj Das wrote:
bq.  Yes, I'll take off the 'header' from the message name. I could make 
the 'bytes' field optional.
bq.  
bq.  Actually, on the length, I am not sure I understand why we need it in 
the PB model. Generally speaking, clients talking to servers have to be aware 
of the PB encoding in order for them to make any sense of the PB data.. The PB 
type 'bytes' has the length taken care of in the implementation of 
serialization/deserialization internally. In that sense, I don't think having 
an explicit length field is required. Does this reasoning make sense?
bq.  
bq.  (Also note that the top level RPC request envelope has the length 
preceding the request data)
bq.  
bq.  Michael Stack wrote:
bq.  If the top level rpc request envelope has the length, then I agree w/ 
you, it's not needed as a prefix on pb messages.

cool


bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...
bq.  
bq.  Michael Stack wrote:
bq.  Agreed.  So high level, the response and request have a length of the 
total message?  If so, don't need it down inside preceding pb messages.

I meant the argument on the PB encoding.. 

The RPC response envelope, even today, doesn't include the length. For 
instance, the client side of the method HBaseClient.receiveResponse starts with 
reading the callId. 


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---


On 2012-03-30 23:29:32, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4096/
bq.  ---
bq.  
bq.  (Updated 2012-03-30 23:29:32)
bq.  
bq.  
bq.  Review request for Michael Stack and Benoit Sigoure.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Switch RPC call envelope/headers to PBs
bq.  
bq.  
bq.  This addresses bug HBASE-5451.
bq.  

[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244377#comment-13244377
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...
bq.  
bq.  Michael Stack wrote:
bq.  Agreed.  So high level, the response and request have a length of the 
total message?  If so, don't need it down inside preceding pb messages.
bq.  
bq.  Devaraj Das wrote:
bq.  I meant the argument on the PB encoding.. 
bq.  
bq.  The RPC response envelope, even today, doesn't include the length. For 
instance, the client side of the method HBaseClient.receiveResponse starts with 
reading the callId.

Ok.  We are replicating what was there previously.  Let's make a new jira for doing 
things like a length prefix.
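
For reference, a length prefix of the kind deferred to a new jira could look roughly like the sketch below: a 4-byte big-endian length followed by the message bytes, so that a client knows how much to pull before it hands the bytes to a deserializer. This is only an assumption about what such framing might look like; the class and method names are hypothetical and nothing here is taken from HBaseClient or HBaseServer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixSketch {
    /** Frames a message as a 4-byte big-endian length followed by the payload. */
    static byte[] frame(byte[] message) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(message.length);
        out.write(message);
        out.flush();
        return buf.toByteArray();
    }

    /** Reads one frame: the length first, then exactly that many bytes. */
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] message = new byte[length];
        in.readFully(message); // blocks until the whole message has arrived
        return message;
    }

    public static void main(String[] args) throws IOException {
        byte[] framed = frame("callId=7".getBytes(StandardCharsets.UTF_8));
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
    }
}
```

The design choice is the one raised earlier in the thread: the reader can size its buffer up front and skip a message it does not understand, at the cost of four extra bytes per message.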


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---


On 2012-03-30 23:29:32, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4096/
bq.  ---
bq.  
bq.  (Updated 2012-03-30 23:29:32)
bq.  
bq.  
bq.  Review request for Michael Stack and Benoit Sigoure.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Switch RPC call envelope/headers to PBs
bq.  
bq.  
bq.  This addresses bug HBASE-5451.
bq.  https://issues.apache.org/jira/browse/HBASE-5451
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java
 PRE-CREATION 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java
 1307644 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/4096/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



 Switch RPC call envelope/headers to PBs
 ---

 Key: HBASE-5451
 URL: https://issues.apache.org/jira/browse/HBASE-5451
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Affects Versions: 0.94.0
Reporter: Todd Lipcon
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, 
 rpc-proto.r5.txt








[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244378#comment-13244378
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...
bq.  
bq.  Michael Stack wrote:
bq.  Agreed.  So high level, the response and request have a length of the 
total message?  If so, don't need it down inside preceding pb messages.
bq.  
bq.  Devaraj Das wrote:
bq.  I meant the argument on the PB encoding.. 
bq.  
bq.  The RPC response envelope, even today, doesn't include the length. For 
instance, the client side of the method HBaseClient.receiveResponse starts with 
reading the callId.
bq.  
bq.  Michael Stack wrote:
bq.  Ok.  We are replicating what was there previously.  Let's make a new jira 
for doing things like a length prefix.

Okay let's discuss that in a separate jira.. 

Otherwise, do you think the patch is good to go? If so, I'll submit a new patch 
with some of the comments incorporated.


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---





 Switch RPC call envelope/headers to PBs
 ---

 Key: HBASE-5451
 URL: https://issues.apache.org/jira/browse/HBASE-5451
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Affects Versions: 0.94.0
Reporter: Todd Lipcon
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, 
 rpc-proto.r5.txt








[jira] [Resolved] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2012-04-02 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5694.
--

   Resolution: Fixed
Fix Version/s: (was: 0.92.2)
   0.94.0
 Hadoop Flags: Reviewed

Applied to 0.94 and to trunk.  Thanks for the patch Wouter (I ran 
testthriftserver locally and it passed).  Doesn't look like thrift2 has similar 
code so passed on trying to apply the patch there.

 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.94.0

 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside a wrong if statement.
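
 The scoping issue described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Thrift handler code: ScanSpec is a hypothetical stand-in for the real scan object, and maxTimestamp stands in for the time range.

```java
import java.util.ArrayList;
import java.util.List;

public class TimestampScopeSketch {
    /** Hypothetical stand-in for a scan request; not the HBase Scan class. */
    static final class ScanSpec {
        final List<String> columns = new ArrayList<>();
        long maxTimestamp = Long.MAX_VALUE; // no upper bound unless set
    }

    /** Buggy shape: the timestamp is applied only when columns are given. */
    static ScanSpec buggy(List<String> columns, long timestamp) {
        ScanSpec spec = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            spec.columns.addAll(columns);
            spec.maxTimestamp = timestamp; // wrong scope: skipped without columns
        }
        return spec;
    }

    /** Fixed shape: the timestamp is applied regardless of the columns. */
    static ScanSpec fixed(List<String> columns, long timestamp) {
        ScanSpec spec = new ScanSpec();
        spec.maxTimestamp = timestamp;
        if (columns != null && !columns.isEmpty()) {
            spec.columns.addAll(columns);
        }
        return spec;
    }

    public static void main(String[] args) {
        System.out.println(buggy(null, 42L).maxTimestamp == Long.MAX_VALUE); // prints true
        System.out.println(fixed(null, 42L).maxTimestamp); // prints 42
    }
}
```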





[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244389#comment-13244389
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...
bq.  
bq.  Michael Stack wrote:
bq.  Agreed.  So high level, the response and request have a length of the 
total message?  If so, don't need it down inside preceding pb messages.
bq.  
bq.  Devaraj Das wrote:
bq.  I meant the argument on the PB encoding.. 
bq.  
bq.  The RPC response envelope, even today, doesn't include the length. For 
instance, the client side of the method HBaseClient.receiveResponse starts with 
reading the callId.
bq.  
bq.  Michael Stack wrote:
bq.  Ok.  We are replicating what was there previously.  Let's make a new jira 
for doing things like a length prefix.
bq.  
bq.  Devaraj Das wrote:
bq.  Okay let's discuss that in a separate jira.. 
bq.  
bq.  Otherwise, do you think the patch is good to go? If so, I'll submit a 
new patch with some of the comments incorporated.

There are items above you said you'd address, such as removing Header from the 
request and response and cleaning up doc in the .proto file, right?


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---





 Switch RPC call envelope/headers to PBs
 ---

 Key: HBASE-5451
 URL: https://issues.apache.org/jira/browse/HBASE-5451
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Affects Versions: 0.94.0
Reporter: Todd Lipcon
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, 
 rpc-proto.r5.txt








[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244400#comment-13244400
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-04-02 00:21:20, Michael Stack wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto,
 line 93
bq.   https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.  
bq.   Should this precede the response?  So if false, a response follows 
else an exception?  Do we need a length here?  Where is the header that the 
message name refers to?
bq.  
bq.  Devaraj Das wrote:
bq.  Length will be taken care of by the protobuf 
serialization/deserialization. The header is the combination of callId, error. 
If the 'header' is confusing, I can take it off the object name. Let me know.
bq.  
bq.  Michael Stack wrote:
bq.  Yeah, take away the header.  Length I think is good.  Makes it more 
robust (IIRC, we went out of our way to add length to the old RPC to help 
clients figure how much to pull).
bq.  
bq.  Devaraj Das wrote:
bq.  The argument above for 'length' applies here too...
bq.  
bq.  Michael Stack wrote:
bq.  Agreed.  So high level, the response and request have a length of the 
total message?  If so, don't need it down inside preceding pb messages.
bq.  
bq.  Devaraj Das wrote:
bq.  I meant the argument on the PB encoding.. 
bq.  
bq.  The RPC response envelope, even today, doesn't include the length. For 
instance, the client side of the method HBaseClient.receiveResponse starts with 
reading the callId.
bq.  
bq.  Michael Stack wrote:
bq.  Ok.  We are replicating what was there previously.  Let's make a new jira 
for doing things like a length prefix.
bq.  
bq.  Devaraj Das wrote:
bq.  Okay let's discuss that in a separate jira.. 
bq.  
bq.  Otherwise, do you think the patch is good to go? If so, I'll submit a 
new patch with some of the comments incorporated.
bq.  
bq.  Michael Stack wrote:
bq.  There are items above you said you'd address such as removing Header from 
the request and response and cleaning up doc in the .proto file, right?

Correct .. that's what I meant to include in the new patch.


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6613
---





 Switch RPC call envelope/headers to PBs
 ---

 Key: HBASE-5451
 URL: https://issues.apache.org/jira/browse/HBASE-5451
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Affects Versions: 0.94.0
Reporter: Todd Lipcon
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, 
 rpc-proto.r5.txt








[jira] [Created] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer

2012-04-02 Thread binlijin (Created) (JIRA)
Should we use muti HLog or Writer in HLog in a HRegionServer


 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin








[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244418#comment-13244418
 ] 

stack commented on HBASE-5689:
--

Good one Chunhui.  I think the patch is good.

Nice reproduction of the problem in a test.  Where in the test do you find that 
we've lost the third edit?

So we name the file for its first edit when we write it, then when we move it 
into place, we rename it for the last edit in the file?  Add a comment to that 
effect I'd say, else it could be confusing.  Hmm... I suppose you have it here in 
the doc for getCompletedRecoveredEditsFilePath.  That's probably good enough.. 
but no harm explaining why we go from naming the file w/ its first edit to naming 
it for the last edit.



 Skipping RecoveredEdits may cause data loss
 ---

 Key: HBASE-5689
 URL: https://issues.apache.org/jira/browse/HBASE-5689
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5689-simplified.txt, 5689-testcase.patch, 
 HBASE-5689.patch


 Consider the following scenario:
 1. The region is on server A.
 2. Put KV(r1-v1) to the region.
 3. Move the region from server A to server B.
 4. Put KV(r2-v2) to the region.
 5. Move the region from server B to server A.
 6. Put KV(r3-v3) to the region.
 7. kill -9 server B and start it.
 8. kill -9 server A and start it.
 9. Scan the region: we only get two KVs (r1-v1, r2-v2); the third, 
 KV(r3-v3), is lost.
 Let's analyse the scenario above from the code:
 1. The edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same 
 hlog file on server A.
 2. When we split server B's hlog file in the process of ServerShutdownHandler, 
 we create one RecoveredEdits file f1 for the region.
 3. When we split server A's hlog file in the process of ServerShutdownHandler, 
 we create another RecoveredEdits file f2 for the region.
 4. However, RecoveredEdits file f2 will be skipped when initializing the region in 
 HRegion#replayRecoveredEditsIfAny:
 {code}
   for (Path edits: files) {
     if (edits == null || !this.fs.exists(edits)) {
       LOG.warn("Null or non-existent edits file: " + edits);
       continue;
     }
     if (isZeroLengthThenDelete(this.fs, edits)) continue;
     if (checkSafeToSkip) {
       Path higher = files.higher(edits);
       long maxSeqId = Long.MAX_VALUE;
       if (higher != null) {
         // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
         String fileName = higher.getName();
         maxSeqId = Math.abs(Long.parseLong(fileName));
       }
       if (maxSeqId <= minSeqId) {
         String msg = "Maximum possible sequenceid for this log is " + maxSeqId
             + ", skipped the whole file, path=" + edits;
         LOG.debug(msg);
         continue;
       } else {
         checkSafeToSkip = false;
       }
     }
     // (remainder of the loop is not shown in the report)
 {code}
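
 To make the failure concrete, here is a minimal, self-contained sketch of the skip check, assuming the comparison is maxSeqId <= minSeqId and that each recovered-edits file is named for the sequence id of its first edit. The class and method names are hypothetical. The sequence ids 1, 2, 3 for r1, r2, r3 and minSeqId = 2 (the region was flushed on each move, so its store files already cover edits 1 and 2) are also assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class RecoveredEditsSkipSketch {
    /** Returns the file names (as sequence ids) that the skip check would drop. */
    static List<Long> skippedFiles(TreeSet<Long> fileNames, long minSeqId) {
        List<Long> skipped = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Long name : fileNames) {
            if (checkSafeToSkip) {
                // The next file's name is taken as an upper bound on this file's edits.
                Long higher = fileNames.higher(name);
                long maxSeqId = (higher == null) ? Long.MAX_VALUE : Math.abs(higher);
                if (maxSeqId <= minSeqId) {
                    skipped.add(name);
                    continue;
                } else {
                    checkSafeToSkip = false;
                }
            }
            // A real implementation would replay the file here.
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Named for the first edit: f2 (edits 1 and 3) is "1", f1 (edit 2) is "2".
        TreeSet<Long> byFirstEdit = new TreeSet<>(Arrays.asList(1L, 2L));
        System.out.println(skippedFiles(byFirstEdit, 2L)); // prints [1]

        // Named for the last edit: f1 is "2", f2 is "3".
        TreeSet<Long> byLastEdit = new TreeSet<>(Arrays.asList(2L, 3L));
        System.out.println(skippedFiles(byLastEdit, 2L)); // prints []
    }
}
```

 In the first case the file named 1 is skipped even though it also holds edit 3, which matches the lost KV(r3-v3). Naming files for their last edit avoids the skip in this toy model; the actual patch may change the check in other ways as well.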
  





[jira] [Commented] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244419#comment-13244419
 ] 

stack commented on HBASE-5699:
--

Please provide more detail on what this issue is about and correct the subject 
so it's properly spelled.  Thanks.

 Should we use muti HLog or Writer in HLog in a HRegionServer
 

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin







[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244420#comment-13244420
 ] 

Lars Hofhansl commented on HBASE-5682:
--

One other strangeness I found is that none of the ZKUtil methods actually throw 
exceptions. They retry (via RecoverableZooKeeper) and then just log a message 
if there is a failure. This is especially annoying with ZooKeeperWatcher, 
because there is no way of telling from the outside whether the connection 
succeeded or not.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244436#comment-13244436
 ] 

stack commented on HBASE-5682:
--

Can we add an isAlive to ZKW?

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Issue Comment Edited] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer

2012-04-02 Thread stack (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244419#comment-13244419
 ] 

stack edited comment on HBASE-5699 at 4/2/12 6:48 PM:
--

Please provide more detail on what this issue is about and correct the subject 
so it's properly spelled.  Thanks.

  was (Author: stack):
Please provide more detail on what this issue is about and correct the 
subject so its properly spelled.  Thanks.
  
 Should we use muti HLog or Writer in HLog in a HRegionServer
 

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin







[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1320#comment-1320
 ] 

nkeywal commented on HBASE-5682:


bq. none of ZKUtil methods actually throw exceptions
From what I see on 0.96 it should, as the return is not reached: the pattern 
is to call keeperException, and keeperException throws an exception.
{noformat}
  public void keeperException(KeeperException ke)
  throws KeeperException {
    LOG.error(prefix("Received unexpected KeeperException, re-throwing exception"), ke);
    throw ke;
  }
{noformat}


 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244456#comment-13244456
 ] 

stack commented on HBASE-3909:
--

Nothing in hadoop-7001 guarantees that what is in the *.xml files is in 
agreement w/ what gets POSTed to the daemon, right? 

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5666:
---

Attachment: HBASE-5666-v4.patch

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once if the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2012-04-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244462#comment-13244462
 ] 

Hudson commented on HBASE-5694:
---

Integrated in HBase-0.94 #78 (See 
[https://builds.apache.org/job/HBase-0.94/78/])
HBASE-5694 getRowsWithColumnsTs() in Thrift service handles timestamps 
incorrectly (Revision 1308446)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java


 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.94.0

 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside a wrong if statement.
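
The scoping problem described above can be illustrated with a small, self-contained sketch. This is not the ThriftServerRunner code: FakeScan, buildBuggy and buildFixed are hypothetical names invented here, and the field maxTimestamp only stands in for whatever time range the real request object carries.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeRangeScoping {
    /** Hypothetical stand-in for a scan request; not the HBase API. */
    public static class FakeScan {
        public final List<String> columns = new ArrayList<>();
        public long maxTimestamp = Long.MAX_VALUE; // means "no time range"
    }

    /** Buggy shape: the timestamp is applied only when columns are given. */
    public static FakeScan buildBuggy(List<String> columns, long timestamp) {
        FakeScan scan = new FakeScan();
        if (columns != null) {
            scan.columns.addAll(columns);
            scan.maxTimestamp = timestamp; // set inside the wrong if statement
        }
        return scan;
    }

    /** Fixed shape: the timestamp is applied whether or not columns are given. */
    public static FakeScan buildFixed(List<String> columns, long timestamp) {
        FakeScan scan = new FakeScan();
        if (columns != null) {
            scan.columns.addAll(columns);
        }
        scan.maxTimestamp = timestamp;
        return scan;
    }

    public static void main(String[] args) {
        // With no columns, the buggy variant silently ignores the timestamp.
        System.out.println("buggy: " + buildBuggy(null, 42L).maxTimestamp);
        System.out.println("fixed: " + buildFixed(null, 42L).maxTimestamp);
    }
}
```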





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244468#comment-13244468
 ] 

Lars Hofhansl commented on HBASE-5682:
--

Yeah, my comment was wrong. It's not generally doing that.

What I do find is that if the ZK quorum is down, none of getZookeeperWatcher(), 
masterAddressTracker.start(), and rootRegionTracker.start() actually fail. They 
just retry and then happily return, which is as designed, because they are 
asynchronous.
It would be nice to have an isAlive or waitForConnect method on ZKW that would 
throw if the connection could not be established.

The attached patch is still a vast improvement, but it could be made better 
(even with the zk timeout set to 100ms and retries to 3, it still takes 22s for 
ensureZookeeperTrackers to finish).
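
A minimal sketch of what an isAlive/waitForConnect pair could look like, assuming the watcher's event callback can signal the first successful connect. ConnectGate and markConnected are hypothetical names, not part of ZooKeeperWatcher, and the sketch does not talk to ZooKeeper at all.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ConnectGate {
    // Released once the (hypothetical) watcher callback reports a connect.
    private final CountDownLatch connected = new CountDownLatch(1);

    /** To be called from the watcher's event callback on a successful connect. */
    public void markConnected() {
        connected.countDown();
    }

    /** Non-blocking check: has a connect been reported at least once? */
    public boolean isAlive() {
        return connected.getCount() == 0;
    }

    /** Blocks up to timeoutMs and throws if no connect was reported in time. */
    public void waitForConnect(long timeoutMs)
            throws IOException, InterruptedException {
        if (!connected.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IOException(
                "ZooKeeper connection not established within " + timeoutMs + "ms");
        }
    }
}
```

Note that this only records the first connect; it does not track a later disconnect, and the caller still has to pick a timeout.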


 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244470#comment-13244470
 ] 

Lars Hofhansl commented on HBASE-5682:
--

Even isAlive or waitForConnect would need to rely on a timeout, so we wouldn't 
have won anything really.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 
 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244471#comment-13244471
 ] 

stack commented on HBASE-5666:
--

On creation of ZooKeeperWatcher, we do the following.  Why is it not sufficient?

{code}
  // The first call against zk can fail with connection loss.  Seems common.
  // Apparently this is recoverable.  Retry a while.
  // See http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
  // TODO: Generalize out in ZKUtil.
  long wait = conf.getLong(HConstants.ZOOKEEPER_RECOVERABLE_WAITTIME,
      HConstants.DEFAULT_ZOOKEPER_RECOVERABLE_WAITIME);
  long finished = System.currentTimeMillis() + wait;
  KeeperException ke = null;
  do {
    try {
      ZKUtil.createAndFailSilent(this, baseZNode);
      ke = null;
      break;
    } catch (KeeperException.ConnectionLossException e) {
      if (LOG.isDebugEnabled() && (isFinishedRetryingRecoverable(finished))) {
        LOG.debug("Retrying zk create for another " +
          (finished - System.currentTimeMillis()) +
          "ms; set 'hbase.zookeeper.recoverable.waittime' to change " +
          "wait time); " + e.getMessage());
      }
      ke = e;
    }
  } while (isFinishedRetryingRecoverable(finished));
{code}

Is the wait too short?

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once if the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Updated] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5682:
-

Attachment: 5682-all-v4.txt

I think this is as good as we can get in 0.94.
# Removed the exception handling from ensureZookeeperTrackers, since none of 
these methods throw.
# Added getZookeeperWatcher to two methods that just need a ZKW.

The key is that an HConnection will never be left in a permanently useless 
state. Can file another jira for better timeouts.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 
 5682-all.txt, 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244482#comment-13244482
 ] 

Uma Maheswara Rao G commented on HBASE-3909:


Yes. As per my understanding, HADOOP-7001 assumes that OM/other tools will 
update *.xml and POST the same configs to the daemon to update the in-memory 
values. 

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244484#comment-13244484
 ] 

nkeywal commented on HBASE-5682:


In 0.96 this should work, with the restriction that you can get a non-working 
connection, which will get fixed when you try to use it. It's a different 
mechanism than the one for HBaseAdmin, as HBaseAdmin first checks the 
connection. The ZK mechanism is more efficient (you save a remote call to 
check that the connection is really working), but is more complex.

However, it seems it does not work in the end:
bq. What I saw in 0.96 is that the client was blocked for a very long time 
(gave up after a few minutes), even though I had set all timeouts to low 
values. This is also deadly in an app server setting. Might be a simple fix 
there, didn't dig deeper.

@lars What exactly did you do? I can do the fix on 0.96.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 
 5682-all.txt, 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Matteo Bertozzi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244488#comment-13244488
 ] 

Matteo Bertozzi commented on HBASE-5666:


The problem here is that there's no ConnectionLossException... if you take a 
look at the log you can see that there's no KeeperException; ZooKeeper simply 
responds that the base node doesn't exist.
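
Since no exception is thrown in this case, a retry has to poll the existence check itself. Below is a minimal sketch of such a poll with a deadline; it is not the attached patch. BaseNodeWait, waitForBaseNode and the BooleanSupplier standing in for the ZooKeeper existence check are all assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

public class BaseNodeWait {
    /**
     * Polls the check until it returns true or the timeout elapses.
     * Returns true if the node was seen, false on timeout or interrupt.
     */
    public static boolean waitForBaseNode(BooleanSupplier baseNodeExists,
                                          long timeoutMs, long sleepMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (baseNodeExists.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a master that creates the base znode 50ms after startup.
        final long createdAt = System.currentTimeMillis() + 50;
        boolean found = waitForBaseNode(
            () -> System.currentTimeMillis() >= createdAt, 1000, 10);
        System.out.println("base node available: " + found);
    }
}
```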

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once if the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244489#comment-13244489
 ] 

Todd Lipcon commented on HBASE-3909:


I think adding a refreshConfigs admin command is a good idea. It can re-read 
the configs off the local disk, and emit warnings for any configs that changed 
that were not runtime-reconfigurable.
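
A rough sketch of the re-read-and-warn idea, assuming configs are plain string maps and that the set of runtime-reconfigurable keys is known up front. ConfigRefresh, refresh and the warning text are hypothetical; this is neither HBase nor HADOOP-7001 code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class ConfigRefresh {
    /**
     * Applies changed values from 'reloaded' to 'current' for keys that are
     * runtime-reconfigurable, and returns one warning per changed key that
     * is not. 'current' must be a mutable map.
     */
    public static List<String> refresh(Map<String, String> current,
                                       Map<String, String> reloaded,
                                       Set<String> reconfigurable) {
        List<String> warnings = new ArrayList<>();
        // Union of keys, copied so that 'current' can be modified in the loop.
        Set<String> keys = new TreeSet<>(current.keySet());
        keys.addAll(reloaded.keySet());
        for (String key : keys) {
            String oldVal = current.get(key);
            String newVal = reloaded.get(key);
            if (Objects.equals(oldVal, newVal)) {
                continue; // unchanged
            }
            if (reconfigurable.contains(key)) {
                if (newVal == null) {
                    current.remove(key);
                } else {
                    current.put(key, newVal);
                }
            } else {
                warnings.add("Config '" + key
                    + "' changed on disk but is not runtime-reconfigurable");
            }
        }
        return warnings;
    }
}
```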

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244491#comment-13244491
 ] 

stack commented on HBASE-4393:
--

Please format your contrib as a patch (git or svn add then do a git diff 
--no-prefix or svn diff).  Thanks.

This line is not necessary any more:

{code}
 * Copyright 2012 The Apache Software Foundation
{code}

Please fix this doc: ' * HBase Canary Tool, that that can be used to do'  (too 
many 'that's)

On the Sink interface, it's not going to be used by anyone else if it's private?  
That might be fine for first checkin.  Later, when there are other Sinks, we can 
open it up?

I think filesink is the wrong sink to do as first implementation.  Your first 
Sink should be StdOutSink using Logging system.  Notice how anything that is 
started with bin/hbase-daemon.sh gets log files set up for it (master, 
regionserver, but also rest, thrift, etc.).  Doing this, your emissions will be 
in a well-known place in files that are named with a format that matches other 
loggings made by hbase, etc.

This method is oddly named:

{code}
 public void publish(HRegionInfo region, HColumnDescriptor column, long msTime) 
{
{code}

It seems like it's for logging messages like this: "%s read from region %s 
column family %s in %dms\n",

... should method name be logReadTime?  Or publishReadTiming?

What does the BasicParser do?  It matches what Tool does?  We don't want GnuParser?

I like this comment:

{code}
// user has specified an interval for canary breaths
{code}

That's cute.

Put on one line:

{code}
if (conf == null)
  conf = HBaseConfiguration.create();
{code}

I think I should be able to run this once OR run it as a daemon.   Pass an arg 
if it's to run as a daemon process?

Can this code use any of the utility that is in hbck?

I like the Tool improvements.

Thanks Matteo.
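
To make the naming and StdOutSink suggestions above concrete, here is a minimal sketch. It is not the attached Canary code: CanarySinkSketch is a hypothetical wrapper, plain strings stand in for HRegionInfo and HColumnDescriptor, and the sketch prints to standard output directly, where the review asks for the logging system.

```java
public class CanarySinkSketch {
    /** Hypothetical sink interface using the method name proposed in the review. */
    public interface Sink {
        void publishReadTiming(String region, String columnFamily, long msTime);
    }

    /** Writes read timings to standard output. */
    public static class StdOutSink implements Sink {
        /** Builds the message; kept separate so it can be checked in isolation. */
        public static String format(String region, String columnFamily, long msTime) {
            return String.format("read from region %s column family %s in %dms",
                region, columnFamily, msTime);
        }

        @Override
        public void publishReadTiming(String region, String columnFamily, long msTime) {
            System.out.println(format(region, columnFamily, msTime));
        }
    }

    public static void main(String[] args) {
        Sink sink = new StdOutSink();
        sink.publishReadTiming("region-1", "cf", 12L);
    }
}
```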







 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Attachments: Canary-v0.java, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244492#comment-13244492
 ] 

stack commented on HBASE-5666:
--

This is the code that is supposed to create the base node, right?  If we come 
out of here and there is no base node, then that's a problem?  Should the fix be 
down here in ZKW rather than up in regionserver?

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once if the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244494#comment-13244494
 ] 

stack commented on HBASE-3909:
--

But it should go via zk, I'd say, since we have it, rather than have us POST 
refreshConfigs to a servlet on each server.

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244495#comment-13244495
 ] 

stack commented on HBASE-5693:
--

Are the failures because of your patch, N?

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Commented] (HBASE-5697) Audit HBase for usage of deprecated hadoop 0.20.x property names.

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244500#comment-13244500
 ] 

stack commented on HBASE-5697:
--

ooo... nelly.  That's a long list, Jon.

 Audit HBase for usage of deprecated hadoop 0.20.x property names.
 -

 Key: HBASE-5697
 URL: https://issues.apache.org/jira/browse/HBASE-5697
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Hsieh

 Many xml config properties in Hadoop have changed in 0.23.  We should audit 
 hbase to insulate it from hadoop property name changes.
 Here is a list of the hadoop property name changes:
 http://hadoop.apache.org/common/docs/r0.23.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
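
One way to insulate a lookup from a property rename can be sketched as below. This is only an illustration, not code from HBase or Hadoop: the class DeprecatedKeySketch and its get() method are hypothetical, it uses plain java.util.Properties rather than Hadoop's Configuration, and it lists a single well-known rename (fs.default.name to fs.defaultFS) where the linked page has the full table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: resolve a property that may be set under its old or new name. */
final class DeprecatedKeySketch {
  // Old name -> new name. Only one pair is listed here for illustration.
  static final Map<String, String> RENAMED = new HashMap<>();
  static {
    RENAMED.put("fs.default.name", "fs.defaultFS");
  }

  /**
   * Looks up a value by its old (0.20.x) name. If the property was renamed and
   * the new name is set, the new name wins; otherwise fall back to the old name.
   * Returns null when neither name is set.
   */
  static String get(Properties conf, String oldName) {
    String newName = RENAMED.get(oldName);
    if (newName != null) {
      String value = conf.getProperty(newName);
      if (value != null) {
        return value;
      }
    }
    return conf.getProperty(oldName);
  }
}
```

Keeping the rename table in one place means call sites do not need to change each time a name moves; whether the new or the old name should take precedence is a design choice that the audit would have to settle.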





[jira] [Commented] (HBASE-3909) Add dynamic config

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244509#comment-13244509
 ] 

stack commented on HBASE-3909:
--

An argument for redoing hadoop-7001 in hbase would be that you can reset 
configs in hbase the way you do it in hadoop.  I could go for that.

 Add dynamic config
 --

 Key: HBASE-3909
 URL: https://issues.apache.org/jira/browse/HBASE-3909
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.96.0


 I'm sure this issue exists already, at least as part of the discussion around 
 making online schema edits possible, but no harm in this having its own issue.  
 Ted started a conversation on this topic up on dev and Todd suggested we 
 look at how Hadoop did it over in HADOOP-7001





[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244519#comment-13244519
 ] 

nkeywal commented on HBASE-5693:


I don't think so. I didn't see them locally.




 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244522#comment-13244522
 ] 

stack commented on HBASE-5443:
--

There is also this write-up of Todd's on why pb in the first place, over in hdfs: 
https://issues.apache.org/jira/browse/HDFS-2058?focusedCommentId=13047289&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13047289

 Add PB-based calls to HRegionInterface
 --

 Key: HBASE-5443
 URL: https://issues.apache.org/jira/browse/HBASE-5443
 Project: HBase
  Issue Type: Task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: region_java-proto-mapping.pdf








[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Matteo Bertozzi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244537#comment-13244537
 ] 

Matteo Bertozzi commented on HBASE-5666:


Hmm... maybe I've missed something, but in 0.92 and trunk that code was removed 
and there's just a call to ZKUtil.createAndFailSilent() that doesn't retry. Any 
ideas?

https://github.com/apache/hbase/commit/6dc7ccf3779add13188bd73011e0d25bbab77a05

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244560#comment-13244560
 ] 

stack commented on HBASE-5666:
--

Thanks for digging.  Seems like the RecoverableZK is failing silently (smile).  
Seriously, it may be retrying any ConnectionLossException, but if there is no 
base dir up in zk, there's nothing for ZKW to 'watch'... it should fail 
construction (or this test needs to be moved out to an init method or 
something ...).  What do you reckon?
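
For illustration only, the "check more than once" idea discussed in this issue could be sketched as a bounded polling loop. Nothing below is taken from the attached patches: BaseNodeWaitSketch, waitFor, and the attempt and sleep parameters are hypothetical names, and the znode check itself is abstracted as a BooleanSupplier so the sketch does not depend on any ZooKeeper or HBase API.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: poll a condition a bounded number of times. */
final class BaseNodeWaitSketch {
  /**
   * Calls the check up to maxAttempts times, sleeping sleepMillis between
   * attempts. Returns true as soon as the check succeeds, false if the
   * attempts run out or the thread is interrupted while sleeping.
   */
  static boolean waitFor(BooleanSupplier check, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (check.getAsBoolean()) {
        return true;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and give up waiting.
          Thread.currentThread().interrupt();
          return false;
        }
      }
    }
    return false;
  }
}
```

A caller would pass in whatever "does the base node exist yet" check it has, and abort the region server only when waitFor returns false.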

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244564#comment-13244564
 ] 

Hadoop QA commented on HBASE-5666:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12521011/HBASE-5666-v4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1370//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1370//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1370//console

This message is automatically generated.

 RegionServer doesn't retry to check if base node is available
 -

 Key: HBASE-5666
 URL: https://issues.apache.org/jira/browse/HBASE-5666
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
 HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, 
 hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
 hbase-regionserver.log, hbase-zookeeper.log


 I've a script that starts hbase and a couple of region servers in distributed 
 mode (hbase.cluster.distributed = true)
 {code}
 $HBASE_HOME/bin/start-hbase.sh
 $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
 {code}
 but the region servers are not able to start...
 It seems that during the RS start the znode is still not available, and 
 HRegionServer.initializeZooKeeper() checks just once whether the base node is 
 available.
 {code}
 2012-03-28 21:54:05,013 INFO 
 org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
 configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
 one configured in the master.
 2012-03-28 21:54:08,598 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
 RS.
 java.io.IOException: Received the shutdown message while waiting.
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
   at java.lang.Thread.run(Thread.java:662)
 {code}





[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5693:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I tried the first three locally.  They pass for me.  Committed to trunk.  Thanks 
for the patch, N.

 When creating a region, the master initializes it and creates a memstore 
 within the master server
 -

 Key: HBASE-5693
 URL: https://issues.apache.org/jira/browse/HBASE-5693
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5593.v2.patch, 5693.v1.patch


 I didn't do a complete analysis, but the attached patch saves more than 0.25s 
 for each region creation and locally all the unit tests work.





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-04-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244577#comment-13244577
 ] 

Lars Hofhansl commented on HBASE-5682:
--

Let me dig into 0.96 after I get this into 0.94... Wanna cut RC1 soon.

From the past comments here I see no objections to the posted patch... Will 
commit soon. Please speak up if you disagree.

 Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
 only)
 --

 Key: HBASE-5682
 URL: https://issues.apache.org/jira/browse/HBASE-5682
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 
 5682-all.txt, 5682-v2.txt, 5682.txt


 Just realized that without this HBASE-4805 is broken.
 I.e. there's no point keeping a persistent HConnection around if it can be 
 rendered permanently unusable if the ZK connection is lost temporarily.
 Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
 backport)
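
The general pattern described here, a long-lived handle that creates its underlying connection lazily and can recreate it after a loss, might be sketched as follows. This is an assumption-laden illustration and not the patch: LazyReconnectSketch, get(), and invalidate() are hypothetical names, and the real ZK connection is stood in for by a generic Supplier.

```java
import java.util.function.Supplier;

/** Hypothetical holder: creates its resource on first use and again after invalidation. */
final class LazyReconnectSketch<T> {
  private final Supplier<T> factory;
  private T current; // null until first use, and again after invalidate()

  LazyReconnectSketch(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns the current resource, creating it if there is none. */
  synchronized T get() {
    if (current == null) {
      current = factory.get();
    }
    return current;
  }

  /** Drops the current resource, e.g. after a connection loss, so the next get() recreates it. */
  synchronized void invalidate() {
    current = null;
  }
}
```

With this shape the holder itself stays usable across a temporary connection loss: the caller that observes the failure calls invalidate(), and the next get() builds a fresh connection.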





[jira] [Updated] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5665:
-

   Resolution: Fixed
Fix Version/s: 0.94.0
   0.92.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.92, 0.94, and trunk.  Thanks Cosmin and Matteo.

 Repeated split causes HRegionServer failures and breaks table 
 --

 Key: HBASE-5665
 URL: https://issues.apache.org/jira/browse/HBASE-5665
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0, 0.92.1
Reporter: Cosmin Lehene
Assignee: Cosmin Lehene
Priority: Blocker
 Fix For: 0.92.2, 0.94.0

 Attachments: 5665trunk.v2.patch, HBASE-5665-0.92.patch, 
 HBASE-5665-trunk.patch


 Repeated splits on large tables (2 consecutive would suffice) will 
 essentially break the table (and the cluster) unrecoverably.
 The regionserver doing the split dies and the master will get into an 
 infinite loop trying to assign regions that seem to have the files missing 
 from HDFS.
 The table can be disabled once. Upon trying to re-enable it, it will remain 
 in an intermediary state forever.
 I was able to reproduce this on a smaller table consistently.
 {code}
 hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
 hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
 {code}
 Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) 
 will reproduce the issue almost instantly and consistently. 
 {code}
 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
 Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in 
 META
 2012-03-28 10:57:16,321 DEBUG 
 org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for 
 t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..  
 compaction_queue=(0:1), split_queue=10
 2012-03-28 10:57:16,343 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; 
 Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
 java.io.IOException: Failed 
 ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
 at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.FileNotFoundException: File does not exist: 
 /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
 at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
 at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
 at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
 at 
 org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
 at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
 ... 1 more
 2012-03-28 10:57:16,345 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
 {code}
 

[jira] [Updated] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5665:
-

Attachment: 5665trunk.v2.patch

Same as last patch but w/ fixed javadoc... isAvailable is not closed and not 
closing.

 Repeated split causes HRegionServer failures and breaks table 
 --

 Key: HBASE-5665
 URL: https://issues.apache.org/jira/browse/HBASE-5665
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0, 0.92.1
Reporter: Cosmin Lehene
Assignee: Cosmin Lehene
Priority: Blocker
 Fix For: 0.92.2, 0.94.0

 Attachments: 5665trunk.v2.patch, HBASE-5665-0.92.patch, 
 HBASE-5665-trunk.patch


 Repeated splits on large tables (2 consecutive would suffice) will 
 essentially break the table (and the cluster) unrecoverably.
 The regionserver doing the split dies and the master will get into an 
 infinite loop trying to assign regions that seem to have the files missing 
 from HDFS.
 The table can be disabled once. Upon trying to re-enable it, it will remain 
 in an intermediary state forever.
 I was able to reproduce this on a smaller table consistently.
 {code}
 hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
 hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
 {code}
 Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) 
 will reproduce the issue almost instantly and consistently. 
 {code}
 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
 Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in 
 META
 2012-03-28 10:57:16,321 DEBUG 
 org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for 
 t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..  
 compaction_queue=(0:1), split_queue=10
 2012-03-28 10:57:16,343 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; 
 Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
 java.io.IOException: Failed 
 ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
 at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.FileNotFoundException: File does not exist: 
 /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
 at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
 at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
 at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
 at 
 org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
 at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
 at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
 ... 1 more
 2012-03-28 10:57:16,345 FATAL 
 org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
 ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
 {code}
 http://hastebin.com/diqinibajo.avrasm
 later edit:
 (I'm using the last 4 characters from each string)
 Region 94e3 has storefile 

[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb

2012-04-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244612#comment-13244612
 ] 

jirapos...@reviews.apache.org commented on HBASE-5688:
--



bq.  On 2012-04-01 16:41:00, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java, line 48
bq.   https://reviews.apache.org/r/4600/diff/1/?file=97843#file97843line48
bq.  
bq.   I think prefixedWithPBMagic would be a better name for this method.

Disagree.


bq.  On 2012-04-01 16:41:00, Ted Yu wrote:
bq.   src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java, line 28
bq.   https://reviews.apache.org/r/4600/diff/1/?file=97852#file97852line28
bq.  
bq.   Javadoc would be desirable.

Classname says what it does.


bq.  On 2012-04-01 16:41:00, Ted Yu wrote:
bq.   src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java, line 43
bq.   https://reviews.apache.org/r/4600/diff/1/?file=97852#file97852line43
bq.  
bq.   White space.

Will fix on commit.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4600/#review6606
---


On 2012-04-01 00:18:54, Michael Stack wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4600/
bq.  ---
bq.  
bq.  (Updated 2012-04-01 00:18:54)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Changes the content of the root location znode, root-region-server, to be
bq.  four magic bytes ('PBUF') followed by a protobuf message that holds the
bq.  ServerName of the server currently hosting root.
bq.  
bq.  D src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java
bq.    Removed. Had two methods, one to add root-region-server znode and another
bq.    to remove it.  Rather, put these methods in RootRegionTracker.  It
bq.    tracks root-region-server znode.  Having all to do w/ root-region-server
bq.    is more cohesive.  Also makes it so can encapsulate in one class
bq.    all to do w/ create, delete, and reading of root-region-server.
bq.    We also want to purge the catalog package (See note at head of
bq.    CatalogTracker).
bq.  M src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
bq.  M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
bq.  M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
bq.    Get root region location from RootRegionTracker rather than from RootLocationEditor.
bq.  A src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
bq.    Utility to do w/ protobuf handling.  Has methods to help prefixing
bq.    and stripping from serialized protobuf messages some 'magic'.
bq.  A src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
bq.    PB generated.
bq.  M src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
bq.    Use new RootRegionTracker method for getting content of znode rather
bq.    than do it all here (going via RootRegionTracker, we can keep how
bq.    the znode content is serialized private to the RootRegionTracker class).
bq.  M src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
bq.    Has the methods that used to be in RootLocationEditor plus a new
bq.  
bq.  
bq.  This addresses bug hbase-5688.
bq.  https://issues.apache.org/jira/browse/hbase-5688
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java c90864a 
bq.src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463 
bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15 
bq.src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 9c215b4 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 2f05005 
bq.src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 33e4e71 
bq.src/main/protobuf/ZooKeeper.proto PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 533b2bf 
bq.src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java fe37156 
bq.src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java 2132036 
bq.src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/4600/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq. 
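
The 'PBUF' framing described in the review summary above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the patch: the class name `MagicPrefix` and its method names are hypothetical, and the payload is treated as opaque bytes rather than a real protobuf message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper; the names here are illustrative and not taken from the patch. */
final class MagicPrefix {
    // The four magic bytes named in the review summary.
    private static final byte[] MAGIC = "PBUF".getBytes(StandardCharsets.US_ASCII);

    private MagicPrefix() {}

    /** Returns a new array holding the magic bytes followed by the payload. */
    static byte[] prepend(byte[] payload) {
        byte[] out = new byte[MAGIC.length + payload.length];
        System.arraycopy(MAGIC, 0, out, 0, MAGIC.length);
        System.arraycopy(payload, 0, out, MAGIC.length, payload.length);
        return out;
    }

    /** Returns true if the data starts with the magic bytes. */
    static boolean isPrefixed(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the bytes after the magic prefix; throws if the prefix is absent. */
    static byte[] strip(byte[] data) {
        if (!isPrefixed(data)) {
            throw new IllegalArgumentException("missing PBUF prefix");
        }
        return Arrays.copyOfRange(data, MAGIC.length, data.length);
    }
}
```

A reader of the znode could call `isPrefixed` first to tell new-format content from the older versioned bytes, then hand the result of `strip` to the protobuf parser.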

[jira] [Updated] (HBASE-5688) Convert zk root-region-server znode content to pb

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5688:
-

Attachment: 5688v5.txt

v5 removes a single white space.  It's what I'll commit.

 Convert zk root-region-server znode content to pb
 -

 Key: HBASE-5688
 URL: https://issues.apache.org/jira/browse/HBASE-5688
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt


 Move the root-region-server znode content from the versioned bytes that 
 ServerName.getVersionedBytes outputs to instead be pb.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5688) Convert zk root-region-server znode content to pb

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5688:
-

  Resolution: Fixed
Hadoop Flags: Incompatible change,Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for review T{o,e}dd.

 Convert zk root-region-server znode content to pb
 -

 Key: HBASE-5688
 URL: https://issues.apache.org/jira/browse/HBASE-5688
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt


 Move the root-region-server znode content from the versioned bytes that 
 ServerName.getVersionedBytes outputs to instead be pb.





[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244621#comment-13244621
 ] 

stack commented on HBASE-5688:
--

Oh, and thanks Jimmy for review.

 Convert zk root-region-server znode content to pb
 -

 Key: HBASE-5688
 URL: https://issues.apache.org/jira/browse/HBASE-5688
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt


 Move the root-region-server znode content from the versioned bytes that 
 ServerName.getVersionedBytes outputs to instead be pb.





[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5692:
-

Attachment: HBASE-5665-trunk.v2.patch

Same as v1 w/ some formatting changes.  +1 on this patch.

 Add real action time for HLogPrettyPrinter
 --

 Key: HBASE-5692
 URL: https://issues.apache.org/jira/browse/HBASE-5692
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Xing Shi
Priority: Minor
 Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch


 Now the HLogPrettyPrinter prints the log without the real op time, only the timestamp
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5
   Action:
 row: r
 column: f3:q
 at time: Thu Jan 01 08:02:03 CST 1970
 {quote}
 Maybe we need to know the real op time like this
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: 
 Sun Apr 01 10:42:53 CST 2012
   Action:
 row: r
 column: f3:q
 timestamp: Thu Jan 01 08:02:03 CST 1970
 {quote}





[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5692:
-

Status: Patch Available  (was: Open)

 Add real action time for HLogPrettyPrinter
 --

 Key: HBASE-5692
 URL: https://issues.apache.org/jira/browse/HBASE-5692
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Xing Shi
Priority: Minor
 Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch


 Now the HLogPrettyPrinter prints the log without the real op time, only the timestamp
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5
   Action:
 row: r
 column: f3:q
 at time: Thu Jan 01 08:02:03 CST 1970
 {quote}
 Maybe we need to know the real op time like this
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: 
 Sun Apr 01 10:42:53 CST 2012
   Action:
 row: r
 column: f3:q
 timestamp: Thu Jan 01 08:02:03 CST 1970
 {quote}





[jira] [Commented] (HBASE-5697) Audit HBase for usage of deprecated hadoop 0.20.x property names.

2012-04-02 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244671#comment-13244671
 ] 

Jonathan Hsieh commented on HBASE-5697:
---

Hopefully that is a comprehensive list.  My guess is that only a handful are 
relevant.  I started testing on hadoop 23 and there are definitely some new 
deprecation warnings that show up in logs/console.  Also, some have been bugs 
in previous versions - I've gotten snagged on this one before:

fs.default.name -> fs.defaultFS 
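
One way to insulate a lookup from a rename such as the one above is to consult the current key first and fall back to the deprecated one. The sketch below is only an illustration: it uses a plain `java.util.Map` in place of Hadoop's configuration object, and `DeprecatedKeys` and its method are hypothetical names. Only the `fs.default.name` / `fs.defaultFS` pair comes from the comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of HBase or Hadoop. */
final class DeprecatedKeys {
    // Maps each current key to the deprecated key it replaced.
    private static final Map<String, String> OLD_NAME_FOR = new HashMap<>();

    static {
        OLD_NAME_FOR.put("fs.defaultFS", "fs.default.name");
    }

    private DeprecatedKeys() {}

    /**
     * Looks up the current key; if it is unset, falls back to the
     * deprecated key. Returns null when neither is set.
     */
    static String get(Map<String, String> conf, String currentKey) {
        String value = conf.get(currentKey);
        if (value != null) {
            return value;
        }
        String oldKey = OLD_NAME_FOR.get(currentKey);
        return oldKey == null ? null : conf.get(oldKey);
    }
}
```

With this shape, callers always ask for the current name, and the table of renames is the only place that has to change as the audit finds more properties.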



 Audit HBase for usage of deprecated hadoop 0.20.x property names.
 -

 Key: HBASE-5697
 URL: https://issues.apache.org/jira/browse/HBASE-5697
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Hsieh

 Many xml config properties in Hadoop have changed in 0.23.  We should audit 
 hbase to insulate it from hadoop property name changes.
 Here is a list of the hadoop property name changes:
 http://hadoop.apache.org/common/docs/r0.23.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html





[jira] [Updated] (HBASE-2186) hbase master should publish more stats

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2186:
-

Attachment: screenshot-1.jpg

The Master bean

 hbase master should publish more stats
 --

 Key: HBASE-2186
 URL: https://issues.apache.org/jira/browse/HBASE-2186
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: ryan rawson
 Attachments: screenshot-1.jpg, screenshot-2.jpg


 hbase master only publishes cluster.requests to ganglia. we should also 
 publish regionserver count and other interesting metrics.





[jira] [Updated] (HBASE-2186) hbase master should publish more stats

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-2186:
-

Attachment: screenshot-2.jpg

The master stats bean

 hbase master should publish more stats
 --

 Key: HBASE-2186
 URL: https://issues.apache.org/jira/browse/HBASE-2186
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: ryan rawson
 Attachments: screenshot-1.jpg, screenshot-2.jpg


 hbase master only publishes cluster.requests to ganglia. we should also 
 publish regionserver count and other interesting metrics.





[jira] [Resolved] (HBASE-2186) hbase master should publish more stats

2012-04-02 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-2186.
--

Resolution: Duplicate

Resolving at Otis's suggestion.  Master has more stats now.  Could do w/ more 
but let this be enough to close this issue.

 hbase master should publish more stats
 --

 Key: HBASE-2186
 URL: https://issues.apache.org/jira/browse/HBASE-2186
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: ryan rawson
 Attachments: screenshot-1.jpg, screenshot-2.jpg


 hbase master only publishes cluster.requests to ganglia. we should also 
 publish regionserver count and other interesting metrics.





[jira] [Created] (HBASE-5700) [89-fb] Fix TestMiniClusterLoad* test failures

2012-04-02 Thread Mikhail Bautin (Created) (JIRA)
[89-fb] Fix TestMiniClusterLoad* test failures
--

 Key: HBASE-5700
 URL: https://issues.apache.org/jira/browse/HBASE-5700
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor


Porting TestMiniClusterLoad* tests to 89-fb in HBASE-5679 uncovered certain 
problems with mini-cluster setup in 89-fb that need to be fixed.





[jira] [Created] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.

2012-04-02 Thread stack (Created) (JIRA)
Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather 
than have it as a peer.
--

 Key: HBASE-5701
 URL: https://issues.apache.org/jira/browse/HBASE-5701
 Project: HBase
  Issue Type: Bug
Reporter: stack








[jira] [Commented] (HBASE-5692) Add real action time for HLogPrettyPrinter

2012-04-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244689#comment-13244689
 ] 

Hadoop QA commented on HBASE-5692:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12521048/HBASE-5665-trunk.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1371//console

This message is automatically generated.

 Add real action time for HLogPrettyPrinter
 --

 Key: HBASE-5692
 URL: https://issues.apache.org/jira/browse/HBASE-5692
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Xing Shi
Priority: Minor
 Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch


 Now the HLogPrettyPrinter prints the log without the real op time, only the timestamp
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5
   Action:
 row: r
 column: f3:q
 at time: Thu Jan 01 08:02:03 CST 1970
 {quote}
 Maybe we need to know the real op time like this
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: 
 Sun Apr 01 10:42:53 CST 2012
   Action:
 row: r
 column: f3:q
 timestamp: Thu Jan 01 08:02:03 CST 1970
 {quote}





[jira] [Updated] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5701:
-

Attachment: screenshot-1.jpg

See how the mbeans are currently arrayed where hadoop is top level and then we 
have master and regionserver AND regionserverdynamic levels.  In regionserver 
we have the regionserver mbean and another regionserverstatistics mbean.  Over 
in regionserverdynamic we have regionserverdynamicstatistics mbean.
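
The layout described above can be expressed with `javax.management.ObjectName`. This is a sketch only: the `hadoop` domain and the `service` / `name` key properties are assumptions about how the beans are named, and `MBeanNames` is a hypothetical helper, not HBase code.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper for building and comparing MBean names. */
final class MBeanNames {
    private MBeanNames() {}

    /** Builds a name of the assumed form hadoop:service=SERVICE,name=NAME. */
    static ObjectName of(String service, String name) {
        try {
            return new ObjectName("hadoop:service=" + service + ",name=" + name);
        } catch (MalformedObjectNameException e) {
            // Rethrow unchecked so callers need no checked-exception handling.
            throw new IllegalArgumentException(e);
        }
    }

    /** True if both beans share a domain and sit under the same service level. */
    static boolean sameService(ObjectName a, ObjectName b) {
        return a.getDomain().equals(b.getDomain())
            && a.getKeyProperty("service").equals(b.getKeyProperty("service"));
    }
}
```

Under these assumptions, `of("RegionServerDynamic", "RegionServerDynamicStatistics")` is a peer of the regionserver beans, while `of("RegionServer", "RegionServerDynamicStatistics")` would place it alongside them under the same service level.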

 Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy 
 rather than have it as a peer.
 --

 Key: HBASE-5701
 URL: https://issues.apache.org/jira/browse/HBASE-5701
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Attachments: screenshot-1.jpg








[jira] [Commented] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.

2012-04-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244691#comment-13244691
 ] 

stack commented on HBASE-5701:
--

This commit added the dynamic mbean:

{code}
r1185835 | nspiegelberg | 2011-10-18 13:23:28 -0700 (Tue, 18 Oct 2011) | 1 line

HBASE-4219 Per Column Family Metrics
{code}


 Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy 
 rather than have it as a peer.
 --

 Key: HBASE-5701
 URL: https://issues.apache.org/jira/browse/HBASE-5701
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Attachments: screenshot-1.jpg








[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5692:
-

Attachment: 5692v2.patch

Attach the 'right' patch

 Add real action time for HLogPrettyPrinter
 --

 Key: HBASE-5692
 URL: https://issues.apache.org/jira/browse/HBASE-5692
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Xing Shi
Priority: Minor
 Attachments: 5692v2.patch, HBASE-5665-trunk.v2.patch, HBASE-5692.patch


 Now the HLogPrettyPrinter prints the log without the real op time, only the timestamp
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5
   Action:
 row: r
 column: f3:q
 at time: Thu Jan 01 08:02:03 CST 1970
 {quote}
 Maybe we need to know the real op time like this
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: 
 Sun Apr 01 10:42:53 CST 2012
   Action:
 row: r
 column: f3:q
 timestamp: Thu Jan 01 08:02:03 CST 1970
 {quote}





[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter

2012-04-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5692:
-

Status: Open  (was: Patch Available)

 Add real action time for HLogPrettyPrinter
 --

 Key: HBASE-5692
 URL: https://issues.apache.org/jira/browse/HBASE-5692
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Xing Shi
Priority: Minor
 Attachments: 5692v2.patch, HBASE-5665-trunk.v2.patch, HBASE-5692.patch


 Now the HLogPrettyPrinter prints the log without the real op time, only the timestamp
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5
   Action:
 row: r
 column: f3:q
 at time: Thu Jan 01 08:02:03 CST 1970
 {quote}
 Maybe we need to know the real op time like this
 {quote}
 Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: 
 Sun Apr 01 10:42:53 CST 2012
   Action:
 row: r
 column: f3:q
 timestamp: Thu Jan 01 08:02:03 CST 1970
 {quote}




