[jira] [Updated] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5633:
---

Attachment: HBASE-5633-v1.patch

Added a default value (HConstants.DEFAULT_CLUSTER_DISTRIBUTED) for the 
cluster.distributed property to config.get(HConstants.CLUSTER_DISTRIBUTED) in 
parseZooCfg().

We get a Null if the property doesn't exists, and the result of conf.get() is 
compared directly without checking for Null.

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-5633-v1.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5633:
---

Status: Patch Available  (was: Open)

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-5633-v1.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237465#comment-13237465
 ] 

Hadoop QA commented on HBASE-5633:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519785/HBASE-5633-v1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1297//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1297//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1297//console

This message is automatically generated.

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-5633-v1.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-03-24 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237472#comment-13237472
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6302
---



http://svn.apache.org/repos/asf/hbase/trunk/pom.xml
https://reviews.apache.org/r/4096/#comment13673

Trailing whitespaces.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
https://reviews.apache.org/r/4096/#comment13674

You already import RpcRequestWithHeaderProto, so just use 
RpcRequestWithHeaderProto.Builder here, drop the leading RPCMessageProtos.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
https://reviews.apache.org/r/4096/#comment13675

Trailing whitespaces here and below.  Kill them all.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
https://reviews.apache.org/r/4096/#comment13677

If you cast it to RpcRequestProto, then why not check if param is an 
instance of RpcRequestProto?  Also you're missing a space right before param.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
https://reviews.apache.org/r/4096/#comment13678

Throw a ClassCastException.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/4096/#comment13679

Argh, no, don't change this!  I got other HBase devs to promise to not 
change this as it makes backwards compatible clients impossibly complicated.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/4096/#comment13680

Trailing whitespace.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/4096/#comment13739

Is there a way to avoid code duplication and unify this method with the on 
in the HBaseClient class?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
https://reviews.apache.org/r/4096/#comment13740

Why wrap this line and the next around?  I think this fits on one line 
without exceeding 80 columns.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java
https://reviews.apache.org/r/4096/#comment13670

just do return create(null) ?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java
https://reviews.apache.org/r/4096/#comment13671

Why use the fully qualified names here?

Also kill the trailing whitespace.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java
https://reviews.apache.org/r/4096/#comment13672

Trailing whitespaces.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java
https://reviews.apache.org/r/4096/#comment13741

This seems unnecessary.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13742

Kill all the trailing whitespaces!



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13743

I don't see how this is graceful.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13744

Why keep this oddity of Hadoop RPC?  Either rely on TCP keepalive, or add a 
Ping method to the RPC interface.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13745

What's the point of this message?  Why not just put the callId in 
RpcRequestProto and be done with it?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13746

Why is this optional?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13747

Why is this optional?  It should be required and it should be first.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13748

Ditto, why have an extra PB?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto
https://reviews.apache.org/r/4096/#comment13749

This should be first and it should be required.




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237474#comment-13237474
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

Thanks Ted.  I've updated the rest.  Will do better next time. :)

 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5563) HRegionInfo#compareTo should compare regionId as well

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5563:
--

Fix Version/s: 0.90.7

 HRegionInfo#compareTo should compare regionId as well
 -

 Key: HBASE-5563
 URL: https://issues.apache.org/jira/browse/HBASE-5563
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: HBASE-5563.patch, HBASE-5563v2.patch, 
 HBASE-5563v2.patch, hbase-5563-0.90.patch, hbase-5563-v3-0.92.patch, 
 hbase-5563-v3.patch


 In the one region multi assigned case,  we could find that two regions have 
 the same table name, same startKey, same endKey, and different regionId, so 
 these two regions are same in TreeMap but different in HashMap.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never do balance becauseof balance the parent region

2012-03-24 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237484#comment-13237484
 ] 

gaojinchao commented on HBASE-5615:
---

+1 

 the master never do balance becauseof  balance the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Attachments: HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 the master never do balance becauseof when master do rebuildUserRegions(),it 
 will add the parent region into  AssignmentManager#servers,
 if balancer let the parent region to move,the parent will in RIT forever.thus 
 balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237490#comment-13237490
 ] 

Hudson commented on HBASE-5128:
---

Integrated in HBase-0.94 #51 (See 
[https://builds.apache.org/job/HBase-0.94/51/])
HBASE-5128 Addendum adds two missing new files (Revision 1304722)

 Result = FAILURE
jmhsieh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java


 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237493#comment-13237493
 ] 

Hudson commented on HBASE-5128:
---

Integrated in HBase-0.92 #338 (See 
[https://builds.apache.org/job/HBase-0.92/338/])
HBASE-5128 Addendum adds two missing new files (Revision 1304723)

 Result = FAILURE
jmhsieh : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java


 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237531#comment-13237531
 ] 

Hudson commented on HBASE-5128:
---

Integrated in HBase-TRUNK #2694 (See 
[https://builds.apache.org/job/HBase-TRUNK/2694/])
HBASE-5128 Addendum adds two new files Jon forgot to add (Revision 1304702)
HBASE-5128 [uber hbck] Online automated repair of table integrity and region 
consistency problems (Revision 1304665)

 Result = SUCCESS
tedyu : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java

jmhsieh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java


 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  

[jira] [Commented] (HBASE-5616) Make compaction code standalone

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237532#comment-13237532
 ] 

Hudson commented on HBASE-5616:
---

Integrated in HBase-TRUNK #2694 (See 
[https://builds.apache.org/job/HBase-TRUNK/2694/])
HBASE-5616 Make compaction code standalone; ADDENDUM -- ADD LICENSES 
(Revision 1304624)
HBASE-5616 Make compaction code standalone (Revision 1304616)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionProgress.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java


 Make compaction code standalone
 ---

 Key: HBASE-5616
 URL: https://issues.apache.org/jira/browse/HBASE-5616
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5616.txt, 5616v3.txt, 5616v6.txt, 5616v7.txt, 
 5616v7.txt, addlicense.txt, standalone.txt


 This is part of hbase-2462.  Make the compaction code standalone so can run 
 it independent of hbase.  Will make it easier to profile and try stuff out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237529#comment-13237529
 ] 

Hudson commented on HBASE-5190:
---

Integrated in HBase-TRUNK #2694 (See 
[https://builds.apache.org/job/HBase-TRUNK/2694/])
HBASE-5190  Limit the IPC queue size based on calls' payload size (Revision 
1304634)

 Result = SUCCESS
jdcryans : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java


 Limit the IPC queue size based on calls' payload size
 -

 Key: HBASE-5190
 URL: https://issues.apache.org/jira/browse/HBASE-5190
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.5
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5190-v2.patch, HBASE-5190-v3.patch, 
 HBASE-5190.patch


 Currently we limit the number of calls in the IPC queue only on their count. 
 It used to be really high and was dropped down recently to num_handlers * 10 
 (so 100 by default) because it was easy to OOME yourself when huge calls were 
 being queued. It's still possible to hit this problem if you use really big 
 values and/or a lot of handlers, so the idea is that we should take into 
 account the payload size. I can see 3 solutions:
  - Do the accounting outside of the queue itself for all calls coming in and 
 out and when a call doesn't fit, throw a retryable exception.
  - Same accounting but instead block the call when it comes in until space is 
 made available.
  - Add a new parameter for the maximum size (in bytes) of a Call and then set 
 the size the IPC queue (in terms of the number of items) so that it could 
 only contain as many items as some predefined maximum size (in bytes) for the 
 whole queue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237530#comment-13237530
 ] 

Hudson commented on HBASE-5469:
---

Integrated in HBase-TRUNK #2694 (See 
[https://builds.apache.org/job/HBase-TRUNK/2694/])
[jira] [HBASE-5469] Add baseline compression efficiency to 
DataBlockEncodingTool

Summary:
DataBlockEncodingTool currently does not provide baseline compression
efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if
we are using LZO to compress blocks, we would like to have the following
columns in the report (possibly as percentages of raw data size).

Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V
DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on
disk)

Background: we never store compressed blocks in cache, but we always store
encoded data blocks in cache if data block encoding is enabled for the column
family.

This patch also has multiple bugfixes and improvements to DataBlockEncodingTool,
including presentation format, memory requirements (reduced 3x) and fixing the
handling of compression.

Test Plan:
* Run unit tests.
* Run DataBlockEncodingTool on a variety of real-world HFiles.

Reviewers: JIRA, dhruba, tedyu, stack, heyongqiang

Reviewed By: tedyu

Differential Revision: https://reviews.facebook.net/D2409 (Revision 1304626)

 Result = SUCCESS
mbautin : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java


 Add baseline compression efficiency to DataBlockEncodingTool
 

 Key: HBASE-5469
 URL: https://issues.apache.org/jira/browse/HBASE-5469
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D2409.1.patch, D2409.2.patch, 
 jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch


 DataBlockEncodingTool currently does not provide baseline compression 
 efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if 
 we are using LZO to compress blocks, we would like to have the following 
 columns in the report (possibly as percentages of raw data size).
 Baseline K+V in blockcache  |   Baseline K + V on disk  (LZO compressed)  | K 
 + V  DataBlockEncoded in block cache |   K + V DataBlockEncoded + 
 LZOCompressed (on disk)
 Background: we never store compressed blocks in cache, but we always store 
 encoded data blocks in cache if data block encoding is enabled for the column 
 family.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-03-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237543#comment-13237543
 ] 

Jonathan Hsieh commented on HBASE-5604:
---

Hm, so this is similar to this 
http://www.postgresql.org/docs/8.2/static/continuous-archiving.html? For hbase, 
this would to enable a consistent warm backup (though no strong guarantees 
across regions) that would be cheaper than the full replication mechanism?

 HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl

 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could an M/R (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter all WALEdits that didn't 
 fit into the time range or are not any of the tables and then uses 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5618) SplitLogManager - prevent unnecessary attempts to resubmits

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5618:
--

Component/s: zookeeper
 wal

 SplitLogManager - prevent unnecessary attempts to resubmits
 ---

 Key: HBASE-5618
 URL: https://issues.apache.org/jira/browse/HBASE-5618
 Project: HBase
  Issue Type: Improvement
  Components: wal, zookeeper
Reporter: Prakash Khemani

 Currently once a watch fires that the task node has been updated (hearbeated) 
 by the worker, the splitlogmanager still quite some time before it updates 
 the last heard from time. This is because the manager currently schedules 
 another getDataSetWatch() and only after that finishes will it update the 
 task's last heard from time.
 This leads to a large number of zk-BadVersion warnings when resubmission is 
 continuously attempted and it fails.
 Two changes should be made
 (1) On a resubmission failure because of BadVersion the task's lastUpdate 
 time should get upped.
 (2) The task's lastUpdate time should get upped as soon as the 
 nodeDataChanged() watch fires and without waiting for getDataSetWatch() to 
 complete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4910) thrift scannerstopwithfilter not honoring stop row

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4910:
--

Component/s: thrift

 thrift scannerstopwithfilter not honoring stop row
 --

 Key: HBASE-4910
 URL: https://issues.apache.org/jira/browse/HBASE-4910
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver, thrift
Reporter: Nicolas Spiegelberg
 Fix For: 0.96.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5599) The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_

2012-03-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237554#comment-13237554
 ] 

Jonathan Hsieh commented on HBASE-5599:
---

Hi Fulin,

I've committed HBASE-5128 so let's work together to port the functionality 
you've implemented into it as well. 

* Some of the methods you've added in HBaseFsckRepair are similar in 
HBASE-5128.  Let's consolidate.
* It seems that you are focused on the 0.90 versions -- it is important that 
whatever changes we make also make it into the newer 0.92/0.94/trunk versions.  
I can definitely help out there.
* It is important to add some testing for the new scenarios as well -- checkout 
TestHbckFsck for some example to emulate.  I've basically started to have have 
one test for each error condition.




 The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, 
 NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
 FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.
 

 Key: HBASE-5599
 URL: https://issues.apache.org/jira/browse/HBASE-5599
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.6
Reporter: fulin wang
 Fix For: 0.90.6

 Attachments: hbase-5599-0.90.patch, hbase-5599-0.90_v2.patch, 
 hbase-5599-0.90_v3.patch


 The hbck tool can not fix the six scenarios.
 1. Version file does not exist in root dir.
Fix: I try to create a version file by 'FSUtils.setVersion' method.

 2. [REGIONNAME][KEY] on HDFS, but not listed in META or deployed on any 
 region server.
Fix: I get region info form the hdfs file, this region info write to 
 '.META.' table.

 3. [REGIONNAME][KEY] not in META, but deployed on [SERVERNAME]
Fix: I get region info form the hdfs file, this region info write to 
 '.META.' table.

 4. [REGIONNAME] should not be deployed according to META, but is deployed on 
 [SERVERNAME]
Fix: Close this region.

 5. First region should start with an empty key.  You need to  create a new 
 region and regioninfo in HDFS to plug the hole.
Fix: The region info is not in hdfs and .META., so it create a empty 
 region for this error.
 6. There is a hole in the region chain between [KEY] and [KEY]. You need to 
 create a new regioninfo and region dir in hdfs to plug the hole.
   Fix: The region info is not in hdfs and .META., so it create a empty region 
 for this hole.
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5599) The hbck tool can not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5599:
--

Summary: The hbck tool can not fix the six scenarios, it is 
NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.  (was: The hbkc tool can 
not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, 
NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, 
HOLE_IN_REGION_CHAIN.)

 The hbck tool can not fix the six scenarios, it is NO_VERSION_FILE, 
 NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
 FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.
 

 Key: HBASE-5599
 URL: https://issues.apache.org/jira/browse/HBASE-5599
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.6
Reporter: fulin wang
 Fix For: 0.90.6

 Attachments: hbase-5599-0.90.patch, hbase-5599-0.90_v2.patch, 
 hbase-5599-0.90_v3.patch


 The hbck tool can not fix the six scenarios.
 1. Version file does not exist in root dir.
Fix: I try to create a version file by 'FSUtils.setVersion' method.

 2. [REGIONNAME][KEY] on HDFS, but not listed in META or deployed on any 
 region server.
Fix: I get region info form the hdfs file, this region info write to 
 '.META.' table.

 3. [REGIONNAME][KEY] not in META, but deployed on [SERVERNAME]
Fix: I get region info form the hdfs file, this region info write to 
 '.META.' table.

 4. [REGIONNAME] should not be deployed according to META, but is deployed on 
 [SERVERNAME]
Fix: Close this region.

 5. First region should start with an empty key.  You need to  create a new 
 region and regioninfo in HDFS to plug the hole.
Fix: The region info is not in hdfs and .META., so it create a empty 
 region for this error.
 6. There is a hole in the region chain between [KEY] and [KEY]. You need to 
 create a new regioninfo and region dir in hdfs to plug the hole.
   Fix: The region info is not in hdfs and .META., so it create a empty region 
 for this hole.
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-24 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237556#comment-13237556
 ] 

Ted Yu commented on HBASE-5615:
---

Integrated to 0.90 branch.

Thanks for the patch Xufeng.

Thanks for the review Ramkrishna and Jinchao.

Patch for TRUNK to follow

 the master never does balance because of balancing the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.7, 0.96.0

 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 the master never do balance becauseof when master do rebuildUserRegions(),it 
 will add the parent region into  AssignmentManager#servers,
 if balancer let the parent region to move,the parent will in RIT forever.thus 
 balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5634) document how to use uberhbck

2012-03-24 Thread Jonathan Hsieh (Created) (JIRA)
document how to use uberhbck


 Key: HBASE-5634
 URL: https://issues.apache.org/jira/browse/HBASE-5634
 Project: HBase
  Issue Type: Improvement
  Components: documentation, hbck
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh


The updated hbck from HBASE-5128 introduces many new repair options and, as a 
side effect, offers many new opportunities to durably shoot oneself in the 
foot.  Docs need to be written and added to the ref guide to explain its usage 
and ramifications and discuss repair strategies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-24 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5615:
--

Attachment: 5615-trunk.txt

 the master never does balance because of balancing the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.7, 0.96.0

 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 the master never do balance becauseof when master do rebuildUserRegions(),it 
 will add the parent region into  AssignmentManager#servers,
 if balancer let the parent region to move,the parent will in RIT forever.thus 
 balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-24 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5615:
--

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

 the master never does balance because of balancing the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.7, 0.96.0

 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 the master never do balance becauseof when master do rebuildUserRegions(),it 
 will add the parent region into  AssignmentManager#servers,
 if balancer let the parent region to move,the parent will in RIT forever.thus 
 balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3852) ThriftServer leaks scanners

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3852:
--

Component/s: thrift

 ThriftServer leaks scanners
 ---

 Key: HBASE-3852
 URL: https://issues.apache.org/jira/browse/HBASE-3852
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.90.2
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.94.1

 Attachments: 3852.txt, thrift2-scanner.patch


 The scannerMap in ThriftServer relies on the user to clean it by closing the 
 scanner. If that doesn't happen, the ResultScanner will stay in the thrift 
 server's memory and if any pre-fetching was done, it will also start 
 accumulating Results (with all their data).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5466:
--

Fix Version/s: 0.94.0

This happend pre-0.94 branch, added fixver that this is in 0.94.

 Opening a table also opens the metatable and never closes it.
 -

 Key: HBASE-5466
 URL: https://issues.apache.org/jira/browse/HBASE-5466
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.5, 0.92.0
Reporter: Ashley Taylor
Assignee: Ashley Taylor
 Fix For: 0.90.7, 0.92.1, 0.94.0

 Attachments: MetaScanner_HBASE_5466(2).patch, 
 MetaScanner_HBASE_5466(3).patch, MetaScanner_HBASE_5466.patch


 Having upgraded to CDH3U3 version of hbase we found we had a zookeeper 
 connection leak, tracking it down we found that closing the connection will 
 only close the zookeeper connection if all calls to get the connection have 
 been closed, there is incCount and decCount in the HConnection class,
 When a table is opened it makes a call to the metascanner class which opens a 
 connection to the meta table, this table never gets closed.
 This caused the count in the HConnection class to never return to zero 
 meaning that the zookeeper connection will not close when we close all the 
 tables or calling
 HConnectionManager.deleteConnection(config, true);

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5586) [replication] NPE in ReplicationSource when creating a stream to an inexistent cluster

2012-03-24 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5586:
--

Component/s: replication

 [replication] NPE in ReplicationSource when creating a stream to an 
 inexistent cluster
 --

 Key: HBASE-5586
 URL: https://issues.apache.org/jira/browse/HBASE-5586
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.90.5
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.92.2, 0.94.0, 0.96.0

 Attachments: 5586-v2.txt, HBASE-5586-trunk.patch, HBASE-5586.java, 
 HBASE-5586.java


 This is from 0.92.1-ish:
 {noformat}
 2012-03-15 09:52:16,589 ERROR
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
 Unexpected exception in ReplicationSource, currentPath=null
 java.lang.NullPointerException
at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.chooseSinks(ReplicationSource.java:223)
at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.connectToPeers(ReplicationSource.java:442)
at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:246)
 {noformat}
 I wanted to add a replication stream to a cluster that wasn't existing yet so 
 that the logs would be buffered until then. This should just be treated as if 
 there was no region servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237581#comment-13237581
 ] 

Hadoop QA commented on HBASE-5615:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519805/5615-trunk.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1298//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1298//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1298//console

This message is automatically generated.

 the master never does balance because of balancing the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.7, 0.96.0

 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 the master never do balance becauseof when master do rebuildUserRegions(),it 
 will add the parent region into  AssignmentManager#servers,
 if balancer let the parent region to move,the parent will in RIT forever.thus 
 balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237582#comment-13237582
 ] 

Jonathan Hsieh commented on HBASE-5598:
---

Currently we are somewhere around the 770 warnings/errors mark.  We should chop 
this into subtasks to break down the work and knock out related issues.

For this to last, once this we get the findbugs warnings to 0, committers need 
to enforce a no-new-findbugs errors policy on reviews.  Agreed?

 Analyse and fix the findbugs reporting by QA and add invalid bugs into 
 findbugs-excludeFilter file
 --

 Key: HBASE-5598
 URL: https://issues.apache.org/jira/browse/HBASE-5598
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Uma Maheswara Rao G
Priority: Minor

 There are many findbugs errors reporting by HbaseQA. HBASE-5597 is going to 
 up the OK count.
 This may lead to other issues when we re-factor the code, if we induce new 
 valid ones and remove invalid bugs also can not be reported by QA.
 So, I would propose to add the exclude filter file for findbugs(for the 
 invalid bugs). If we find any valid ones, we can fix under this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread Matteo Bertozzi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-4393:
---

Attachment: HBaseCanary.java

I've attached a simple draft canary tool, that foreach table (or for the 
specified tables) tries to fetch a row from each region server, collects and 
print failures and times.

should this tool be a service that collect/expose stats for each region/column 
family or just a tool to get an idea on the cluster state?

In case this should be just a tool, any idea on the output format, the metrics 
that we want collect and output?

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-24 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237647#comment-13237647
 ] 

Jimmy Xiang commented on HBASE-5606:


This is similar issue as HBASE-5081, right?

Will my original fix proposed for HBASE-5081 help: don't retry distributed log 
splitting before tasks are actually deleted?
We can abort the master after several retry to delete the tasks.

 SplitLogManger async delete node hangs log splitting when ZK connection is 
 lost 
 

 Key: HBASE-5606
 URL: https://issues.apache.org/jira/browse/HBASE-5606
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.0
Reporter: Gopinathan A
Priority: Critical
 Fix For: 0.92.2

 Attachments: 5606.txt


 1. One rs died, the servershutdownhandler found it out and started the 
 distributed log splitting;
 2. All tasks are failed due to ZK connection lost, so the all the tasks were 
 deleted asynchronously;
 3. Servershutdownhandler retried the log splitting;
 4. The asynchronously deletion in step 2 finally happened for new task
 5. This made the SplitLogManger in hanging state.
 This leads to .META. region not assigened for long time
 {noformat}
 hbase-root-master-HOST-192-168-47-204.log.2012-03-14(55413,79):2012-03-14 
 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
 splitlog task at znode 
 /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
 hbase-root-master-HOST-192-168-47-204.log.2012-03-14(89303,79):2012-03-14 
 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
 splitlog task at znode 
 /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
 {noformat}
 {noformat}
 hbase-root-master-HOST-192-168-47-204.log.2012-03-14(80417,99):2012-03-14 
 19:34:31,196 DEBUG 
 org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
 /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
 hbase-root-master-HOST-192-168-47-204.log.2012-03-14(89456,99):2012-03-14 
 19:34:32,497 DEBUG 
 org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
 /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237658#comment-13237658
 ] 

Lars Hofhansl commented on HBASE-5623:
--

Was going by the assumption that an IOException here is actually not bad.
Either the writer was concurrently closed (which means it was rolled as well), 
or any persistent HDFS problem will detected on next write attempt.

Could LOG.debug something like: Informational: Log roll failed. Will be 
retried.


 Race condition when rolling the HLog and hlogFlush
 --

 Key: HBASE-5623
 URL: https://issues.apache.org/jira/browse/HBASE-5623
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
 HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch


 When doing a ycsb test with a large number of handlers 
 (regionserver.handler.count=60), I get the following exceptions:
 {code}
 Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
   at $Proxy1.multi(Unknown Source)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
   at 
 org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
 {code}
 and 
 {code}
   java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
 {code}
 It seems the root cause of the issue is that we open a new log writer and 
 close the old one at HLog#rollWriter() holding the updateLock, but the other 
 threads doing syncer() calls
 {code} 
 logSyncerThread.hlogFlush(this.writer);
 {code}
 without holding the updateLock. LogSyncer only synchronizes against 
 concurrent appends and flush(), but not on the passed writer, which can be 
 closed already by rollWriter(). 

[jira] [Updated] (HBASE-5613) ThriftServer getTableRegions does not return serverName and port

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5613:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 ThriftServer getTableRegions does not return serverName and port
 

 Key: HBASE-5613
 URL: https://issues.apache.org/jira/browse/HBASE-5613
 Project: HBase
  Issue Type: Bug
  Components: thrift
Reporter: Scott Chen
Assignee: Scott Chen
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5613.0.94.2.txt, HBASE-5613.0.94.txt, 
 HBASE-5613.D2403.1.patch, HBASE-5613.D2403.2.patch, HBASE-5613.D2403.3.patch, 
 HBASE-5613.D2403.4.patch, HBASE-5613.D2403.5.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4957:
-

Status: Patch Available  (was: Open)

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4720:
-

Fix Version/s: (was: 0.94.0)
   0.94.1

Let's try for 0.94.1

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.1

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, 
 HBASE-4720.trunk.v7.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular hbase client but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237668#comment-13237668
 ] 

stack commented on HBASE-5598:
--

Propose up on dev list I'd say Jon.  More folks will see your question.

 Analyse and fix the findbugs reporting by QA and add invalid bugs into 
 findbugs-excludeFilter file
 --

 Key: HBASE-5598
 URL: https://issues.apache.org/jira/browse/HBASE-5598
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Uma Maheswara Rao G
Priority: Minor

 There are many findbugs errors reporting by HbaseQA. HBASE-5597 is going to 
 up the OK count.
 This may lead to other issues when we re-factor the code, if we induce new 
 valid ones and remove invalid bugs also can not be reported by QA.
 So, I would propose to add the exclude filter file for findbugs(for the 
 invalid bugs). If we find any valid ones, we can fix under this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237669#comment-13237669
 ] 

stack commented on HBASE-5128:
--

Hurray

Would suggest you stick something in the release note section Jon as means of 
spreading the good news about this fat tool.  What about this section in the 
reference manual: http://hbase.apache.org/book.html#hbck  Should we update it 
some?

Good stuff

 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5434) [REST] Include more metrics in cluster status request

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5434:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.

 [REST] Include more metrics in cluster status request
 -

 Key: HBASE-5434
 URL: https://issues.apache.org/jira/browse/HBASE-5434
 Project: HBase
  Issue Type: Improvement
  Components: metrics, rest
Affects Versions: 0.94.0
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
Priority: Minor
  Labels: noob
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5434.trunk.v1.patch, HBASE-5434.trunk.v2.patch, 
 HBASE-5434.trunk.v2.patch, HBASE-5434.trunk.v2.patch


 /status/cluster shows only
 {code}
 stores=2
 storefiless=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 {code}
 for a region but master web-ui shows
 {code}
 stores=1,
 storefiles=0,
 storefileUncompressedSizeMB=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 readRequestsCount=0
 writeRequestsCount=0
 rootIndexSizeKB=0
 totalStaticIndexSizeKB=0
 totalStaticBloomSizeKB=0
 totalCompactingKVs=0
 currentCompactedKVs=0
 compactionProgressPct=NaN
 {code}
 In a write-heavy REST gateway based production environment, ops team needs to 
 verify whether write counters are getting incremented per region (they do run 
 /status/cluster on each REST server), we can get the same values from 
 *rpc.metrics.put_num_ops* and *hbase.regionserver.writeRequestsCount* but 
 some home-grown tools needs to parse the output of /status/cluster and 
 updates the dashboard.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237675#comment-13237675
 ] 

Lars Hofhansl commented on HBASE-5128:
--

Thanks for getting this done for 0.94, Jon!
+1 on release notes and book update, but doesn't need to hold up 0.94rc

 [uber hbck] Online automated repair of table integrity and region consistency 
 problems
 --

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5128-trunk.addendum, hbase-5128-0.90-v2.patch, 
 hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch, 
 hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, 
 hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, 
 hbase-5128-v3.patch, hbase-5128-v4.patch


 The current (0.90.5, 0.92.0rc2) versions of hbck detects most of region 
 consistency and table integrity invariant violations.  However with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete will likely deprecate 
 the OfflineMetaRepair tool and subsume several open META-hole related issue.
 Here's the approach (from the comment of at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency 
 and
  * table integrity.  
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
  * Table integrity checks verify that that all possible row keys can resolve 
 to
  * exactly one region of a table.  This means there are no individual 
 degenerate
  * or backwards regions; no holes between regions; and that there no 
 overlapping
  * regions. 
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
  * For table integrity repairs, the tables their region directories are 
 scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there 
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping 
 regions,
  * a new region is created and all data is merged into the new region.  
  * 
  * Table integrity repairs deal solely with HDFS and can be done offline -- 
 the
  * hbase region servers or master do not need to be running.  These phase can 
 be
  * use to completely reconstruct the META table in an offline fashion. 
  * 
  * Region consistency requires three conditions -- 1) valid .regioninfo file 
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that is was assigned 
 to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5434) [REST] Include more metrics in cluster status request

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237677#comment-13237677
 ] 

stack commented on HBASE-5434:
--

+1 on commit

 [REST] Include more metrics in cluster status request
 -

 Key: HBASE-5434
 URL: https://issues.apache.org/jira/browse/HBASE-5434
 Project: HBase
  Issue Type: Improvement
  Components: metrics, rest
Affects Versions: 0.94.0
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
Priority: Minor
  Labels: noob
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5434.trunk.v1.patch, HBASE-5434.trunk.v2.patch, 
 HBASE-5434.trunk.v2.patch, HBASE-5434.trunk.v2.patch


 /status/cluster shows only
 {code}
 stores=2
 storefiless=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 {code}
 for a region but master web-ui shows
 {code}
 stores=1,
 storefiles=0,
 storefileUncompressedSizeMB=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 readRequestsCount=0
 writeRequestsCount=0
 rootIndexSizeKB=0
 totalStaticIndexSizeKB=0
 totalStaticBloomSizeKB=0
 totalCompactingKVs=0
 currentCompactedKVs=0
 compactionProgressPct=NaN
 {code}
 In a write-heavy REST gateway based production environment, ops team needs to 
 verify whether write counters are getting incremented per region (they do run 
 /status/cluster on each REST server), we can get the same values from 
 *rpc.metrics.put_num_ops* and *hbase.regionserver.writeRequestsCount* but 
 some home-grown tools needs to parse the output of /status/cluster and 
 updates the dashboard.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237678#comment-13237678
 ] 

Lars Hofhansl commented on HBASE-4393:
--

@Matteo: Ideally this could be used for trending. So output that is suitable 
for Ganglia or OpenTSDB (whatever that means in both cases) would be cool.
Even just a cluster state tool is great.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5633:
-

Attachment: HBASE-5633-v2.patch

What I'm committing... wraps a very long line else what Matteo suppied.

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-5633-v1.patch, HBASE-5633-v2.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237681#comment-13237681
 ] 

Lars Hofhansl commented on HBASE-4393:
--

Java code looks great. Maybe instead of using a scanner in sniffRegion, you 
could use a Get?


 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Fix For: 0.94.0

 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4393:
-

Fix Version/s: 0.94.0

I would like to get this into 0.94.
This needs some of usage description so that folks can find out what you are 
supposed to pass on the command line.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Fix For: 0.94.0

 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5633:
-

   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed 0.94 branch and trunk

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5633-v1.patch, HBASE-5633-v2.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237687#comment-13237687
 ] 

stack commented on HBASE-5633:
--

Thanks for the patch Matteo

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5633-v1.patch, HBASE-5633-v2.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237688#comment-13237688
 ] 

Lars Hofhansl commented on HBASE-5633:
--

+1

 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5633-v1.patch, HBASE-5633-v2.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237694#comment-13237694
 ] 

Lars Hofhansl commented on HBASE-5598:
--

+1 on no-new-findbugs-policy once it's down to 0. (We do the same at 
Salesforce.)
Not sure, though, that right method to get this to 0 is to use an exclude 
filter, should use findbug annotation in the files.

 Analyse and fix the findbugs reporting by QA and add invalid bugs into 
 findbugs-excludeFilter file
 --

 Key: HBASE-5598
 URL: https://issues.apache.org/jira/browse/HBASE-5598
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Uma Maheswara Rao G
Priority: Minor

 There are many findbugs errors reporting by HbaseQA. HBASE-5597 is going to 
 up the OK count.
 This may lead to other issues when we re-factor the code, if we induce new 
 valid ones and remove invalid bugs also can not be reported by QA.
 So, I would propose to add the exclude filter file for findbugs(for the 
 invalid bugs). If we find any valid ones, we can fix under this JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237697#comment-13237697
 ] 

stack commented on HBASE-4393:
--

I wrote a note to suggest this NOT be added to 0.94 because its only basic and 
its apart from hbase so we shouldn't have to hold up hbase to get it in.  Was 
also going to talk about this tool being too basic -- emissions are on stdout 
only rather than up in jmx, formatted as json or whatever -- but then I thought 
we have to start somewhere.   We can add to this basic tool later.

The class needs a license and a class comment.

Should be called Canary rather than HBaseCanary.

Put it into a package.  Would suggest we start a tool package so o.a.h.h.tool.

Should implement Tool and be run using ToolRunner.  Tool adds a little useful 
util.

Needs usage as per lars.

Could be added to bin/hbase as 'canary' -- could start/stop it like we 
start/stop region.  If you do this, then things like log name and location will 
be set up for you as it is for rest server and thrift server etc.

Should output be via LOG rather than stdout?   Then we can hook its output up 
variously.

Skip formatting in the output...the ' - Region ..' i.e. remove the ' - ' prefix.

I think make the few small changes above and we'd have a good start.  Thanks 
lads. Good stuff



 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Fix For: 0.94.0

 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4393:
-

Fix Version/s: (was: 0.94.0)

You are more iron-handy than me, stack. Your points are well taken, 
unscheduling.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237699#comment-13237699
 ] 

stack commented on HBASE-4393:
--

Hmm... this thing comes up reports and goes down immediately so some of my 
suggestions above may be OTT.  So, I don't think we need the following to get 
the script in (we can add it later):

Could be added to bin/hbase as 'canary' – could start/stop it like we 
start/stop region. If you do this, then things like log name and location will 
be set up for you as it is for rest server and thrift server etc.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Amandeep Khurana
 Attachments: HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237700#comment-13237700
 ] 

Hadoop QA commented on HBASE-4957:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519579/hbase-4957.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1299//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1299//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1299//console

This message is automatically generated.

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5633) NPE reading ZK config in HBase

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237701#comment-13237701
 ] 

Hudson commented on HBASE-5633:
---

Integrated in HBase-0.94 #53 (See 
[https://builds.apache.org/job/HBase-0.94/53/])
HBASE-5633 NPE reading ZK config in HBase (Revision 1304925)

 Result = SUCCESS
stack : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java


 NPE reading ZK config in HBase
 --

 Key: HBASE-5633
 URL: https://issues.apache.org/jira/browse/HBASE-5633
 Project: HBase
  Issue Type: Bug
  Components: zookeeper
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-5633-v1.patch, HBASE-5633-v2.patch


 If zoo.cfg contains server.* (server.0=server0:2888:3888\n) and 
 cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
 parseZooCfg().
 The easy way to reproduce the bug is running 
 org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
 {code}
 property
   namehbase.cluster.distributed/name
   value/value
 /property
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5434) [REST] Include more metrics in cluster status request

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237702#comment-13237702
 ] 

Hudson commented on HBASE-5434:
---

Integrated in HBase-0.94 #53 (See 
[https://builds.apache.org/job/HBase-0.94/53/])
HBASE-5434 [REST] Include more metrics in cluster status request (Mubarak 
Seyed) (Revision 1304918)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/StorageClusterStatusResource.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/model/StorageClusterStatusModel.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/StorageClusterStatusMessage.java
* 
/hbase/branches/0.94/src/main/resources/org/apache/hadoop/hbase/rest/XMLSchema.xsd
* 
/hbase/branches/0.94/src/main/resources/org/apache/hadoop/hbase/rest/protobuf/StorageClusterStatusMessage.proto
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/rest/model/TestStorageClusterStatusModel.java


 [REST] Include more metrics in cluster status request
 -

 Key: HBASE-5434
 URL: https://issues.apache.org/jira/browse/HBASE-5434
 Project: HBase
  Issue Type: Improvement
  Components: metrics, rest
Affects Versions: 0.94.0
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
Priority: Minor
  Labels: noob
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5434.trunk.v1.patch, HBASE-5434.trunk.v2.patch, 
 HBASE-5434.trunk.v2.patch, HBASE-5434.trunk.v2.patch


 /status/cluster shows only
 {code}
 stores=2
 storefiless=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 {code}
 for a region but master web-ui shows
 {code}
 stores=1,
 storefiles=0,
 storefileUncompressedSizeMB=0
 storefileSizeMB=0
 memstoreSizeMB=0
 storefileIndexSizeMB=0
 readRequestsCount=0
 writeRequestsCount=0
 rootIndexSizeKB=0
 totalStaticIndexSizeKB=0
 totalStaticBloomSizeKB=0
 totalCompactingKVs=0
 currentCompactedKVs=0
 compactionProgressPct=NaN
 {code}
 In a write-heavy REST gateway based production environment, ops team needs to 
 verify whether write counters are getting incremented per region (they do run 
 /status/cluster on each REST server), we can get the same values from 
 *rpc.metrics.put_num_ops* and *hbase.regionserver.writeRequestsCount* but 
 some home-grown tools needs to parse the output of /status/cluster and 
 updates the dashboard.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237703#comment-13237703
 ] 

Lars Hofhansl commented on HBASE-4957:
--

TestSplitLogManager passed locally. Going to commit.

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4147) StoreFile query usage report

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4147:
-

 Priority: Critical  (was: Minor)
Fix Version/s: 0.96.0

Upping priority and marking against 0.96.0 so it gets more consideration.

 StoreFile query usage report
 

 Key: HBASE-4147
 URL: https://issues.apache.org/jira/browse/HBASE-4147
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Priority: Critical
 Fix For: 0.96.0

 Attachments: hbase_4147_storefilereport.pdf, 
 hbase_4147_storefilereport_2011_08_10.pdf


 Detailed information on what HBase is doing in terms of reads is hard to come 
 by.
 What would be useful is to have a periodic StoreFile query report.  
 Specifically, this could run on a configured interval (e.g., every 30 
 seconds, 60 seconds) and dump the output to the log files.
 This would have all StoreFiles accessed during the reporting period (and with 
 the Path we would also know region, CF, and table), # of times the StoreFile 
 was accessed, the size of the StoreFile, and the total time (ms) spent 
 processing that StoreFile.
 Even this level of summary would be useful to detect a which tables  CFs are 
 being accessed the most, and including the StoreFile would provide insight 
 into relative uncompaction (i.e., lots of StoreFiles).
 I think the log-output, as opposed to UI, is an important facet with this.  
 I'm assuming that users will slice and dice this data on their own so I think 
 we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
 expose this data).  Just getting this to log-file would be a big improvement.
 Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
 it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
  
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4147) StoreFile query usage report

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237708#comment-13237708
 ] 

Lars Hofhansl commented on HBASE-4147:
--

Should we output the statistics via JMX as well?

 StoreFile query usage report
 

 Key: HBASE-4147
 URL: https://issues.apache.org/jira/browse/HBASE-4147
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Priority: Critical
 Fix For: 0.96.0

 Attachments: hbase_4147_storefilereport.pdf, 
 hbase_4147_storefilereport_2011_08_10.pdf


 Detailed information on what HBase is doing in terms of reads is hard to come 
 by.
 What would be useful is to have a periodic StoreFile query report.  
 Specifically, this could run on a configured interval (e.g., every 30 
 seconds, 60 seconds) and dump the output to the log files.
 This would have all StoreFiles accessed during the reporting period (and with 
 the Path we would also know region, CF, and table), # of times the StoreFile 
 was accessed, the size of the StoreFile, and the total time (ms) spent 
 processing that StoreFile.
 Even this level of summary would be useful to detect a which tables  CFs are 
 being accessed the most, and including the StoreFile would provide insight 
 into relative uncompaction (i.e., lots of StoreFiles).
 I think the log-output, as opposed to UI, is an important facet with this.  
 I'm assuming that users will slice and dice this data on their own so I think 
 we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
 expose this data).  Just getting this to log-file would be a big improvement.
 Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
 it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
  
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5533:
-

Status: Open  (was: Patch Available)

 Add more metrics to HBase
 -

 Key: HBASE-5533
 URL: https://issues.apache.org/jira/browse/HBASE-5533
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.2, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Minor
 Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
 HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, hbase-5533-0.92.patch, 
 hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, 
 histogram_web_ui.png


 To debug/monitor production clusters, there are some more metrics I wish I 
 had available.
 In particular:
 - Although the average FS latencies are useful, a 'histogram' of recent 
 latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
 would be more useful
 - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
 would be useful
 - Counting the number of accesses to each region to detect hotspotting
 - Exposing the current number of HLog files

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5533:
-

Status: Patch Available  (was: Open)

 Add more metrics to HBase
 -

 Key: HBASE-5533
 URL: https://issues.apache.org/jira/browse/HBASE-5533
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.2, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Minor
 Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
 HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, hbase-5533-0.92.patch, 
 hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, 
 histogram_web_ui.png


 To debug/monitor production clusters, there are some more metrics I wish I 
 had available.
 In particular:
 - Although the average FS latencies are useful, a 'histogram' of recent 
 latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
 would be more useful
 - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
 would be useful
 - Counting the number of accesses to each region to detect hotspotting
 - Exposing the current number of HLog files

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4957:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4957:
-

Fix Version/s: 0.96.0

 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237713#comment-13237713
 ] 

Lars Hofhansl commented on HBASE-5623:
--

Any strong opinions other than the log message?
This is the last 0.94.0 issue. I think my latest proposed patch fixes the 
problem while not impacting maintainability.


 Race condition when rolling the HLog and hlogFlush
 --

 Key: HBASE-5623
 URL: https://issues.apache.org/jira/browse/HBASE-5623
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
 HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch


 When doing a ycsb test with a large number of handlers 
 (regionserver.handler.count=60), I get the following exceptions:
 {code}
 Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
   at $Proxy1.multi(Unknown Source)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
   at 
 org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
 {code}
 and 
 {code}
   java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
 {code}
 It seems the root cause of the issue is that we open a new log writer and 
 close the old one at HLog#rollWriter() holding the updateLock, but the other 
 threads doing syncer() calls
 {code} 
 logSyncerThread.hlogFlush(this.writer);
 {code}
 without holding the updateLock. LogSyncer only synchronizes against 
 concurrent appends and flush(), but not on the passed writer, which can be 
 closed already by rollWriter(). In this case, since 
 SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically 

[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-24 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237714#comment-13237714
 ] 

Ted Yu commented on HBASE-5623:
---

Adding LOG.debug should be fine.

 Race condition when rolling the HLog and hlogFlush
 --

 Key: HBASE-5623
 URL: https://issues.apache.org/jira/browse/HBASE-5623
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
 HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch


 When doing a ycsb test with a large number of handlers 
 (regionserver.handler.count=60), I get the following exceptions:
 {code}
 Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
   at $Proxy1.multi(Unknown Source)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
   at 
 org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
 {code}
 and 
 {code}
   java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
 {code}
 It seems the root cause of the issue is that we open a new log writer and 
 close the old one at HLog#rollWriter() holding the updateLock, but the other 
 threads doing syncer() calls
 {code} 
 logSyncerThread.hlogFlush(this.writer);
 {code}
 without holding the updateLock. LogSyncer only synchronizes against 
 concurrent appends and flush(), but not on the passed writer, which can be 
 closed already by rollWriter(). In this case, since 
 SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Commented] (HBASE-4957) Clean up some log messages, code in RecoverableZooKeeper

2012-03-24 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237716#comment-13237716
 ] 

Hudson commented on HBASE-4957:
---

Integrated in HBase-0.94 #54 (See 
[https://builds.apache.org/job/HBase-0.94/54/])
HBASE-4957 Clean up some log messages, code in RecoverableZooKeeper (Todd) 
(Revision 1304941)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java


 Clean up some log messages, code in RecoverableZooKeeper
 

 Key: HBASE-4957
 URL: https://issues.apache.org/jira/browse/HBASE-4957
 Project: HBase
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: hbase-4957.txt, hbase-4957.txt, hbase-4957.txt, 
 hbase-4957.txt


 In RecoverableZooKeeper, there are a number of log messages and comments 
 which don't really read correctly, and some other pieces of code that can be 
 cleaned up. Simple cleanup - shouldn't be any actual behavioral changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237726#comment-13237726
 ] 

stack commented on HBASE-5604:
--

I like the postgres link.  Should point at that when we doc this tool.

bq. This could an M/R (with a mapper per HLog file).

You'll need a reduce (I think you deduce this yourself above but saying it for 
completeness).  If a map-only MR job, if many WALs, say 100s, you could end up 
w/ the same amount of hfiles per region (if each WAL had at least one edit for 
this region).  You'd need a reducer to coalesce by region.

This tool would not apply the edits in the order in which we received them.  
We'd be reliant on sort order only which should be fine I think since this is 
what happens if they were instead inserted via the memstore anyways.

bq. Hmm... Maybe this is only useful when we have a lot of logs maybe there 
would be no advantage here turning this in an M/R job, but maybe it should just 
be a standalone client...?

Well, even if tens of files only, you'd want to //ize it to do the filtering, 
etc., so MR sounds right.

Or you could hack on the distributed split code to add a 'filtering' 
facility... so it dropped edits that were outside of a range -- e.g. not one of 
the specified tables or not of a time range.  The output of distributed log 
splitting is only replayed on region open so you'd need to figure how to get 
the region to load the edits (An MR job to write hfiles sounds way more 
straightforward relatively).


 HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl

 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could an M/R (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter all WALEdits that didn't 
 fit into the time range or are not any of the tables and then uses 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237744#comment-13237744
 ] 

Lars Hofhansl commented on HBASE-5604:
--

I went the distributed log splitting route first a while back. Had it all 
working in fact.
Or so I thought. Then I tried playing logs to a new table and realized that 
distributed log splitting only works for crash recovery (before the regions 
could split further) because it splits using the region name in the log.
That makes sense, because otherwise each region server participating in log 
splitting would need to look up the current region for each encountered row 
otherwise (the region could have split and the row in question needs to go one 
of the daugthers). That is essentially what the highlevel API does anyway.

Yes, definitely need a reducer.

Similar to what I did for Import, I can see this working in two modes:
# The mappers directly apply changes to a running HBase cluster 
(TableOutputformat). No reducers needed in this case.
# Create HFiles via HFileOutputFormat in the reduce phase.

In fact this tool would probably be very much like Import, just with a 
different InputFormat.


 HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl

 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could an M/R (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter all WALEdits that didn't 
 fit into the time range or are not any of the tables and then uses 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237746#comment-13237746
 ] 

Lars Hofhansl commented on HBASE-5623:
--

Will wait for Enis to have a look at latest patch.

 Race condition when rolling the HLog and hlogFlush
 --

 Key: HBASE-5623
 URL: https://issues.apache.org/jira/browse/HBASE-5623
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Critical
 Fix For: 0.94.0

 Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
 HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch


 When doing a ycsb test with a large number of handlers 
 (regionserver.handler.count=60), I get the following exceptions:
 {code}
 Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
   at $Proxy1.multi(Unknown Source)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
   at 
 org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
 {code}
 and 
 {code}
   java.lang.NullPointerException
   at 
 org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
   at 
 org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
   at 
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
   at 
 org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
 {code}
 It seems the root cause of the issue is that we open a new log writer and 
 close the old one at HLog#rollWriter() holding the updateLock, but the other 
 threads doing syncer() calls
 {code} 
 logSyncerThread.hlogFlush(this.writer);
 {code}
 without holding the updateLock. LogSyncer only synchronizes against 
 concurrent appends and flush(), but not on the passed writer, which can be 
 closed already by rollWriter(). In this case, since 
 SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237778#comment-13237778
 ] 

stack commented on HBASE-5604:
--

So, just need to write a smart mapper that takes filtering params and a 
WALInputFormat?


 HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl

 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could an M/R (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter all WALEdits that didn't 
 fit into the time range or are not any of the tables and then uses 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5533:
-

Status: Open  (was: Patch Available)

 Add more metrics to HBase
 -

 Key: HBASE-5533
 URL: https://issues.apache.org/jira/browse/HBASE-5533
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.2, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Minor
 Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
 HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, hbase-5533-0.92.patch, 
 hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, 
 histogram_web_ui.png


 To debug/monitor production clusters, there are some more metrics I wish I 
 had available.
 In particular:
 - Although the average FS latencies are useful, a 'histogram' of recent 
 latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
 would be more useful
 - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
 would be useful
 - Counting the number of accesses to each region to detect hotspotting
 - Exposing the current number of HLog files

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-24 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5533:
-

Attachment: HBASE-5533-TRUNK-v6.patch

Reupload to see if it triggers hadoopqa

 Add more metrics to HBase
 -

 Key: HBASE-5533
 URL: https://issues.apache.org/jira/browse/HBASE-5533
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.2, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Minor
 Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
 HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
 hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
 hbase5533-0.92-v5.patch, histogram_web_ui.png


 To debug/monitor production clusters, there are some more metrics I wish I 
 had available.
 In particular:
 - Although the average FS latencies are useful, a 'histogram' of recent 
 latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
 would be more useful
 - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
 would be useful
 - Counting the number of accesses to each region to detect hotspotting
 - Exposing the current number of HLog files

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237781#comment-13237781
 ] 

stack commented on HBASE-5564:
--

Patch seems reasonable.

Add curlies here:

{code}
+  if (parser.getTimestampKeyColumnIndex() != -1)
+ts = parsed.getTimestamp();
{code}

Convention is you can do w/o curlies if all in one line (as you do later in 
this file) but if not on one line, need curlies.

Can you confirm that current behavior -- setting ts to System.currentTimeMillis 
-- is default?  It seems to be ... we set System.currentTimeMillis as time to 
use setting up the job.

A define for NO_TIMESTAMP_KEYCOLUMN_INDEX instead of using -1 directly might 
help for timestampKeyColumnIndex == -1?  Or put this test into a method whose 
name makes it obvious what the test is about ... e.g. hasTimeStampColumn()

Patch adds nice usage commentary explaining new facility.

Looks good.

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader
 Fix For: 0.96.0

 Attachments: 5564.lint, HBASE-5564_trunk.1.patch, 
 HBASE-5564_trunk.1.patch, HBASE-5564_trunk.patch


 Duplicate records are getting discarded when duplicate records exists in same 
 input file and more specifically if they exists in same split.
 Duplicate records are considered if the records are from diffrent different 
 splits.
 Version under test: HBase 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira