[jira] [Updated] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5682:
    Attachment: 5682-all-v3.txt

Patch that removes the log statement Stack mentioned (I had it in there for earlier debugging and forgot to remove it).
It also adds a simple test with an HConnection that is created before the mini-cluster is started, to prove that initialization is indeed lazy. (I can't test by stopping and restarting the mini-cluster, as new random ports are used each time.)

Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
    Key: HBASE-5682
    URL: https://issues.apache.org/jira/browse/HBASE-5682
    Project: HBase
    Issue Type: Improvement
    Components: client
    Reporter: Lars Hofhansl
    Assignee: Lars Hofhansl
    Priority: Critical
    Fix For: 0.94.0
    Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt

Just realized that without this HBASE-4805 is broken. I.e., there's no point keeping a persistent HConnection around if it can be rendered permanently unusable when the ZK connection is lost temporarily.
Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
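The two properties discussed in the message above (lazy initialization, and recovery after a temporary ZK connection loss) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: LazyConnection, Session, and SessionFactory are hypothetical names that stand in for the HConnection and ZooKeeper classes, and the real code paths are likely more involved.

```java
import java.io.IOException;

/** Hypothetical stand-in for a ZooKeeper session; not an HBase or ZooKeeper class. */
interface Session {
    String read(String path) throws IOException;
    void close();
}

/** Hypothetical factory that opens a fresh session on demand. */
interface SessionFactory {
    Session open() throws IOException;
}

/** Opens its session lazily and drops it on failure so the next call reconnects. */
class LazyConnection {
    private final SessionFactory factory;
    private Session session; // stays null until the first read

    LazyConnection(SessionFactory factory) {
        this.factory = factory; // nothing is opened here: initialization is lazy
    }

    synchronized String read(String path) throws IOException {
        if (session == null) {
            session = factory.open();
        }
        try {
            return session.read(path);
        } catch (IOException e) {
            // Close and forget the broken session instead of marking the
            // whole connection as permanently unusable.
            session.close();
            session = null;
            throw e;
        }
    }
}
```

With this shape, a caller that keeps one LazyConnection around sees a single failed call when the session is lost, and the following call transparently opens a new session.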
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243992#comment-13243992 ]

Lars Hofhansl commented on HBASE-5682:

all-v3 is what I'd like to commit tomorrow if there are no objections.
[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.
[ https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243995#comment-13243995 ]

Hudson commented on HBASE-5436:

Integrated in HBase-0.92 #349 (See [https://builds.apache.org/job/HBase-0.92/349/])
    HBASE-5436 Right-size the map when reading attributes (Benoit) (Revision 1308232)

    Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/OperationWithAttributes.java

Right-size the map when reading attributes.
    Key: HBASE-5436
    URL: https://issues.apache.org/jira/browse/HBASE-5436
    Project: HBase
    Issue Type: Improvement
    Affects Versions: 0.92.0
    Reporter: Benoit Sigoure
    Assignee: Benoit Sigoure
    Priority: Trivial
    Labels: performance
    Fix For: 0.94.0
    Attachments: 0001-Right-size-the-map-when-reading-attributes.patch
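For readers unfamiliar with the idea behind HBASE-5436, "right-sizing" means allocating the map with a capacity derived from the entry count that is read off the wire, so that filling it does not trigger intermediate resizes. The sketch below is an assumption-laden illustration, not the committed change: AttributeCodec and its wire layout are hypothetical, and the capacity formula is one common choice that the actual patch may not use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical codec for an attribute map; not the HBase wire format. */
class AttributeCodec {
    /** Writes the entry count, then each key (UTF) and a length-prefixed value. */
    static byte[] write(Map<String, byte[]> attrs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(attrs.size());
        for (Map.Entry<String, byte[]> e : attrs.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeInt(e.getValue().length);
            out.write(e.getValue());
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads the map back, sizing it up front from the entry count. */
    static Map<String, byte[]> read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        // Right-size: choose a capacity large enough that `count` entries fit
        // under HashMap's default load factor of 0.75 without a resize.
        Map<String, byte[]> attrs = new HashMap<>((int) (count / 0.75f) + 1);
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            byte[] value = new byte[in.readInt()];
            in.readFully(value);
            attrs.put(key, value);
        }
        return attrs;
    }
}
```

The saving per call is small, which matches the issue's "Trivial" priority, but it applies to every operation that carries attributes.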
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244006#comment-13244006 ]

jirapos...@reviews.apache.org commented on HBASE-5451:

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. Some more questions. Just being careful DD.

That's fine. Hope the answers below are okay. Please let me know your response soon so that I can submit another patch.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java, line 25
bq. https://reviews.apache.org/r/4096/diff/3/?file=97739#file97739line25
bq.
bq. We should just be using the hadoop DOOS... looks like no diff (when I diff them). I'll make an issue to remove.

cool

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java, line 446
bq. https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line446
bq.
bq. Is this written up anywhere? That it's hrpc, then version, then a length, then a protobuf?
bq.
bq. I see it in the proto definition. That'll do.

cool

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java, line 548
bq. https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line548
bq.
bq. We have an issue for removing this Invocation stuff?

No, not yet. But I'll create one for it once this patch is committed.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 25
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line25
bq.
bq. Should we just remove them in the next iteration on rpc since 0.96 is to be a singularity? Why even bother trying to keep compatibility w/ older clients?
bq.
bq. What is 'failure compatibility'? We are telling the client to go away, nicely (smile).
bq.
bq. What do you think we should replace hrpc0x0005 with?
bq.
bq. this - these

Yeah, valid points. We can remove this version string and all in a follow-up patch.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 28
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line28
bq.
bq. How does RpcRequestWithHeaderProto relate to ConnectionHeaderProto? This text should say?
bq.
bq. Would be nice to have an illustration of how the back and forth works.

The latter is used only while establishing connections and the former for exchanging RPC requests/responses over a channel that is connected. Okay, will add some text.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 55
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line55
bq.
bq. We'll send this String each time?

Actually, I could make this field 'optional' since it has a default value. Will do so.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 66
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.
bq. Which part in here is the 'header'? How does it relate to ConnectionHeaderProto?
bq.
bq. request can be an Invocation/Writable? Or a protobuf? Do we need a length in here?

Today the only 'header' is the callId. There is no relation to ConnectionHeaderProto. If the 'header' is confusing, I can take it off the object name. Let me know.
'request' in this patch is only an Invocation/Writable. In theory, it could be a protobuf object as well (since it is just bytes), but for protobuf we could make things more explicit by defining a protobuf object rather than an opaque set of bytes. But that's another jira (a ProtoBufRpcEngine implementation similar to Hadoop's).
Length is not needed: the protobuf serialization/deserialization will take care of it.

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93
bq.
bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers to?

Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing,
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5693:
    Attachment: 5593.v2.patch

When creating a region, the master initializes it and creates a memstore within the master server
    Key: HBASE-5693
    URL: https://issues.apache.org/jira/browse/HBASE-5693
    Project: HBase
    Issue Type: Improvement
    Components: master, regionserver
    Affects Versions: 0.96.0
    Reporter: nkeywal
    Assignee: nkeywal
    Priority: Minor
    Attachments: 5593.v2.patch, 5693.v1.patch

I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5693:
    Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5693:
    Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244021#comment-13244021 ]

Hadoop QA commented on HBASE-5693:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12520876/5593.v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
                  org.apache.hadoop.hbase.client.TestFromClientSide
                  org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
                  org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1364//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1364//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1364//console

This message is automatically generated.
[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244086#comment-13244086 ]

Hadoop QA commented on HBASE-5694:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12520856/HBASE-5694.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1366//console

This message is automatically generated.

getRowsWithColumnsTs function Thrift service incorrectly handles time range
    Key: HBASE-5694
    URL: https://issues.apache.org/jira/browse/HBASE-5694
    Project: HBase
    Issue Type: Bug
    Components: thrift
    Affects Versions: 0.92.1
    Reporter: Wouter Bolsterlee
    Fix For: 0.92.2
    Attachments: HBASE-5694.patch

The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows columns to be left unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside the wrong if statement.
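The scoping issue described in HBASE-5694 can be illustrated with a small, self-contained sketch. Everything here is hypothetical: ScanSpec and TimeRangeScoping are made-up names, not the Thrift server classes, and the lower bound of 0 for the time range is an assumption chosen for the example. The point is only the placement of the time-range call relative to the if statement.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical request object; stands in for the real Get/Scan. */
class ScanSpec {
    final List<String> columns = new ArrayList<>();
    long minStamp = 0L;
    long maxStamp = Long.MAX_VALUE; // default: no upper bound

    void setTimeRange(long min, long max) {
        this.minStamp = min;
        this.maxStamp = max;
    }
}

class TimeRangeScoping {
    /** Buggy shape: the time range is only applied when columns are given. */
    static ScanSpec buggy(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null) {
            scan.columns.addAll(columns);
            scan.setTimeRange(0L, timestamp); // wrong scope
        }
        return scan;
    }

    /** Fixed shape: the time range is applied unconditionally. */
    static ScanSpec fixed(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null) {
            scan.columns.addAll(columns);
        }
        scan.setTimeRange(0L, timestamp);
        return scan;
    }
}
```

Calling buggy(null, 100L) leaves the upper bound at Long.MAX_VALUE, so the timestamp is silently ignored, while fixed(null, 100L) bounds the request at 100.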
[jira] [Commented] (HBASE-4393) Implement a canary monitoring program
[ https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244087#comment-13244087 ]

Hadoop QA commented on HBASE-4393:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12519998/Canary-v0.java
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1367//console

This message is automatically generated.

Implement a canary monitoring program
    Key: HBASE-4393
    URL: https://issues.apache.org/jira/browse/HBASE-4393
    Project: HBase
    Issue Type: New Feature
    Components: monitoring
    Affects Versions: 0.92.0
    Reporter: Todd Lipcon
    Assignee: Matteo Bertozzi
    Attachments: Canary-v0.java, HBaseCanary.java

This JIRA is to implement a standalone program that can be used to do canary monitoring of a running HBase cluster. This program would gather a list of the regions in the cluster, then iterate over them doing lightweight operations (e.g. short scans) to provide metrics about latency as well as alert on availability issues.
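The canary loop described in HBASE-4393 (list the regions, run a lightweight operation against each, record latency, flag failures) might look roughly like the sketch below. It makes no claim about the attached Canary-v0.java: RegionProbe and CanarySketch are hypothetical names, the probe is left abstract rather than calling any HBase client API, and the sentinel value for failures is an arbitrary choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lightweight operation against one region, e.g. a short scan. */
interface RegionProbe {
    void probe(String regionName) throws Exception;
}

class CanarySketch {
    /** Sentinel latency recorded for a region whose probe failed. */
    static final long FAILED = -1L;

    /** Probes every region once; returns latency in milliseconds per region. */
    static Map<String, Long> sweep(List<String> regions, RegionProbe probe) {
        Map<String, Long> latencies = new LinkedHashMap<>();
        for (String region : regions) {
            long start = System.nanoTime();
            try {
                probe.probe(region);
                latencies.put(region, (System.nanoTime() - start) / 1_000_000L);
            } catch (Exception e) {
                // An availability problem: record it and keep sweeping so one
                // bad region does not hide the state of the others.
                latencies.put(region, FAILED);
            }
        }
        return latencies;
    }
}
```

A monitoring wrapper could run sweep on a schedule, export the latencies as metrics, and raise an alert for any region mapped to FAILED.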
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244093#comment-13244093 ]

Jieshan Bean commented on HBASE-5682:

Everything seems good to me. Only a minor doubt: is it necessary to close zooKeeper before setting it to null?
If HConnectionImplementation#managed is true, HConnectionImplementation#abort doesn't set closed to true; it just calls the close method. That makes sense to me :).
So the retry logic introduced in HBASE-5153 seems redundant. If someone wants to manage the connection themselves and the connection is aborted, we should suggest recreating the HConnection and HTable, right?
[jira] [Commented] (HBASE-5644) [findbugs] Fix null pointer warnings.
[ https://issues.apache.org/jira/browse/HBASE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244095#comment-13244095 ]

Hadoop QA commented on HBASE-5644:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12520867/HBASE-5644.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
                  org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1365//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1365//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1365//console

This message is automatically generated.

[findbugs] Fix null pointer warnings.
    Key: HBASE-5644
    URL: https://issues.apache.org/jira/browse/HBASE-5644
    Project: HBase
    Issue Type: Sub-task
    Components: scripts
    Reporter: Jonathan Hsieh
    Assignee: Uma Maheswara Rao G
    Attachments: HBASE-5644.patch, NullPointerFindBugs_Analysis.xlsx

See https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Fix the NP category
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-5666:
    Attachment: HBASE-5666-v3.patch

RegionServer doesn't retry to check if base node is available
    Key: HBASE-5666
    URL: https://issues.apache.org/jira/browse/HBASE-5666
    Project: HBase
    Issue Type: Bug
    Components: regionserver, zookeeper
    Reporter: Matteo Bertozzi
    Assignee: Matteo Bertozzi
    Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log

I have a script that starts HBase and a couple of region servers in distributed mode (hbase.cluster.distributed = true)
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start...
It seems that during the RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
	at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
	at java.lang.Thread.run(Thread.java:662)
{code}
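The behaviour the report asks for, checking for the base znode more than once before giving up, amounts to a bounded retry loop. The sketch below shows that general shape only and is not taken from any of the attached patches: NodeChecker and BaseNodeWaiter are hypothetical names, and the attempt count and sleep interval are illustrative parameters rather than HBase configuration keys.

```java
/** Hypothetical existence check; stands in for a ZooKeeper exists() call. */
interface NodeChecker {
    boolean exists(String path) throws Exception;
}

class BaseNodeWaiter {
    /**
     * Checks for the node up to maxAttempts times, sleeping between attempts.
     * Returns true as soon as the node is seen, false if it never appears.
     */
    static boolean waitForNode(NodeChecker checker, String path,
                               int maxAttempts, long sleepMillis) throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (checker.exists(path)) {
                return true;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis); // give the master time to create the node
            }
        }
        return false;
    }
}
```

A region server using this shape would abort only after the final attempt fails, which covers the start-up race where the master has not yet created the base node.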
[jira] [Created] (HBASE-5698) Add new coprocessor hooks in doMiniBatchPut
Add new coprocessor hooks in doMiniBatchPut
    Key: HBASE-5698
    URL: https://issues.apache.org/jira/browse/HBASE-5698
    Project: HBase
    Issue Type: Improvement
    Affects Versions: 0.96.0
    Reporter: ramkrishna.s.vasudevan
    Assignee: ramkrishna.s.vasudevan

As discussed in the JIRA HBASE-5617, this JIRA has been raised to add new hooks to doMiniBatchPut.
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244120#comment-13244120 ]

Hadoop QA commented on HBASE-5666:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12520963/HBASE-5666-v3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
                  org.apache.hadoop.hbase.util.TestHBaseFsck
                  org.apache.hadoop.hbase.client.TestFromClientSide

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1368//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1368//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1368//console

This message is automatically generated.
[jira] [Commented] (HBASE-5695) Use Hadoop's DataOutputOutputStream instead of have a copy local
[ https://issues.apache.org/jira/browse/HBASE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244123#comment-13244123 ]

ramkrishna.s.vasudevan commented on HBASE-5695:

This is the same as HBASE-5696. Did you intend something else as the topic of this JIRA?

Use Hadoop's DataOutputOutputStream instead of have a copy local
    Key: HBASE-5695
    URL: https://issues.apache.org/jira/browse/HBASE-5695
    Project: HBase
    Issue Type: Improvement
    Reporter: stack
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244127#comment-13244127 ]

Uma Maheswara Rao G commented on HBASE-3909:

I think syncing the configuration across clusters is mostly the job of OM-type tools. Bringing that into Hadoop/HBase may not be correct. I feel the scope of the current issue should be to allow some way to apply in-memory config updates without restarting the node.
And I agree with Todd: OM tools are good at managing configs.
{quote}
...operations teams are very good at managing text-based configuration files with tools like puppet, cfengine, etc. It's also easy to version-control these kinds of configs, add <!-- comments -->, etc. Moving to ZK makes these tasks more difficult – we'd need lots of tooling, etc.
{quote}
The current limitation is that, even though OM tools are capable of updating the configuration everywhere, there is no way to make the nodes pick up the new configs without restarting them.
I am thinking of proceeding with a HADOOP-7001 kind of implementation, if there are no objections.
Also, as a next step we can provide options such as updating configs from the shell and a command to reload the config from disk, etc.

Add dynamic config
    Key: HBASE-3909
    URL: https://issues.apache.org/jira/browse/HBASE-3909
    Project: HBase
    Issue Type: Bug
    Reporter: stack
    Fix For: 0.96.0

I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no harm in this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we look at how Hadoop did it over in HADOOP-7001.
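The in-memory update proposed in the comment above (apply new config values without restarting the node) can be sketched as an atomically swapped snapshot. This is an illustration under assumptions only: ReloadableConfig is a hypothetical class, it does not reproduce the HADOOP-7001 API, and where the fresh values come from (a file re-read, a shell command) is deliberately left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder whose values can be replaced while the process runs. */
class ReloadableConfig {
    private final AtomicReference<Map<String, String>> current;

    ReloadableConfig(Map<String, String> initial) {
        this.current = new AtomicReference<>(snapshot(initial));
    }

    /** Readers always see one complete snapshot, never a half-applied reload. */
    String get(String key, String defaultValue) {
        String value = current.get().get(key);
        return value != null ? value : defaultValue;
    }

    /** Swaps in a new snapshot, e.g. after the config file was re-read from disk. */
    void reload(Map<String, String> fresh) {
        current.set(snapshot(fresh));
    }

    private static Map<String, String> snapshot(Map<String, String> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }
}
```

Swapping an immutable snapshot keeps reads lock-free. Components that cache a value at start-up would still need a notification hook to pick up the change, which this sketch does not cover.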
[jira] [Updated] (HBASE-5636) TestTableMapReduce doesn't work properly.
[ https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5636:
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

Integrated to trunk and 0.94.
Thanks for the patch, Takuya.

TestTableMapReduce doesn't work properly.
    Key: HBASE-5636
    URL: https://issues.apache.org/jira/browse/HBASE-5636
    Project: HBase
    Issue Type: Bug
    Components: test
    Affects Versions: 0.92.1, 0.94.0
    Reporter: Takuya Ueshin
    Assignee: Takuya Ueshin
    Attachments: HBASE-5636-v2.patch, HBASE-5636.patch

No map function is called because no test data is put before the test starts.
The following three tests are in the same situation:
- org.apache.hadoop.hbase.mapred.TestTableMapReduce
- org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
- org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper
[jira] [Commented] (HBASE-5663) MultithreadedTableMapper doesn't work.
[ https://issues.apache.org/jira/browse/HBASE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244189#comment-13244189 ]

Zhihong Yu commented on HBASE-5663:
---
Integrated to trunk and 0.94. Thanks for the patch, Takuya.

MultithreadedTableMapper doesn't work.
--
Key: HBASE-5663
URL: https://issues.apache.org/jira/browse/HBASE-5663
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
Fix For: 0.94.0, 0.96.0
Attachments: 5663+5636.txt, HBASE-5663.patch

MapReduce job using MultithreadedTableMapper goes down throwing the following Exception:
{noformat}
java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:260)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper.run(MultithreadedTableMapper.java:133)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
    at java.lang.Class.getConstructor0(Class.java:2706)
    at java.lang.Class.getConstructor(Class.java:1657)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:241)
    ... 8 more
{noformat}
This occurred when the tasks were creating MapRunner threads.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
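The `NoSuchMethodException` in the HBASE-5663 trace above is the kind of failure `Class.getConstructor` produces when it is asked for parameter types that do not exactly match a declared constructor, and the trace shows the lookup being made with concrete subclasses such as `SubMapRecordReader`. The sketch below reproduces that behaviour with made-up classes (`Reader`, `SubReader` and `Context` are illustrative stand-ins, not Hadoop types). It is not the HBASE-5663 patch, and it does not claim this is the only cause of the failure.

```java
import java.lang.reflect.Constructor;

public class ConstructorLookupDemo {
    public static class Reader {}
    public static class SubReader extends Reader {}

    public static class Context {
        final Reader reader;
        public Context(Reader reader) { this.reader = reader; }
    }

    /** Returns true if Context declares a public constructor taking exactly paramType. */
    static boolean hasConstructor(Class<?> paramType) {
        try {
            Context.class.getConstructor(paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Lookup by the runtime (sub)class fails: reflection does no widening.
        System.out.println(hasConstructor(SubReader.class));

        // Lookup by the declared parameter type succeeds, and a subclass
        // instance can still be passed when invoking the constructor.
        Constructor<Context> c = Context.class.getConstructor(Reader.class);
        Context ctx = c.newInstance(new SubReader());
        System.out.println(ctx.reader.getClass().getSimpleName());
    }
}
```

The usual remedy for this class of error is to look the constructor up by the declared (super)types and pass the concrete instances only at `newInstance` time.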
[jira] [Updated] (HBASE-5636) TestTableMapReduce doesn't work properly.
[ https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5636: -- Fix Version/s: 0.96.0 0.94.0 TestTableMapReduce doesn't work properly. - Key: HBASE-5636 URL: https://issues.apache.org/jira/browse/HBASE-5636 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.1, 0.94.0 Reporter: Takuya Ueshin Assignee: Takuya Ueshin Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5636-v2.patch, HBASE-5636.patch No map function is called because there are no test data put before test starts. The following three tests are in the same situation: - org.apache.hadoop.hbase.mapred.TestTableMapReduce - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244199#comment-13244199 ] Zhihong Yu commented on HBASE-5694: --- @Wouter: Can you attach a patch which can be applied to trunk ? Thanks getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 Attachments: HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
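To illustrate the scoping issue described in HBASE-5694 above, here is a simplified, self-contained sketch. `ScanSpec` and both `build...` methods are hypothetical stand-ins and not the actual Thrift server code; the point is only that the time range has to be applied outside the `if` that handles explicitly listed columns.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeRangeScopeDemo {
    /** Hypothetical stand-in for a scan/get request. */
    static class ScanSpec {
        final List<String> columns = new ArrayList<>();
        long maxTimestamp = Long.MAX_VALUE; // "no upper bound" by default
    }

    // Buggy shape: the timestamp is applied only when columns are given.
    static ScanSpec buildBuggy(List<String> columns, long timestamp) {
        ScanSpec spec = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            spec.columns.addAll(columns);
            spec.maxTimestamp = timestamp; // wrong scope
        }
        return spec;
    }

    // Fixed shape: the timestamp is applied whether or not columns are given.
    static ScanSpec buildFixed(List<String> columns, long timestamp) {
        ScanSpec spec = new ScanSpec();
        if (columns != null && !columns.isEmpty()) {
            spec.columns.addAll(columns);
        }
        spec.maxTimestamp = timestamp;
        return spec;
    }

    public static void main(String[] args) {
        // With no columns, the buggy version silently ignores the timestamp.
        System.out.println(buildBuggy(null, 1000L).maxTimestamp == Long.MAX_VALUE); // true
        System.out.println(buildFixed(null, 1000L).maxTimestamp == 1000L);          // true
    }
}
```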
[jira] [Updated] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5689: -- Attachment: (was: HBASE-5689.patch) Skipping RecoveredEdits may cause data loss --- Key: HBASE-5689 URL: https://issues.apache.org/jira/browse/HBASE-5689 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.94.0 Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch Let's see the following scenario: 1.Region is on the server A 2.put KV(r1-v1) to the region 3.move region from server A to server B 4.put KV(r2-v2) to the region 5.move region from server B to server A 6.put KV(r3-v3) to the region 7.kill -9 server B and start it 8.kill -9 server A and start it 9.scan the region, we could only get two KV(r1-v1,r2-v2), the third KV(r3-v3) is lost. Let's analyse the upper scenario from the code: 1.the edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same hlog file on server A. 2.when we split server B's hlog file in the process of ServerShutdownHandler, we create one RecoveredEdits file f1 for the region. 2.when we split server A's hlog file in the process of ServerShutdownHandler, we create another RecoveredEdits file f2 for the region. 
3.however, RecoveredEdits file f2 will be skipped when initializing region HRegion#replayRecoveredEditsIfAny
{code}
for (Path edits: files) {
  if (edits == null || !this.fs.exists(edits)) {
    LOG.warn("Null or non-existent edits file: " + edits);
    continue;
  }
  if (isZeroLengthThenDelete(this.fs, edits)) continue;
  if (checkSafeToSkip) {
    Path higher = files.higher(edits);
    long maxSeqId = Long.MAX_VALUE;
    if (higher != null) {
      // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
      String fileName = higher.getName();
      maxSeqId = Math.abs(Long.parseLong(fileName));
    }
    if (maxSeqId <= minSeqId) {
      String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
      LOG.debug(msg);
      continue;
    } else {
      checkSafeToSkip = false;
    }
  }
{code}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
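The skip decision in the HBASE-5689 description above can be replayed on the reported scenario with a small stand-alone model. The sequence ids used here (files named 1 and 3, a flushed sequence id of 4) are invented for illustration, file names are reduced to plain `long` values, and the model assumes each recovered-edits file is named after the first sequence id it contains. It is a sketch of the check, not HBase code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class RecoveredEditsSkipDemo {
    /**
     * Mirrors the checkSafeToSkip logic: a file is skipped when the name of the
     * next-higher file (taken as an upper bound on this file's sequence ids)
     * is <= the region's already-flushed sequence id.
     */
    static List<Long> skippedFiles(NavigableSet<Long> fileNames, long minSeqId) {
        List<Long> skipped = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Long edits : fileNames) {
            if (!checkSafeToSkip) break; // remaining files are replayed, not skipped
            Long higher = fileNames.higher(edits);
            long maxSeqId = (higher == null) ? Long.MAX_VALUE : Math.abs(higher);
            if (maxSeqId <= minSeqId) {
                skipped.add(edits);
            } else {
                checkSafeToSkip = false;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // f2 (from server A) holds edits with ids 1 and 5, so it is named 1.
        // f1 (from server B) holds the edit with id 3, so it is named 3.
        NavigableSet<Long> files = new TreeSet<>(Arrays.asList(1L, 3L));
        long flushedSeqId = 4L; // assumed: edits up to id 4 are already in store files

        // File 1 is skipped because 3 <= 4, although it also contains id 5.
        System.out.println(skippedFiles(files, flushedSeqId)); // [1]
    }
}
```

The model shows why the upper bound is unsound once a region has moved between servers: the next file's name only bounds the current file's ids when both files were split from the same log.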
[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244219#comment-13244219 ] jirapos...@reviews.apache.org commented on HBASE-5625: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/ --- Review request for hbase. Summary --- When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. This addresses bug HBASE-5625. https://issues.apache.org/jira/browse/HBASE-5625 Diffs - src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 Diff: https://reviews.apache.org/r/4607/diff Testing --- Added value check to TestResult#testBasic and TestResult.testMultiVersion. Thanks, Tudor Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. 
Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
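A rough sketch of the allocation-avoidance idea from HBASE-5625, using only `java.nio.ByteBuffer`. The method names `copyValue` and `loadValue` are hypothetical and do not claim to match the methods added by the patch; the backing-array layout (offset plus length) is assumed for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ValueReadDemo {
    /** Allocating style: every call creates a new array for the value. */
    static byte[] copyValue(byte[] backing, int offset, int length) {
        byte[] out = new byte[length];
        System.arraycopy(backing, offset, out, 0, length);
        return out;
    }

    /**
     * Reusing style: the caller supplies the destination buffer, so repeated
     * reads allocate nothing. Returns false if the value does not fit.
     */
    static boolean loadValue(byte[] backing, int offset, int length, ByteBuffer dst) {
        dst.clear();
        if (length > dst.remaining()) {
            return false;
        }
        dst.put(backing, offset, length);
        dst.flip(); // make the written bytes readable
        return true;
    }

    public static void main(String[] args) {
        byte[] backing = "row1:value-A|row2:value-B".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer reusable = ByteBuffer.allocate(64); // allocated once, reused per read

        loadValue(backing, 5, 7, reusable);
        System.out.println(StandardCharsets.US_ASCII.decode(reusable)); // value-A
        loadValue(backing, 18, 7, reusable);
        System.out.println(StandardCharsets.US_ASCII.decode(reusable)); // value-B
    }
}
```

The trade-off is that the caller now owns the buffer's lifetime and sizing, which is why a patch of this kind would typically keep the allocating method alongside the new one.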
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v5.txt @Zhihong: Thanks for the review request. I actually had to make my own in order to upload the diff: https://reviews.apache.org/r/4607/ The performance actually depends on the system capabilities. It's hard to write a microbenchmark test for an issue that manifests itself on large I/O intensive jobs that put a lot of gc pressure. I implemented a few of Cosmin's suggestions. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244233#comment-13244233 ] Zhihong Yu commented on HBASE-5689: --- @Chunhui: Hadoop QA isn't picking up any patches from this JIRA. Please run through test suite and let us know the result. Skipping RecoveredEdits may cause data loss --- Key: HBASE-5689 URL: https://issues.apache.org/jira/browse/HBASE-5689 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.94.0 Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch Let's see the following scenario: 1.Region is on the server A 2.put KV(r1-v1) to the region 3.move region from server A to server B 4.put KV(r2-v2) to the region 5.move region from server B to server A 6.put KV(r3-v3) to the region 7.kill -9 server B and start it 8.kill -9 server A and start it 9.scan the region, we could only get two KV(r1-v1,r2-v2), the third KV(r3-v3) is lost. Let's analyse the upper scenario from the code: 1.the edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same hlog file on server A. 2.when we split server B's hlog file in the process of ServerShutdownHandler, we create one RecoveredEdits file f1 for the region. 2.when we split server A's hlog file in the process of ServerShutdownHandler, we create another RecoveredEdits file f2 for the region. 
3.however, RecoveredEdits file f2 will be skipped when initializing region HRegion#replayRecoveredEditsIfAny
{code}
for (Path edits: files) {
  if (edits == null || !this.fs.exists(edits)) {
    LOG.warn("Null or non-existent edits file: " + edits);
    continue;
  }
  if (isZeroLengthThenDelete(this.fs, edits)) continue;
  if (checkSafeToSkip) {
    Path higher = files.higher(edits);
    long maxSeqId = Long.MAX_VALUE;
    if (higher != null) {
      // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
      String fileName = higher.getName();
      maxSeqId = Math.abs(Long.parseLong(fileName));
    }
    if (maxSeqId <= minSeqId) {
      String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
      LOG.debug(msg);
      continue;
    } else {
      checkSafeToSkip = false;
    }
  }
{code}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5663) MultithreadedTableMapper doesn't work.
[ https://issues.apache.org/jira/browse/HBASE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244245#comment-13244245 ]

Hudson commented on HBASE-5663:
---
Integrated in HBase-0.94 #75 (See [https://builds.apache.org/job/HBase-0.94/75/])
HBASE-5663 HBASE-5636 MultithreadedTableMapper doesn't work (Takuya Ueshin) (Revision 1308354)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMulitthreadedTableMapper.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultithreadedTableMapper.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java

MultithreadedTableMapper doesn't work.
--
Key: HBASE-5663
URL: https://issues.apache.org/jira/browse/HBASE-5663
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 0.94.0
Reporter: Takuya Ueshin
Assignee: Takuya Ueshin
Fix For: 0.94.0, 0.96.0
Attachments: 5663+5636.txt, HBASE-5663.patch

MapReduce job using MultithreadedTableMapper goes down throwing the following Exception:
{noformat}
java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:260)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper.run(MultithreadedTableMapper.java:133)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.<init>(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit)
    at java.lang.Class.getConstructor0(Class.java:2706)
    at java.lang.Class.getConstructor(Class.java:1657)
    at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.<init>(MultithreadedTableMapper.java:241)
    ... 8 more
{noformat}
This occurred when the tasks were creating MapRunner threads.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5636) TestTableMapReduce doesn't work properly.
[ https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244244#comment-13244244 ] Hudson commented on HBASE-5636: --- Integrated in HBase-0.94 #75 (See [https://builds.apache.org/job/HBase-0.94/75/]) HBASE-5663 HBASE-5636 MultithreadedTableMapper doesn't work (Takuya Ueshin) (Revision 1308354) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/MultithreadedTableMapper.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMulitthreadedTableMapper.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultithreadedTableMapper.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java TestTableMapReduce doesn't work properly. - Key: HBASE-5636 URL: https://issues.apache.org/jira/browse/HBASE-5636 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.1, 0.94.0 Reporter: Takuya Ueshin Assignee: Takuya Ueshin Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5636-v2.patch, HBASE-5636.patch No map function is called because there are no test data put before test starts. The following three tests are in the same situation: - org.apache.hadoop.hbase.mapred.TestTableMapReduce - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244250#comment-13244250 ]

Jonathan Hsieh commented on HBASE-5680:
---
I feel like to resolve this we should give the user some sort of warning about needing to recompile against hadoop23 (or vice versa, if a version compiled against hadoop23 attempts to run against a hadoop 1.0.0/0.20.x based hdfs). Thoughts?

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
--
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

Hmaster is not able to start because of the following error. Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
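One possible shape for the warning suggested in the HBASE-5680 comment above is to probe at startup for the class that moved between Hadoop lines and log a clear message instead of failing with `NoClassDefFoundError`. The sketch below shows only the probing technique. The helper name `firstAvailable` is hypothetical, and the candidate class names in `main` are stand-ins chosen so the example runs without Hadoop on the classpath; a real check would list the old and new homes of `SafeModeAction`, which the thread does not spell out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ClassProbeDemo {
    /** Returns the first class name in the list that can be loaded, if any. */
    static Optional<String> firstAvailable(List<String> candidates) {
        for (String name : candidates) {
            try {
                // initialize=false: only check presence, do not run static initializers
                Class.forName(name, false, ClassProbeDemo.class.getClassLoader());
                return Optional.of(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // not present on this classpath; try the next candidate
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
            "org.example.DoesNotExist$SafeModeAction", // stand-in for the missing class
            "java.util.concurrent.TimeUnit");          // stand-in for a class that exists
        Optional<String> found = firstAvailable(candidates);
        if (found.isPresent()) {
            System.out.println("Using " + found.get());
        } else {
            System.err.println("WARNING: no compatible class found; this build may "
                + "need to be recompiled against the installed Hadoop version");
        }
    }
}
```

Probing by name avoids a compile-time dependency on either class, at the cost of moving the failure from link time to an explicit runtime check.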
[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244253#comment-13244253 ] Hadoop QA commented on HBASE-5625: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520978/5625v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapreduce.TestTableMapReduce Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1369//console This message is automatically generated. 
Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244255#comment-13244255 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java, line 548 bq. https://reviews.apache.org/r/4096/diff/3/?file=97740#file97740line548 bq. bq. We have an issue for removing this Invocation stuff? bq. bq. Devaraj Das wrote: bq. No not yet. But I'll create one to do with this issue once this patch is committed. Thanks bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 25 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line25 bq. bq. Should we just remove them in the next iteration on rpc since 0.96 is to be a singularity? Why even bother trying to keep compatibility w/ older clients? bq. bq. What is 'failure compatibility'? We are telling the client to go away, nicely (smile). bq. bq. What you think we should replace hrpc0x0005 with? bq. bq. this - these bq. bq. Devaraj Das wrote: bq. Yeah, valid points. We can remove this version string and all in a follow up patch. Lets discuss in another jira. A hrpc version followed by something that says its protobuf that follows, etc., bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 28 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line28 bq. bq. How does RpcRequestWithHeaderProto relate to ConnectionHeaderProto? This text should say? bq. bq. Would be nice to have illustration on how the back and forth work. bq. bq. Devaraj Das wrote: bq. The latter is used only while establishing connections and the former for exchanging RPC requests/responses over a channel that is connected. 
Okay, will add some text. Thanks

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 55
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line55
bq.
bq. We'll send this String each time?
bq.
bq. Devaraj Das wrote:
bq. Actually, I could make this field 'optional' since this has a default value. Will do so.

That'd be a good idea I think. The other protocols are less used and can include the String

bq. On 2012-04-02 00:21:20, Michael Stack wrote:
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 66
bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66
bq.
bq. Which part in here is the 'header'? How does it relate to ConnectionHeaderProto?
bq.
bq. request can be an Invocation/Writable? Or a protobuf? Do we need a length in here?
bq.
bq. Devaraj Das wrote:
bq. Today the only 'header' is the callId. There is no relation to ConnectionHeaderProto. If the 'header' is confusing, I can take it off the object name. Let me know.
bq.
bq. 'request' in this patch is only an Invocation/Writable. In theory, it could be a protobuf object as well (since it is just bytes), but, for protobuf, we could make things more explicit by defining a protobuf object rather than an opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation similar to Hadoop). Length is not needed - the protobuf serialization/deserialization will take care of it.

I think taking the 'Header' off Request/Response would be best (Did I ask you to add it previously? If so, sorry... I misunderstood. Thanks for being accommodating). Yes, on a new issue to make it pb rather than opaque bytes. Do you have to do something here -- make bytes optional? -- to allow for the later pb replacement? On length, that's probably good to keep.
For us, we'll give the stream to a pb deserializer but other clients might want to know how many bytes on the line so keep it I'd say. bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. Yeah, take away
[jira] [Resolved] (HBASE-5695) Use Hadoop's DataOutputOutputStream instead of have a copy local
[ https://issues.apache.org/jira/browse/HBASE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5695. -- Resolution: Duplicate hbase-5696 (Thanks Ram) Use Hadoop's DataOutputOutputStream instead of have a copy local Key: HBASE-5695 URL: https://issues.apache.org/jira/browse/HBASE-5695 Project: HBase Issue Type: Improvement Reporter: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5435) TestForceCacheImportantBlocks fails with OutOfMemoryError
[ https://issues.apache.org/jira/browse/HBASE-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244276#comment-13244276 ] Zhihong Yu commented on HBASE-5435: --- The test error happened in 0.94 build #75 as well. https://builds.apache.org/job/HBase-0.94/75/testReport/junit/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/ TestForceCacheImportantBlocks fails with OutOfMemoryError - Key: HBASE-5435 URL: https://issues.apache.org/jira/browse/HBASE-5435 Project: HBase Issue Type: Test Reporter: Zhihong Yu Fix For: 0.96.0 Here is related stack trace (see https://builds.apache.org/job/HBase-TRUNK/2665/testReport/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_1_/): {code} Caused by: java.lang.OutOfMemoryError at java.util.zip.Deflater.init(Native Method) at java.util.zip.Deflater.init(Deflater.java:124) at java.util.zip.GZIPOutputStream.init(GZIPOutputStream.java:46) at java.util.zip.GZIPOutputStream.init(GZIPOutputStream.java:58) at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream$ResetableGZIPOutputStream.init(ReusableStreamGzipCodec.java:79) at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec$ReusableGzipOutputStream.init(ReusableStreamGzipCodec.java:90) at org.apache.hadoop.hbase.io.hfile.ReusableStreamGzipCodec.createOutputStream(ReusableStreamGzipCodec.java:130) at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:101) at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createPlainCompressionStream(Compression.java:239) at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createCompressionStream(Compression.java:223) at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.getCompressingStream(HFileWriterV1.java:270) at org.apache.hadoop.hbase.io.hfile.HFileWriterV1.close(HFileWriterV1.java:416) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1115) at 
org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:706) at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:633) at org.apache.hadoop.hbase.regionserver.Store.access$400(Store.java:106) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244283#comment-13244283 ] stack commented on HBASE-5680: -- Yes. Unless someone has a bit of reflection jujitsu they can apply here. It'd be a PITA shipping four tgzs. Two is already too many. Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1 -- Key: HBASE-5680 URL: https://issues.apache.org/jira/browse/HBASE-5680 Project: HBase Issue Type: Bug Components: master Reporter: Kristam Subba Swathi Hmaster is not able to start because of the following error Please find the following error 2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524) at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324) at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127) at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:112) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) ... 7 more There is a change in the FSConstants -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
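The "reflection jujitsu" stack mentions could look roughly like the sketch below: try a list of candidate class names and use whichever one the classpath provides. This is only an illustration. `ClassLocator`, `firstAvailable` and `enumConstant` are hypothetical names, not HBase or Hadoop API. In the real case the candidates would be the class from the stack trace (org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction) plus whatever name the newer Hadoop uses, which this thread does not state.

```java
import java.util.List;

// Hypothetical helper, not part of HBase or Hadoop.
final class ClassLocator {
    private ClassLocator() {}

    /** Returns the first candidate class that can be loaded; fails if none can. */
    static Class<?> firstAvailable(List<String> candidates) {
        ClassNotFoundException last = null;
        for (String name : candidates) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                last = e; // remember the failure and try the next name
            }
        }
        throw new IllegalStateException(
            "none of the candidate classes could be loaded: " + candidates, last);
    }

    /** Looks up an enum constant by name on a reflectively loaded enum class. */
    static Object enumConstant(Class<?> enumClass, String constantName) {
        Object[] constants = enumClass.getEnumConstants();
        if (constants == null) {
            throw new IllegalArgumentException(enumClass.getName() + " is not an enum");
        }
        for (Object constant : constants) {
            if (((Enum<?>) constant).name().equals(constantName)) {
                return constant;
            }
        }
        throw new IllegalArgumentException(
            "no constant " + constantName + " in " + enumClass.getName());
    }
}
```

With something like this, a single build could resolve the enum at runtime instead of linking against one Hadoop version, at the cost of losing compile-time checking for those calls.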
[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244324#comment-13244324 ] jirapos...@reviews.apache.org commented on HBASE-5625: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4607/#review6622 --- I ran TestTableMapReduce and TestMultithreadedTableMapper with patch v5. They passed. Some minor comments below. src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/4607/#comment14303 Please include vlength in the exception message src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/4607/#comment14304 Should read 'BufferOverflowException if there' src/main/java/org/apache/hadoop/hbase/KeyValue.java https://reviews.apache.org/r/4607/#comment14305 Add a space between comma and fl. src/main/java/org/apache/hadoop/hbase/client/Result.java https://reviews.apache.org/r/4607/#comment14306 Is this comment needed ? src/main/java/org/apache/hadoop/hbase/client/Result.java https://reviews.apache.org/r/4607/#comment14307 This line can be removed. src/test/java/org/apache/hadoop/hbase/client/TestResult.java https://reviews.apache.org/r/4607/#comment14308 white space. src/test/java/org/apache/hadoop/hbase/client/TestResult.java https://reviews.apache.org/r/4607/#comment14309 Since benchmarking is hard to do, this test case can be dropped. - Ted On 2012-04-02 14:22:48, Tudor Scurtu wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4607/ bq. --- bq. bq. (Updated 2012-04-02 14:22:48) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. bq. bq. 
These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. bq. bq. bq. This addresses bug HBASE-5625. bq. https://issues.apache.org/jira/browse/HBASE-5625 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f bq.src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef bq.src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 bq. bq. Diff: https://reviews.apache.org/r/4607/diff bq. bq. bq. Testing bq. --- bq. bq. Added value check to TestResult#testBasic and TestResult.testMultiVersion. bq. bq. bq. Thanks, bq. bq. Tudor bq. bq. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
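As a rough illustration of the idea in the summary (copying a value into a caller-supplied ByteBuffer instead of allocating a new array per read), here is a minimal sketch. `ValueReader` and `loadValue` are hypothetical names and do not reproduce the patch; only the `java.nio` calls are real API.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for the read path; not the code from the patch.
final class ValueReader {
    private ValueReader() {}

    /**
     * Copies backing[offset, offset + length) into dst at its current position.
     * Throws BufferOverflowException if dst has fewer than length bytes remaining.
     */
    static void loadValue(byte[] backing, int offset, int length, ByteBuffer dst) {
        if (dst.remaining() < length) {
            throw new BufferOverflowException();
        }
        dst.put(backing, offset, length); // no intermediate array is allocated
    }
}
```

A caller would allocate one buffer up front, then call `clear()`, `loadValue(...)` and `flip()` for each value it reads, so the per-read allocation disappears.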
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5625: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Fix For: 0.96.0 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244343#comment-13244343 ] stack commented on HBASE-5672: -- bq. I think Thread.isAlive returns true once we have called Thread.start(), even if Thread.run() has not been executed yet at that time. That may be so (I've not looked at source). Do you want to have a flag in the Thread that gets set when you enter the run method and check that too? The above would still be better than a timed wait. TestLruBlockCache#testBackgroundEvictionThread fails occasionally - Key: HBASE-5672 URL: https://issues.apache.org/jira/browse/HBASE-5672 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-5672.patch We find that TestLruBlockCache#testBackgroundEvictionThread fails occasionally. I think it's a problem with the test case, because runEviction() only does evictionThread.evict(): {code} public void evict() { synchronized(this) { this.notify(); // FindBugs NN_NAKED_NOTIFY } } {code} However, when we call evictionThread.evict() in TestLruBlockCache#testBackgroundEvictionThread, the evictionThread may not yet have entered run(). If we run the test many times, we can reproduce the failure easily.
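The flag suggested in the comment could be sketched as follows. This is a simplified stand-in, not the LruBlockCache code: `EvictionWorker`, `hasEnteredRun` and `evictionCount` are hypothetical names. The flag is set while the worker already holds its own monitor, so an evict() issued after the flag is observed cannot run notify() before the worker is waiting.

```java
// Simplified stand-in for an eviction thread; names are hypothetical.
final class EvictionWorker extends Thread {
    private volatile boolean enteredRun = false;
    private boolean go = true;   // guarded by this
    private int evictions = 0;   // guarded by this

    EvictionWorker() {
        setDaemon(true);
    }

    @Override
    public void run() {
        synchronized (this) {
            // Set under the monitor: evict() cannot get in until wait() releases it.
            enteredRun = true;
            while (go) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (go) {
                    evictions++; // stands in for the real eviction work
                }
            }
        }
    }

    /** True once run() has started and the worker is at, or about to reach, wait(). */
    boolean hasEnteredRun() {
        return enteredRun;
    }

    /** Same naked notify as in the issue description. */
    void evict() {
        synchronized (this) {
            notify();
        }
    }

    synchronized int evictionCount() {
        return evictions;
    }

    void shutdown() {
        synchronized (this) {
            go = false;
            notifyAll();
        }
    }
}
```

A test would start the worker, spin until `hasEnteredRun()` returns true, and only then call `evict()`, which avoids both the lost notification and a timed wait. Like the original, this sketch does not guard against spurious wakeups.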
[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wouter Bolsterlee updated HBASE-5694: - Status: Open (was: Patch Available) getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 Attachments: HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wouter Bolsterlee updated HBASE-5694: - Attachment: HBASE-5694-trunk-20120402.patch Patch against SVN trunk as of today. It's a one-liner that moves the setTimeRange() call outside the if (columns != null) block. getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
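The scoping issue described here can be shown with a small self-contained sketch. `ScanSpec` is a hypothetical stand-in for the scan object used by the Thrift handler, and the lower bound of 0 passed to `setTimeRange` is an assumption made for illustration. Only the placement of the call relative to the `if (columns != null)` block reflects the bug report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scan object; not the HBase Scan class.
final class ScanSpec {
    final List<String> columns = new ArrayList<String>();
    long minStamp = 0L;
    long maxStamp = Long.MAX_VALUE;

    void addColumn(String column) {
        columns.add(column);
    }

    void setTimeRange(long min, long max) {
        minStamp = min;
        maxStamp = max;
    }
}

final class TimeRangeScoping {
    private TimeRangeScoping() {}

    /** Buggy shape: the time range is only applied when columns are given. */
    static ScanSpec buggy(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null) {
            for (String column : columns) {
                scan.addColumn(column);
            }
            scan.setTimeRange(0L, timestamp); // skipped when columns == null
        }
        return scan;
    }

    /** Fixed shape: the call is moved outside the if block. */
    static ScanSpec fixed(List<String> columns, long timestamp) {
        ScanSpec scan = new ScanSpec();
        if (columns != null) {
            for (String column : columns) {
                scan.addColumn(column);
            }
        }
        scan.setTimeRange(0L, timestamp); // applied with or without columns
        return scan;
    }
}
```

With `columns == null`, `buggy` leaves the default unbounded range in place while `fixed` applies the requested timestamp, which is the behaviour callers such as getRows() rely on.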
[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wouter Bolsterlee updated HBASE-5694: - Summary: getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly (was: getRowsWithColumnsTs function Thrift service incorrectly handles time range) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244359#comment-13244359 ] Lars Hofhansl commented on HBASE-5682: -- Presumably close is not needed since the connection is known to be down in this case. To be safe, I'll add that, and make sure it doesn't cause another hang. I think this is better than HBASE-5153, because it attempts to reconnect when the connection is needed and not when it was lost (in which case it is likely that the next retry will fail as well, leading to long hangs with no chance for the caller to notice). Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
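The "reconnect when the connection is needed, not when it was lost" approach can be sketched generically as below. This is not the patch: `LazyReconnectingHolder` is a hypothetical class, and the `Supplier` stands in for whatever actually opens the ZooKeeper connection.

```java
import java.util.function.Supplier;

// Hypothetical holder illustrating on-demand reconnection; not HBase code.
final class LazyReconnectingHolder<T> {
    private final Supplier<T> connector;
    private T current; // guarded by this; null means "not connected"

    LazyReconnectingHolder(Supplier<T> connector) {
        this.connector = connector;
    }

    /** Connects on first use, or again after the previous connection was invalidated. */
    synchronized T get() {
        if (current == null) {
            // If this throws, the caller sees the failure at once and the
            // next call simply tries again; nothing retries in the background.
            current = connector.get();
        }
        return current;
    }

    /** Called when a connection is found to be dead; does not reconnect by itself. */
    synchronized void invalidate(T failed) {
        if (current == failed) {
            current = null;
        }
    }
}
```

Because `invalidate` only drops the reference, a loss event never triggers a retry loop. The reconnect cost is paid by the next caller that actually needs the connection, and a failure there surfaces to that caller directly.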
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244362#comment-13244362 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). The argument above for 'length' applies here too... bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 66 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66 bq. bq. Which part in here is the 'header'? How does it relate to ConnectionHeaderProto? bq. bq. request can be an Invocation/Writable? Or a protobuf? Do we need a length in here? bq. bq. Devaraj Das wrote: bq. Today the only 'header' is the callId.. There is no relation to ConnectionHeaderProto. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. 'request' in this patch is only a Invocation/Writable. In theory, it could be a protobuf object as well (since it is just bytes), but, for protobuf, we could make things more explicit by defining a protobuf object rather than a opaque set of bytes. 
But that's another jira (ProtoBufRpcEngine implementation similar to Hadoop). Length is not needed - the protobuf serialization/deserialization will take care of it.. bq. bq. Michael Stack wrote: bq. I think taking the 'Header' off Request/Response would be best (Did I ask you add it previous? If so, sorry... I misunderstood. Thanks for being accomodating). Yes, on a new issue to make it pb rather than opaque bytes. Do you have to do something here -- make bytes optional? -- to allow for the later pb replacement? bq. bq. On length, thats probably good to keep. For us, we'll give the stream to a pb deserializer but other clients might want to know how many bytes on the line so keep it I'd say. Yes, I'll take off the 'header' from the message name. I could make the 'bytes' field optional. Actually, on the length, I am not sure I understand why we need it in the PB model. Generally speaking, clients talking to servers have to be aware of the PB encoding in order for them to make any sense of the PB data.. The PB type 'bytes' has the length taken care of in the implementation of serialization/deserialization internally. In that sense, I don't think having an explicit length field is required. Does this reasoning make sense? (Also note that the top level RPC request envelope has the length preceding the request data) - Devaraj --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq.
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244363#comment-13244363 ] Lars Hofhansl commented on HBASE-5682: -- Oh, and thanks for taking a look Jieshan :) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244365#comment-13244365 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 66 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66 bq. bq. Which part in here is the 'header'? How does it relate to ConnectionHeaderProto? bq. bq. request can be an Invocation/Writable? Or a protobuf? Do we need a length in here? bq. bq. Devaraj Das wrote: bq. Today the only 'header' is the callId.. There is no relation to ConnectionHeaderProto. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. 'request' in this patch is only a Invocation/Writable. In theory, it could be a protobuf object as well (since it is just bytes), but, for protobuf, we could make things more explicit by defining a protobuf object rather than a opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation similar to Hadoop). Length is not needed - the protobuf serialization/deserialization will take care of it.. bq. bq. Michael Stack wrote: bq. I think taking the 'Header' off Request/Response would be best (Did I ask you add it previous? If so, sorry... I misunderstood. Thanks for being accomodating). Yes, on a new issue to make it pb rather than opaque bytes. Do you have to do something here -- make bytes optional? -- to allow for the later pb replacement? bq. bq. On length, thats probably good to keep. For us, we'll give the stream to a pb deserializer but other clients might want to know how many bytes on the line so keep it I'd say. bq. bq. Devaraj Das wrote: bq. Yes, I'll take off the 'header' from the message name. I could make the 'bytes' field optional. bq. bq. Actually, on the length, I am not sure I understand why we need it in the PB model. 
Generally speaking, clients talking to servers have to be aware of the PB encoding in order for them to make any sense of the PB data.. The PB type 'bytes' has the length taken care of in the implementation of serialization/deserialization internally. In that sense, I don't think having an explicit length field is required. Does this reasoning make sense? bq. bq. (Also note that the top level RPC request envelope has the length preceding the request data) If the top level rpc request envelope has the length, then I agree w/ you, its not needed as prefix on pb messages. bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. 
Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244367#comment-13244367 ] stack commented on HBASE-5682: -- @Nkeywal How does this relate to your TRUNK work (if at all)? Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244368#comment-13244368 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 66 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line66 bq. bq. Which part in here is the 'header'? How does it relate to ConnectionHeaderProto? bq. bq. request can be an Invocation/Writable? Or a protobuf? Do we need a length in here? bq. bq. Devaraj Das wrote: bq. Today the only 'header' is the callId.. There is no relation to ConnectionHeaderProto. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. 'request' in this patch is only a Invocation/Writable. In theory, it could be a protobuf object as well (since it is just bytes), but, for protobuf, we could make things more explicit by defining a protobuf object rather than a opaque set of bytes. But that's another jira (ProtoBufRpcEngine implementation similar to Hadoop). Length is not needed - the protobuf serialization/deserialization will take care of it.. bq. bq. Michael Stack wrote: bq. I think taking the 'Header' off Request/Response would be best (Did I ask you add it previous? If so, sorry... I misunderstood. Thanks for being accomodating). Yes, on a new issue to make it pb rather than opaque bytes. Do you have to do something here -- make bytes optional? -- to allow for the later pb replacement? bq. bq. On length, thats probably good to keep. For us, we'll give the stream to a pb deserializer but other clients might want to know how many bytes on the line so keep it I'd say. bq. bq. Devaraj Das wrote: bq. Yes, I'll take off the 'header' from the message name. I could make the 'bytes' field optional. bq. bq. Actually, on the length, I am not sure I understand why we need it in the PB model. 
Generally speaking, clients talking to servers have to be aware of the PB encoding in order for them to make any sense of the PB data.. The PB type 'bytes' has the length taken care of in the implementation of serialization/deserialization internally. In that sense, I don't think having an explicit length field is required. Does this reasoning make sense? bq. bq. (Also note that the top level RPC request envelope has the length preceding the request data) bq. bq. Michael Stack wrote: bq. If the top level rpc request envelope has the length, then I agree w/ you, its not needed as prefix on pb messages. cool bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... bq. bq. Michael Stack wrote: bq. Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. I meant the argument on the PB encoding.. The RPC response envelope, even today, doesn't include the length. For instance, the client side of the method HBaseClient.receiveResponse starts with reading the callId. - Devaraj --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq.
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244377#comment-13244377 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... bq. bq. Michael Stack wrote: bq. Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. bq. bq. Devaraj Das wrote: bq. I meant the argument on the PB encoding.. bq. bq. The RPC response envelope, even today, doesn't include the length. For instance, the client side of the method HBaseClient.receiveResponse starts with reading the callId. Ok. We are replicating what was there previous. Lets make new jira for doing things like a length prefix. - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. 
(Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4096/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, rpc-proto.r5.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244378#comment-13244378 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... bq. bq. Michael Stack wrote: bq. Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. bq. bq. Devaraj Das wrote: bq. I meant the argument on the PB encoding.. bq. bq. The RPC response envelope, even today, doesn't include the length. For instance, the client side of the method HBaseClient.receiveResponse starts with reading the callId. bq. bq. Michael Stack wrote: bq. Ok. We are replicating what was there previous. Lets make new jira for doing things like a length prefix. Okay let's discuss that in a separate jira.. Otherwise, do you think the patch is good to go? If so, I'll submit a new patch with some of the comments incorporated. - Devaraj --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4096/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, rpc-proto.r5.txt -- This message is automatically generated by JIRA. 
[jira] [Resolved] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5694. -- Resolution: Fixed Fix Version/s: (was: 0.92.2) 0.94.0 Hadoop Flags: Reviewed Applied to 0.94 and to trunk. Thanks for the patch Wouter (I ran testthriftserver local and it passed). Doesn't look like thrift2 has similar code so passed on trying to apply the patch there. getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.94.0 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
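The scoping problem described in the issue can be illustrated with a small stand-alone model. This is only a sketch: `ReadRequest`, `buildBuggy` and `buildFixed` are hypothetical names invented for the illustration, not the Thrift handler or the HBase client API, and the real patch may be shaped differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a read request; this is NOT the HBase Get class. */
final class ReadRequest {
    final List<String> columns = new ArrayList<>();
    long maxTimestamp = Long.MAX_VALUE; // default: no upper bound on the timestamp
}

final class TimestampScopeDemo {
    /** Buggy shape: the time bound is applied only when columns are specified. */
    static ReadRequest buildBuggy(List<String> columns, long timestamp) {
        ReadRequest req = new ReadRequest();
        if (columns != null) {
            req.columns.addAll(columns);
            req.maxTimestamp = timestamp; // wrong scope: never runs when columns == null
        }
        return req;
    }

    /** Fixed shape: the time bound is applied regardless of the column list. */
    static ReadRequest buildFixed(List<String> columns, long timestamp) {
        ReadRequest req = new ReadRequest();
        if (columns != null) {
            req.columns.addAll(columns);
        }
        req.maxTimestamp = timestamp;
        return req;
    }

    public static void main(String[] args) {
        // With no columns given, the buggy variant silently ignores the timestamp.
        System.out.println(buildBuggy(null, 42L).maxTimestamp == Long.MAX_VALUE);
        System.out.println(buildFixed(null, 42L).maxTimestamp);
    }
}
```

In this model the two variants agree whenever a column list is passed, which is presumably why a bug of this kind can go unnoticed until a caller leaves the columns unspecified.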
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244389#comment-13244389 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... bq. bq. Michael Stack wrote: bq. Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. bq. bq. Devaraj Das wrote: bq. I meant the argument on the PB encoding.. bq. bq. The RPC response envelope, even today, doesn't include the length. For instance, the client side of the method HBaseClient.receiveResponse starts with reading the callId. bq. bq. Michael Stack wrote: bq. Ok. We are replicating what was there previous. Lets make new jira for doing things like a length prefix. bq. bq. Devaraj Das wrote: bq. Okay let's discuss that in a separate jira.. bq. bq. Otherwise, do you think the patch is good to go? If so, I'll submit a new patch with some of the comments incorporated. 
There items above you said you'd address such as removing Header from the request and response and cleaning up doc in the .proto file, right? - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4096/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, rpc-proto.r5.txt -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244400#comment-13244400 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- bq. On 2012-04-02 00:21:20, Michael Stack wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto, line 93 bq. https://reviews.apache.org/r/4096/diff/3/?file=97744#file97744line93 bq. bq. Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? bq. bq. Devaraj Das wrote: bq. Length will be taken care of by the protobuf serialization/deserialization. The header is the combination of callId, error. If the 'header' is confusing, I can take it off the object name. Let me know. bq. bq. Michael Stack wrote: bq. Yeah, take away the header. Length I think is good. Makes it more robust (IIRC, we went out of our way to add length to the old RPC to help clients figure how much to pull). bq. bq. Devaraj Das wrote: bq. The argument above for 'length' applies here too... bq. bq. Michael Stack wrote: bq. Agreed. So high level, the response and request have a length of the total message? If so, don't need it down inside preceeding pb messages. bq. bq. Devaraj Das wrote: bq. I meant the argument on the PB encoding.. bq. bq. The RPC response envelope, even today, doesn't include the length. For instance, the client side of the method HBaseClient.receiveResponse starts with reading the callId. bq. bq. Michael Stack wrote: bq. Ok. We are replicating what was there previous. Lets make new jira for doing things like a length prefix. bq. bq. Devaraj Das wrote: bq. Okay let's discuss that in a separate jira.. bq. bq. Otherwise, do you think the patch is good to go? If so, I'll submit a new patch with some of the comments incorporated. bq. bq. Michael Stack wrote: bq. 
There items above you said you'd address such as removing Header from the request and response and cleaning up doc in the .proto file, right? Correct .. that's what I meant to include in the new patch. - Devaraj --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4096/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. 
Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt, rpc-proto.patch.1_2, rpc-proto.r5.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
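For readers following the length-prefix discussion in this review thread, here is a minimal sketch of what a length-prefixed frame buys the reader side: it knows exactly how many bytes to pull before decoding. The fixed 4-byte prefix and the class name `LengthPrefixedFrames` are assumptions made for illustration; this is not the wire format of the patch, and the thread defers the actual decision to a separate jira.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of length-prefixed framing; not HBase RPC code. */
final class LengthPrefixedFrames {
    /** Writes a 4-byte big-endian length followed by the payload bytes. */
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    /** Reads the length first, then exactly that many payload bytes. */
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // throws EOFException if the stream ends early
        return payload;
    }

    /** Encodes all messages into one buffer and decodes them back in order. */
    static String[] roundTrip(String... messages) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            for (String message : messages) {
                writeFrame(out, message.getBytes(StandardCharsets.UTF_8));
            }
            out.flush();
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            String[] decoded = new String[messages.length];
            for (int i = 0; i < decoded.length; i++) {
                decoded[i] = new String(readFrame(in), StandardCharsets.UTF_8);
            }
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String message : roundTrip("callId=7", "", "response body")) {
            System.out.println("[" + message + "]");
        }
    }
}
```

Because each frame carries its own length, an empty payload and back-to-back frames on the same stream are both unambiguous to the reader.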
[jira] [Created] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer
Should we use muti HLog or Writer in HLog in a HRegionServer Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244418#comment-13244418 ] stack commented on HBASE-5689: -- Good one Chunhui. I think the patch is good. Nice reproduction of the problem in a test. Where in the test do you find that we've lost the third edit? So we name the file for its first edit when we write it, and then, when we move it into place, we rename it for the last edit in the file? Add a comment to that effect, I'd say, else it could be confusing. Hmm... I suppose you have it here in the doc for getCompletedRecoveredEditsFilePath. That's probably good enough, but there's no harm in explaining why we go from naming the file for its first edit to naming it for the last edit.
Skipping RecoveredEdits may cause data loss --- Key: HBASE-5689 URL: https://issues.apache.org/jira/browse/HBASE-5689 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.94.0 Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch
Consider the following scenario:
1. The region is on server A.
2. Put KV(r1-v1) to the region.
3. Move the region from server A to server B.
4. Put KV(r2-v2) to the region.
5. Move the region from server B to server A.
6. Put KV(r3-v3) to the region.
7. kill -9 server B and start it.
8. kill -9 server A and start it.
9. Scan the region: we get only two KVs (r1-v1, r2-v2); the third, KV(r3-v3), is lost.
Analysing the scenario above from the code:
1. The edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same hlog file on server A.
2. When we split server B's hlog file in ServerShutdownHandler, we create one RecoveredEdits file, f1, for the region.
3. When we split server A's hlog file in ServerShutdownHandler, we create another RecoveredEdits file, f2, for the region.
4. However, RecoveredEdits file f2 will be skipped when initializing the region, in HRegion#replayRecoveredEditsIfAny:
{code}
for (Path edits: files) {
  if (edits == null || !this.fs.exists(edits)) {
    LOG.warn("Null or non-existent edits file: " + edits);
    continue;
  }
  if (isZeroLengthThenDelete(this.fs, edits)) continue;
  if (checkSafeToSkip) {
    Path higher = files.higher(edits);
    long maxSeqId = Long.MAX_VALUE;
    if (higher != null) {
      // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
      String fileName = higher.getName();
      maxSeqId = Math.abs(Long.parseLong(fileName));
    }
    if (maxSeqId <= minSeqId) {
      String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
      LOG.debug(msg);
      continue;
    } else {
      checkSafeToSkip = false;
    }
  }
{code}
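The skip check quoted in this report can be replayed on a toy model to see why f2 is dropped. This is a sketch under stated assumptions: the sequence ids (r1=10, r2=11, r3=12) and the value minSeqId=11 are made up for illustration, file names are modelled as plain longs, and `RecoveredEditsSkipDemo` is a hypothetical class, not HBase code. Only the comparison logic mirrors the quoted loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical model of the skip check; file names are sequence ids. */
final class RecoveredEditsSkipDemo {
    /**
     * Visits file names in ascending order and returns those the check would skip.
     * A file is skipped when the name of the next-higher file (taken as an upper
     * bound on this file's edits) is not greater than minSeqId.
     */
    static List<Long> skippedFiles(TreeSet<Long> fileNames, long minSeqId) {
        List<Long> skipped = new ArrayList<>();
        for (Long name : fileNames) {
            Long higher = fileNames.higher(name);
            long maxSeqId = (higher == null) ? Long.MAX_VALUE : Math.abs(higher);
            if (maxSeqId <= minSeqId) {
                skipped.add(name);
            } else {
                break; // like checkSafeToSkip = false: nothing later is skipped
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Assumed ids: server A logged r1=10 and r3=12, server B logged r2=11.
        // Naming each file by its FIRST edit: f2 (from A) is 10, f1 (from B) is 11.
        TreeSet<Long> byFirstEdit = new TreeSet<>(Arrays.asList(10L, 11L));
        // Assume the store files already cover edits up to sequence id 11.
        System.out.println(skippedFiles(byFirstEdit, 11L)); // prints [10]

        // Naming each file by its LAST edit instead: f1 is 11, f2 is 12.
        TreeSet<Long> byLastEdit = new TreeSet<>(Arrays.asList(11L, 12L));
        System.out.println(skippedFiles(byLastEdit, 11L)); // prints []
    }
}
```

In the first case the file named 10 is skipped even though, under the assumed ids, it also holds the edit with sequence id 12; the next file's name is not a valid upper bound when one log interleaves edits from before and after the region moved. Naming files by their last edit, as the review comment describes, removes that false bound in this model.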
[jira] [Commented] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244419#comment-13244419 ] stack commented on HBASE-5699: -- Please provide more detail on what this issue is about and correct the subject so its properly spelled. Thanks. Should we use muti HLog or Writer in HLog in a HRegionServer Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244420#comment-13244420 ] Lars Hofhansl commented on HBASE-5682: -- One other strangeness I found is that none of the ZKUtil methods actually throw exceptions. They retry (via RecoverableZooKeeper) and then just log a message if there is a failure. This is especially annoying with ZooKeeperWatcher, because there is no way of telling from the outside whether the connection succeeded or not. Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244436#comment-13244436 ] stack commented on HBASE-5682: -- Can we add an isAlive to ZKW? Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems to big to backport) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HBASE-5699) Should we use muti HLog or Writer in HLog in a HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244419#comment-13244419 ] stack edited comment on HBASE-5699 at 4/2/12 6:48 PM: -- Please provide more detail on what this issue is about and correct the subject so it's properly spelled. Thanks. was (Author: stack): Please provide more detail on what this issue is about and correct the subject so its properly spelled. Thanks. Should we use muti HLog or Writer in HLog in a HRegionServer Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1320#comment-1320 ] nkeywal commented on HBASE-5682: bq. none of ZKUtil methods actually throw exceptions
From what I see on 0.96 they should, as the return is not reached: the pattern is to call keeperException, and keeperException throws an exception.
{noformat}
public void keeperException(KeeperException ke) throws KeeperException {
  LOG.error(prefix("Received unexpected KeeperException, re-throwing exception"), ke);
  throw ke;
}
{noformat}
Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244456#comment-13244456 ] stack commented on HBASE-3909: -- Nothing in HADOOP-7001 guarantees that what is in the *.xml files is in agreement w/ what gets POSTed to the daemon, right? Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.96.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no harm in this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we look at how Hadoop did it over in HADOOP-7001
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: HBASE-5666-v4.patch RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true)
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start... It seems that during the RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
  at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
  at java.lang.Thread.run(Thread.java:662)
{code}
[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244462#comment-13244462 ] Hudson commented on HBASE-5694: --- Integrated in HBase-0.94 #78 (See [https://builds.apache.org/job/HBase-0.94/78/]) HBASE-5694 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly (Revision 1308446) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.94.0 Attachments: HBASE-5694-trunk-20120402.patch, HBASE-5694.patch The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows for columns to be unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside a wrong if statement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244468#comment-13244468 ] Lars Hofhansl commented on HBASE-5682: -- Yeah, my comment was wrong. It's not generally doing that. What I do find is that if the ZK quorum is down, none of getZookeeperWatcher(), masterAddressTracker.start(), and rootRegionTracker.start() actually fail. They just retry and then happily return, which is as designed, because they are asynchronous. It would be nice to have an isAlive or waitForConnect method on ZKW that would throw if the connection could not be established. The attached patch is still a vast improvement, but it could be made better (even with the zk timeout set to 100ms and retries to 3, it still takes 22s for ensureZookeeperTrackers to finish). Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244470#comment-13244470 ] Lars Hofhansl commented on HBASE-5682: -- Even isAlive or waitForConnect would need to rely on a timeout, so we wouldn't have won anything really. Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems to big to backport) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
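To make the point of the two comments above concrete, here is a minimal sketch of what a waitForConnect could look like. It is not the ZooKeeperWatcher API: ConnectGate, markConnected, isAlive and waitForConnect are hypothetical names, and the asynchronous "connected" event is simulated with a latch rather than a real ZooKeeper session. Under those assumptions, the only way the call can report a dead quorum is by running out its timeout.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase: a latch that an asynchronous
// connection callback would release once the session is established.
final class ConnectGate {
  private final CountDownLatch connected = new CountDownLatch(1);

  // Would be called from the connection event handler.
  void markConnected() {
    connected.countDown();
  }

  // True once markConnected() has been called.
  boolean isAlive() {
    return connected.getCount() == 0;
  }

  // Blocks until connected, or throws once timeoutMs has elapsed.
  void waitForConnect(long timeoutMs) throws IOException {
    try {
      if (!connected.await(timeoutMs, TimeUnit.MILLISECONDS)) {
        throw new IOException("not connected after " + timeoutMs + "ms");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("interrupted while waiting for connection", e);
    }
  }

  public static void main(String[] args) {
    ConnectGate quorumDown = new ConnectGate(); // nobody ever marks it connected
    try {
      quorumDown.waitForConnect(100);
      System.out.println("connected");
    } catch (IOException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

The caller still pays the full timeout before it learns that the quorum is down, which is the objection raised above.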
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244471#comment-13244471 ] stack commented on HBASE-5666: -- On creation of ZooKeeperWatcher, we do the following. Why is it not sufficient?
{code}
// The first call against zk can fail with connection loss. Seems common.
// Apparently this is recoverable. Retry a while.
// See http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
// TODO: Generalize out in ZKUtil.
long wait = conf.getLong(HConstants.ZOOKEEPER_RECOVERABLE_WAITTIME,
    HConstants.DEFAULT_ZOOKEPER_RECOVERABLE_WAITIME);
long finished = System.currentTimeMillis() + wait;
KeeperException ke = null;
do {
  try {
    ZKUtil.createAndFailSilent(this, baseZNode);
    ke = null;
    break;
  } catch (KeeperException.ConnectionLossException e) {
    if (LOG.isDebugEnabled() && (isFinishedRetryingRecoverable(finished))) {
      LOG.debug("Retrying zk create for another " +
        (finished - System.currentTimeMillis()) +
        "ms; set 'hbase.zookeeper.recoverable.waittime' to change " +
        "wait time); " + e.getMessage());
    }
    ke = e;
  }
} while (isFinishedRetryingRecoverable(finished));
{code}
Is the wait too short? RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true)
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start...
It seems that during the RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
    at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
    at java.lang.Thread.run(Thread.java:662)
{code}
[jira] [Updated] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5682: - Attachment: 5682-all-v4.txt I think this is as good as we can get in 0.94. # Removed the exception handling from ensureZookeeperTrackers none of these methods throw. # added getZookeeperWatcher to two methods that just need a ZKW. The key is that an HConnection will never be left in a permanently useless state. Can file another jira for better timeouts. Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems to big to backport) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244482#comment-13244482 ] Uma Maheswara Rao G commented on HBASE-3909: Yes, As per my understanding, HADOOP-7001 will assume that, OM/other tools will update *.xml and POST the same configs to Daemon for updating in-memory values. Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.96.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no hard this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we lookd at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244484#comment-13244484 ] nkeywal commented on HBASE-5682: In 0.96 this should work, with the restriction that you can get a non-working connection, which will get fixed when you try to use it. It's a different mechanism than the one for HBaseAdmin, as HBaseAdmin first checks the connection. The ZK mechanism is more efficient (you save a remote call to check that the connection is really working), but it is more complex. However, it seems it does not work in the end: bq. What I saw in 0.96 is that the client was blocked for a very long time (gave up after a few minutes), even though I had set all timeouts to low values. This is also deadly in an app server setting. Might be a simple fix there, didn't dig deeper. @lars What exactly did you do? I can do the fix on 0.96. Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable if the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244488#comment-13244488 ] Matteo Bertozzi commented on HBASE-5666: The problem here is that there's no ConnectionLossException... if you take a look at the log you can see that there's no KeeperException; ZooKeeper responds that the base node doesn't exist. RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start... It seems that during the RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. 
at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244489#comment-13244489 ] Todd Lipcon commented on HBASE-3909: I think adding a refreshConfigs admin command is a good idea. It can re-read the configs off the local disk, and emit warnings for any configs that changed that were not runtime-reconfigurable. Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.96.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no hard this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we lookd at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
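A rough sketch of the refreshConfigs idea suggested above, with the config files stood in for by plain maps. Everything here is hypothetical: ConfigRefresher, refresh, and the example.* keys are made-up names, not HBase or Hadoop APIs, and how the reloaded values are read from local disk is left out. It only illustrates the "apply what is reconfigurable, warn about the rest" behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not an HBase class.
final class ConfigRefresher {
  // Copies changed values for runtime-reconfigurable keys into 'live' and
  // returns one warning per changed key that cannot be applied at runtime.
  static List<String> refresh(Map<String, String> live,
                              Map<String, String> reloaded,
                              Set<String> reconfigurable) {
    List<String> warnings = new ArrayList<>();
    // Iterate over a copy of all keys so 'live' can be modified in the loop.
    Set<String> keys = new TreeSet<>(live.keySet());
    keys.addAll(reloaded.keySet());
    for (String key : keys) {
      String oldValue = live.get(key);
      String newValue = reloaded.get(key);
      if (Objects.equals(oldValue, newValue)) {
        continue; // unchanged
      }
      if (reconfigurable.contains(key)) {
        if (newValue == null) {
          live.remove(key);
        } else {
          live.put(key, newValue);
        }
      } else {
        warnings.add(key + " changed from " + oldValue + " to " + newValue
            + " but is not runtime-reconfigurable; restart required");
      }
    }
    return warnings;
  }

  public static void main(String[] args) {
    Map<String, String> live = new HashMap<>();
    live.put("example.reconfigurable.key", "10"); // made-up key
    live.put("example.static.key", "a");          // made-up key

    Map<String, String> reloaded = new HashMap<>();
    reloaded.put("example.reconfigurable.key", "20");
    reloaded.put("example.static.key", "b");

    Set<String> reconfigurable = new HashSet<>();
    reconfigurable.add("example.reconfigurable.key");

    for (String warning : refresh(live, reloaded, reconfigurable)) {
      System.out.println("WARN " + warning);
    }
    System.out.println(live);
  }
}
```

Whether the trigger arrives as an admin command, a servlet POST, or a ZooKeeper notification is a separate question; the comparison step would look much the same either way.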
[jira] [Commented] (HBASE-4393) Implement a canary monitoring program
[ https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244491#comment-13244491 ] stack commented on HBASE-4393: -- Please format your contrib as a patch (git or svn add, then do a git diff --no-prefix or svn diff). Thanks.

This line is not necessary any more: {code} * Copyright 2012 The Apache Software Foundation {code}

Please fix this doc: ' * HBase Canary Tool, that that can be used to do' (too many 'that's).

On the Sink interface, it's not going to be used by anyone else if it's private? That might be fine for a first checkin. Later, when there are other Sinks, we can open it up?

I think filesink is the wrong sink to do as the first implementation. Your first Sink should be StdOutSink using the logging system. Notice how anything that is started with bin/hbase-daemon.sh gets log files set up for it (master, regionserver, but also rest, thrift, etc.). Doing this, your emissions will be in a well-known place, in files that are named with a format that matches other logging done by hbase, etc.

This method is oddly named: {code} public void publish(HRegionInfo region, HColumnDescriptor column, long msTime) { {code} It seems like it's for logging messages like this: "%s read from region %s column family %s in %dms\n", ... Should the method name be logReadTime? Or publishReadTiming?

What's the BasicParser do? It matches what Tool does? We don't want GnuParser?

I like this comment: {code} // user has specified an interval for canary breaths {code} That's cute.

Put on one line: {code} if (conf == null) conf = HBaseConfiguration.create(); {code}

I think I should be able to run this once OR run it as a daemon. Pass an arg if it's to run as a daemon process?

Can this code use any of the utility that is in hbck?

I like the Tool improvements. Thanks Matteo.
Implement a canary monitoring program - Key: HBASE-4393 URL: https://issues.apache.org/jira/browse/HBASE-4393 Project: HBase Issue Type: New Feature Components: monitoring Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Matteo Bertozzi Attachments: Canary-v0.java, HBaseCanary.java This JIRA is to implement a standalone program that can be used to do canary monitoring of a running HBase cluster. This program would gather a list of the regions in the cluster, then iterate over them doing lightweight operations (eg short scans) to provide metrics about latency as well as alert on availability issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
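As an illustration of the review's suggestion (a logging-backed sink with a more descriptive method name), here is a small sketch. It is not the code from the attached HBaseCanary.java: the Sink interface shape, the publishReadTiming name, and the use of plain Strings instead of HRegionInfo and HColumnDescriptor are assumptions made to keep the example self-contained, and java.util.logging stands in for the logging system HBase actually uses.

```java
import java.util.logging.Logger;

// Hypothetical sink that reports read timings through a logger instead of
// writing its own file; where the lines end up depends on the logging
// configuration of the process.
final class StdOutSink implements Sink {
  private static final Logger LOG = Logger.getLogger(StdOutSink.class.getName());

  // Builds the message separately so it can be checked without a logger.
  static String format(String region, String columnFamily, long msTime) {
    return String.format("read from region %s column family %s in %dms",
        region, columnFamily, msTime);
  }

  @Override
  public void publishReadTiming(String region, String columnFamily, long msTime) {
    LOG.info(format(region, columnFamily, msTime));
  }

  public static void main(String[] args) {
    Sink sink = new StdOutSink();
    sink.publishReadTiming("exampleRegion", "cf", 12L);
  }
}

// Hypothetical interface; the method name follows the rename floated in the review.
interface Sink {
  void publishReadTiming(String region, String columnFamily, long msTime);
}
```

Keeping the interface this small would make it easy to add other sinks later, which is the "open it up" step mentioned in the review.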
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244492#comment-13244492 ] stack commented on HBASE-5666: -- This is the code that is supposed to create the base node right? If we come out of here and there is no base node, then thats a problem? Should the fix be down here in ZKW rather than up in regionserver? RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start... It seems that during the RS start the the znode is still not available, and HRegionServer.initializeZooKeeper() check just once if the base not is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. 
at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244494#comment-13244494 ] stack commented on HBASE-3909: -- But it should go via zk I'd say since we have it rather than have us POST refreshConfigs to a servlet on each server Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.96.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no hard this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we lookd at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244495#comment-13244495 ] stack commented on HBASE-5693: -- The failures because of your patch N? When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5593.v2.patch, 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5697) Audit HBase for usage of deprecated hadoop 0.20.x property names.
[ https://issues.apache.org/jira/browse/HBASE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244500#comment-13244500 ] stack commented on HBASE-5697: -- ooo... nelly. Thats a long list Jon. Audit HBase for usage of deprecated hadoop 0.20.x property names. - Key: HBASE-5697 URL: https://issues.apache.org/jira/browse/HBASE-5697 Project: HBase Issue Type: Task Reporter: Jonathan Hsieh Many xml config properties in Hadoop have changed in 0.23. We should audit hbase to insulate it from hadoop property name changes. Here is a list of the hadoop property name changes: http://hadoop.apache.org/common/docs/r0.23.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244509#comment-13244509 ] stack commented on HBASE-3909: -- An argument for redoing hadoop-7001 in hbase would be that you can reset configs in hbase the way you do it in hadoop. I could go for that. Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.96.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no hard this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we lookd at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244519#comment-13244519 ] nkeywal commented on HBASE-5693: I don't think so. I didn't see them locally. When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5593.v2.patch, 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244522#comment-13244522 ] stack commented on HBASE-5443: -- There is also this write up of Todd's on why pb in first place over in hdfs: https://issues.apache.org/jira/browse/HDFS-2058?focusedCommentId=13047289page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13047289 Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244537#comment-13244537 ] Matteo Bertozzi commented on HBASE-5666: m... maybe i've lost something but, in 0.92 and trunk that code was removed and there's just a call to ZKUtil.createAndFailSilent() that doesn't retry. Any idea? https://github.com/apache/hbase/commit/6dc7ccf3779add13188bd73011e0d25bbab77a05 RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start... It seems that during the RS start the the znode is still not available, and HRegionServer.initializeZooKeeper() check just once if the base not is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. 
at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244560#comment-13244560 ] stack commented on HBASE-5666: -- Thanks for digging. Seems like the RecoverableZK is failing silently (smile). Seriously, it may be retrying any ConnectionLossException but if no base dir up on in zk, there's nothing for ZKW to 'watch'... it should fail construction (or this test needs to be moved out to an init method or something ...). What you reckon? RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start... It seems that during the RS start the the znode is still not available, and HRegionServer.initializeZooKeeper() check just once if the base not is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. 
at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
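For the case discussed in this thread (no ConnectionLossException, the base node is simply not there yet), a retry has to poll for existence rather than catch an exception. Below is a minimal sketch of such a loop. It is not taken from any of the attached patches: BaseNodeWaiter and waitForBaseNode are hypothetical names, and the existence check is passed in as a BooleanSupplier so the example runs without a ZooKeeper quorum. In real code the supplier would wrap whatever call checks the znode.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase.
final class BaseNodeWaiter {
  // Polls 'exists' until it returns true or timeoutMs has elapsed.
  // Returns true if the node showed up in time, false otherwise.
  static boolean waitForBaseNode(BooleanSupplier exists, long timeoutMs, long pauseMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      if (exists.getAsBoolean()) {
        return true;
      }
      long remaining = deadline - System.currentTimeMillis();
      if (remaining <= 0) {
        return false; // caller decides whether to abort the region server
      }
      Thread.sleep(Math.min(pauseMs, remaining));
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final int[] checks = {0};
    // Simulated master that creates the base node before the third check.
    boolean found = waitForBaseNode(() -> ++checks[0] >= 3, 2000, 10);
    System.out.println("found=" + found + " after " + checks[0] + " checks");
  }
}
```

Whether a loop like this belongs in the region server or down in ZKW is exactly the open question above; the sketch takes no position on that.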
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244564#comment-13244564 ] Hadoop QA commented on HBASE-5666: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521011/HBASE-5666-v4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1370//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1370//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1370//console This message is automatically generated. 
RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log

I have a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true)
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start... It seems that during the RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
	at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
	at java.lang.Thread.run(Thread.java:662)
{code}
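The HBASE-5666 report says the region server checks only once whether the base znode exists. As a rough illustration of the alternative (this is not the attached patch, and `BaseNodeWaiter`, `waitForBaseNode`, and the attempt/sleep parameters are hypothetical names chosen here), a bounded retry loop could look like this. The `BooleanSupplier` stands in for the real ZooKeeper existence check.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: repeats a check until it succeeds or attempts run out. */
final class BaseNodeWaiter {
    private BaseNodeWaiter() {}

    /**
     * @param baseNodeExists stand-in for the real ZooKeeper lookup of the base znode
     * @param maxAttempts    total number of checks, must be at least 1
     * @param sleepMillis    pause between failed checks
     * @return true as soon as one check succeeds; false if all fail or the thread is interrupted
     */
    static boolean waitForBaseNode(BooleanSupplier baseNodeExists, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (baseNodeExists.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulates a master that creates the base znode just before the third check.
        boolean found = waitForBaseNode(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("found=" + found + " after " + calls[0] + " checks");
    }
}
```

Bounding the number of attempts keeps a genuinely misconfigured 'zookeeper.znode.parent' from blocking startup forever, while still tolerating a master that is a few seconds behind.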
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5693: - Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I tried the first three locally. They pass for me. Committed trunk. Thanks for the patch N. When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5593.v2.patch, 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244577#comment-13244577 ] Lars Hofhansl commented on HBASE-5682: -- Let me dig into 0.96 after I get this into 0.94... Wanna cut RC1 soon. From the past comments here I see no objections to the posted patch... Will commit soon. Please speak up if you disagree. Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable when the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport)
[jira] [Updated] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5665: - Resolution: Fixed Fix Version/s: 0.94.0 0.92.2 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to 0.92, 0.94, and trunk. Thanks Cosmin and Matteo. Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Fix For: 0.92.2, 0.94.0 Attachments: 5665trunk.v2.patch, HBASE-5665-0.92.patch, HBASE-5665-trunk.patch

Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster), unrecoverably. The regionserver doing the split dies and the master will get into an infinite loop trying to assign regions that seem to have the files missing from HDFS. The table can be disabled once. Upon trying to re-enable it, it will remain in an intermediary state forever. I was able to reproduce this on a smaller table consistently.
{code}
hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
{code}
Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) will reproduce the issue almost instantly and consistently.
{code}
2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META
2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1.. compaction_queue=(0:1), split_queue=10
2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
	at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
	at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
	at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
	at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
	at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
	at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
	at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
	... 1 more
2012-03-28 10:57:16,345 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
{code}
[jira] [Updated] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5665: - Attachment: 5665trunk.v2.patch Same as last patch but w/ fixed javadoc... isAvailable is not closed and not closing. Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Fix For: 0.92.2, 0.94.0 Attachments: 5665trunk.v2.patch, HBASE-5665-0.92.patch, HBASE-5665-trunk.patch

Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster), unrecoverably. The regionserver doing the split dies and the master will get into an infinite loop trying to assign regions that seem to have the files missing from HDFS. The table can be disabled once. Upon trying to re-enable it, it will remain in an intermediary state forever. I was able to reproduce this on a smaller table consistently.
{code}
hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
{code}
Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) will reproduce the issue almost instantly and consistently.
{code}
2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META
2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1.. compaction_queue=(0:1), split_queue=10
2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
	at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
	at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
	at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
	at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
	at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
	at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
	at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221)
	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
	at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
	... 1 more
2012-03-28 10:57:16,345 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
{code}
http://hastebin.com/diqinibajo.avrasm
later edit: (I'm using the last 4 characters from each string) Region 94e3 has storefile
[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244612#comment-13244612 ] jirapos...@reviews.apache.org commented on HBASE-5688: --

bq. On 2012-04-01 16:41:00, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java, line 48
bq. https://reviews.apache.org/r/4600/diff/1/?file=97843#file97843line48
bq.
bq. I think prefixedWithPBMagic would be a better name for this method.

Disagree.

bq. On 2012-04-01 16:41:00, Ted Yu wrote:
bq. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java, line 28
bq. https://reviews.apache.org/r/4600/diff/1/?file=97852#file97852line28
bq.
bq. Javadoc would be desirable.

Classname says what it does.

bq. On 2012-04-01 16:41:00, Ted Yu wrote:
bq. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java, line 43
bq. https://reviews.apache.org/r/4600/diff/1/?file=97852#file97852line43
bq.
bq. White space.

Will fix on commit.

- Michael

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4600/#review6606
---

On 2012-04-01 00:18:54, Michael Stack wrote:
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/4600/
bq. ---
bq.
bq. (Updated 2012-04-01 00:18:54)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. Changes the content of the root location znode, root-region-server, to be four magic bytes ('PBUF') followed by a protobuf message that holds the ServerName of the server currently hosting root.
bq.
bq. D src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java
bq.     Removed. Had two methods, one to add root-region-server znode and another to remove it. Rather, put these methods in RootRegionTracker. It tracks root-region-server znode. Having all to do w/ root-region-server is more cohesive. Also makes it so can encapsulate in one class all to do w/ create, delete, and reading of root-region-server. We also want to purge the catalog package (See note at head of CatalogTracker).
bq. M src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
bq. M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
bq. M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
bq.     Get root region location from RootRegionTracker rather than from RootLocationEditor.
bq. A src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
bq.     Utility to do w/ protobuf handling. Has methods to help prefixing and stripping from serialized protobuf messages some 'magic'.
bq. A src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
bq.     PB generated.
bq. M src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
bq.     Use new RootRegionTracker method for getting content of znode rather than do it all here (going via RootRegionTracker, we can keep how the znode content is serialized private to the RootRegionTracker class).
bq. M src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
bq.     Has the methods that used to be in RootLocationEditor plus a new
bq.
bq. This addresses bug hbase-5688.
bq. https://issues.apache.org/jira/browse/hbase-5688
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java c90864a
bq. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15
bq. src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 9c215b4
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 2f05005
bq. src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 33e4e71
bq. src/main/protobuf/ZooKeeper.proto PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 533b2bf
bq. src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java fe37156
bq. src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java 2132036
bq. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/4600/diff
bq.
bq. Testing
bq. ---
bq.
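The review summary above describes znode content as four magic bytes ('PBUF') followed by a serialized protobuf message, with helper methods for prefixing and stripping the magic. A minimal sketch of that framing, assuming nothing about the real ProtobufUtil signatures, might look like the following. The class and method names (`PbMagicSketch`, `prependMagic`, `hasMagic`, `stripMagic`) are hypothetical, and the payload is treated as opaque bytes rather than a real protobuf message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of magic-prefixed znode content; not the HBase API. */
final class PbMagicSketch {
    /** The four magic bytes named in the review summary. */
    static final byte[] PB_MAGIC = "PBUF".getBytes(StandardCharsets.US_ASCII);

    private PbMagicSketch() {}

    /** Returns a new array holding the magic bytes followed by the payload. */
    static byte[] prependMagic(byte[] payload) {
        byte[] out = new byte[PB_MAGIC.length + payload.length];
        System.arraycopy(PB_MAGIC, 0, out, 0, PB_MAGIC.length);
        System.arraycopy(payload, 0, out, PB_MAGIC.length, payload.length);
        return out;
    }

    /** True if the data is long enough and starts with the magic bytes. */
    static boolean hasMagic(byte[] data) {
        if (data == null || data.length < PB_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PB_MAGIC.length; i++) {
            if (data[i] != PB_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the payload without the magic prefix, or throws if the prefix is absent. */
    static byte[] stripMagic(byte[] data) {
        if (!hasMagic(data)) {
            throw new IllegalArgumentException("content is not prefixed with PBUF magic");
        }
        return Arrays.copyOfRange(data, PB_MAGIC.length, data.length);
    }
}
```

A reader can call `hasMagic` first to distinguish the new protobuf content from older versioned bytes before deciding how to parse the rest.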
[jira] [Updated] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5688: - Attachment: 5688v5.txt v5 removes a single white space. It's what I'll commit. Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.96.0 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt Move the root-region-server znode content from the versioned bytes that ServerName.getVersionedBytes outputs to instead be pb.
[jira] [Updated] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5688: - Resolution: Fixed Hadoop Flags: Incompatible change,Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks for review T{o,e}dd. Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.96.0 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt Move the root-region-server znode content from the versioned bytes that ServerName.getVersionedBytes outputs to instead be pb.
[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244621#comment-13244621 ] stack commented on HBASE-5688: -- Oh, and thanks Jimmy for review. Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.96.0 Attachments: 5688.txt, 5688v4.txt, 5688v5.txt Move the root-region-server znode content from the versioned bytes that ServerName.getVersionedBytes outputs to instead be pb.
[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5692: - Attachment: HBASE-5665-trunk.v2.patch Same as v1 w/ some formatting changes. +1 on this patch. Add real action time for HLogPrettyPrinter -- Key: HBASE-5692 URL: https://issues.apache.org/jira/browse/HBASE-5692 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Xing Shi Priority: Minor Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch Now the HLogPrettyPrinter prints the log without the real op time, showing only the timestamp {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 Action: row: r column: f3:q at time: Thu Jan 01 08:02:03 CST 1970 {quote} Maybe we need to know the real op time like this {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: Sun Apr 01 10:42:53 CST 2012 Action: row: r column: f3:q timestamp: Thu Jan 01 08:02:03 CST 1970 {quote}
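The proposal in HBASE-5692 separates two values: the time the edit was written to the log and the cell timestamp supplied with the data. As a rough sketch of the proposed output shape only (not the attached patch), the formatter below takes both values as plain millisecond arguments. The class name `WalEntryFormatSketch`, the `format` method, and its parameters are hypothetical, and the printed dates depend on the JVM default time zone.

```java
import java.util.Date;

/** Hypothetical formatter mirroring the output layout proposed in the issue. */
final class WalEntryFormatSketch {
    private WalEntryFormatSketch() {}

    /**
     * @param writeTimeMillis     when the edit was written to the log (the "real op time")
     * @param cellTimestampMillis the timestamp stored with the cell, which a client may set freely
     */
    static String format(long sequence, String region, String table, long writeTimeMillis,
                         String row, String column, long cellTimestampMillis) {
        StringBuilder sb = new StringBuilder();
        sb.append("Sequence ").append(sequence)
          .append(" from region ").append(region)
          .append(" in table ").append(table)
          .append(" at time: ").append(new Date(writeTimeMillis)).append('\n');
        sb.append("  Action:\n");
        sb.append("    row: ").append(row).append('\n');
        sb.append("    column: ").append(column).append('\n');
        sb.append("    timestamp: ").append(new Date(cellTimestampMillis));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both millisecond values are arbitrary sample inputs, not taken from a real log.
        System.out.println(format(4L, "ee9877dfd55624f50b20acf572416a88", "t5",
                System.currentTimeMillis(), "r", "f3:q", 123000L));
    }
}
```

Printing both values makes it obvious when a client wrote a cell with an explicit, possibly very old, timestamp.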
[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5692: - Status: Patch Available (was: Open) Add real action time for HLogPrettyPrinter -- Key: HBASE-5692 URL: https://issues.apache.org/jira/browse/HBASE-5692 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Xing Shi Priority: Minor Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch Now the HLogPrettyPrinter prints the log without the real op time, showing only the timestamp {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 Action: row: r column: f3:q at time: Thu Jan 01 08:02:03 CST 1970 {quote} Maybe we need to know the real op time like this {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: Sun Apr 01 10:42:53 CST 2012 Action: row: r column: f3:q timestamp: Thu Jan 01 08:02:03 CST 1970 {quote}
[jira] [Commented] (HBASE-5697) Audit HBase for usage of deprecated hadoop 0.20.x property names.
[ https://issues.apache.org/jira/browse/HBASE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244671#comment-13244671 ] Jonathan Hsieh commented on HBASE-5697: --- Hopefully that is a comprehensive list. My guess is that only a handful are relevant. I started testing on hadoop 23 and there are definitely some new deprecation warnings that show up in logs/console. Also, some have been bugs in previous versions - I've gotten snagged on this one before: fs.default.name -> fs.defaultFS Audit HBase for usage of deprecated hadoop 0.20.x property names. - Key: HBASE-5697 URL: https://issues.apache.org/jira/browse/HBASE-5697 Project: HBase Issue Type: Task Reporter: Jonathan Hsieh Many xml config properties in Hadoop have changed in 0.23. We should audit hbase to insulate it from hadoop property name changes. Here is a list of the hadoop property name changes: http://hadoop.apache.org/common/docs/r0.23.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
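One way code can be insulated from a property rename such as fs.default.name -> fs.defaultFS is to read the new key first and fall back to the deprecated one. The sketch below illustrates that idea with plain java.util.Properties rather than Hadoop's Configuration class; `DeprecatedKeySketch` and `getWithFallback` are hypothetical names, and this is not a statement of how HBase or Hadoop resolves the keys internally.

```java
import java.util.Properties;

/** Hypothetical lookup helper that tolerates both the new and the deprecated key. */
final class DeprecatedKeySketch {
    private DeprecatedKeySketch() {}

    /**
     * Returns the value stored under newKey, or under deprecatedKey if newKey is unset,
     * or null if neither key is present.
     */
    static String getWithFallback(Properties conf, String newKey, String deprecatedKey) {
        String value = conf.getProperty(newKey);
        return value != null ? value : conf.getProperty(deprecatedKey);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Sample value only; an older config file might still set the deprecated name.
        conf.setProperty("fs.default.name", "hdfs://localhost:8020");
        System.out.println(getWithFallback(conf, "fs.defaultFS", "fs.default.name"));
    }
}
```

Preferring the new key means a configuration that sets both names behaves the same before and after the deprecated name is eventually dropped.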
[jira] [Updated] (HBASE-2186) hbase master should publish more stats
[ https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2186: - Attachment: screenshot-1.jpg The Master bean hbase master should publish more stats -- Key: HBASE-2186 URL: https://issues.apache.org/jira/browse/HBASE-2186 Project: HBase Issue Type: Bug Components: metrics Reporter: ryan rawson Attachments: screenshot-1.jpg, screenshot-2.jpg hbase master only publishes cluster.requests to ganglia. we should also publish regionserver count and other interesting metrics.
[jira] [Updated] (HBASE-2186) hbase master should publish more stats
[ https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2186: - Attachment: screenshot-2.jpg The master stats bean hbase master should publish more stats -- Key: HBASE-2186 URL: https://issues.apache.org/jira/browse/HBASE-2186 Project: HBase Issue Type: Bug Components: metrics Reporter: ryan rawson Attachments: screenshot-1.jpg, screenshot-2.jpg hbase master only publishes cluster.requests to ganglia. we should also publish regionserver count and other interesting metrics.
[jira] [Resolved] (HBASE-2186) hbase master should publish more stats
[ https://issues.apache.org/jira/browse/HBASE-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-2186. -- Resolution: Duplicate Resolving at Otis's suggestion. Master has more stats now. Could do w/ more but let this be enough to close this issue. hbase master should publish more stats -- Key: HBASE-2186 URL: https://issues.apache.org/jira/browse/HBASE-2186 Project: HBase Issue Type: Bug Components: metrics Reporter: ryan rawson Attachments: screenshot-1.jpg, screenshot-2.jpg hbase master only publishes cluster.requests to ganglia. we should also publish regionserver count and other interesting metrics.
[jira] [Created] (HBASE-5700) [89-fb] Fix TestMiniClusterLoad* test failures
[89-fb] Fix TestMiniClusterLoad* test failures -- Key: HBASE-5700 URL: https://issues.apache.org/jira/browse/HBASE-5700 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Porting TestMiniClusterLoad* tests to 89-fb in HBASE-5679 uncovered certain problems with mini-cluster setup in 89-fb that need to be fixed.
[jira] [Created] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.
Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer. -- Key: HBASE-5701 URL: https://issues.apache.org/jira/browse/HBASE-5701 Project: HBase Issue Type: Bug Reporter: stack
[jira] [Commented] (HBASE-5692) Add real action time for HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244689#comment-13244689 ] Hadoop QA commented on HBASE-5692: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521048/HBASE-5665-trunk.v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1371//console This message is automatically generated. Add real action time for HLogPrettyPrinter -- Key: HBASE-5692 URL: https://issues.apache.org/jira/browse/HBASE-5692 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Xing Shi Priority: Minor Attachments: HBASE-5665-trunk.v2.patch, HBASE-5692.patch Now the HLogPrettyPrinter prints the log without the real op time, showing only the timestamp {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 Action: row: r column: f3:q at time: Thu Jan 01 08:02:03 CST 1970 {quote} Maybe we need to know the real op time like this {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: Sun Apr 01 10:42:53 CST 2012 Action: row: r column: f3:q timestamp: Thu Jan 01 08:02:03 CST 1970 {quote}
[jira] [Updated] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.
[ https://issues.apache.org/jira/browse/HBASE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5701: - Attachment: screenshot-1.jpg See how the mbeans are currently arrayed where hadoop is top level and then we have master and regionserver AND regionserverdynamic levels. In regionserver we have the regionserver mbean and another regionserverstatistics mbean. Over in regionserverdynamic we have regionserverdynamicstatistics mbean. Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer. -- Key: HBASE-5701 URL: https://issues.apache.org/jira/browse/HBASE-5701 Project: HBase Issue Type: Bug Reporter: stack Attachments: screenshot-1.jpg
[jira] [Commented] (HBASE-5701) Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer.
[ https://issues.apache.org/jira/browse/HBASE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244691#comment-13244691 ] stack commented on HBASE-5701: -- This commit added the dynamic mbean: {code} r1185835 | nspiegelberg | 2011-10-18 13:23:28 -0700 (Tue, 18 Oct 2011) | 1 line HBASE-4219 Per Column Family Metrics {code} Put RegionServerDynamicStatistics under RegionServer in MBean hierarchy rather than have it as a peer. -- Key: HBASE-5701 URL: https://issues.apache.org/jira/browse/HBASE-5701 Project: HBase Issue Type: Bug Reporter: stack Attachments: screenshot-1.jpg
[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5692: - Attachment: 5692v2.patch Attached the 'right' patch. Add real action time for HLogPrettyPrinter -- Key: HBASE-5692 URL: https://issues.apache.org/jira/browse/HBASE-5692 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Xing Shi Priority: Minor Attachments: 5692v2.patch, HBASE-5665-trunk.v2.patch, HBASE-5692.patch Currently HLogPrettyPrinter prints each log entry with the cell timestamp but without the real operation time: {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 Action: row: r column: f3:q at time: Thu Jan 01 08:02:03 CST 1970 {quote} It may be useful to also show the real operation time, like this: {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: Sun Apr 01 10:42:53 CST 2012 Action: row: r column: f3:q timestamp: Thu Jan 01 08:02:03 CST 1970 {quote}
[jira] [Updated] (HBASE-5692) Add real action time for HLogPrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5692: - Status: Open (was: Patch Available) Add real action time for HLogPrettyPrinter -- Key: HBASE-5692 URL: https://issues.apache.org/jira/browse/HBASE-5692 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Xing Shi Priority: Minor Attachments: 5692v2.patch, HBASE-5665-trunk.v2.patch, HBASE-5692.patch Currently HLogPrettyPrinter prints each log entry with the cell timestamp but without the real operation time: {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 Action: row: r column: f3:q at time: Thu Jan 01 08:02:03 CST 1970 {quote} It may be useful to also show the real operation time, like this: {quote} Sequence 4 from region ee9877dfd55624f50b20acf572416a88 in table t5 at time: Sun Apr 01 10:42:53 CST 2012 Action: row: r column: f3:q timestamp: Thu Jan 01 08:02:03 CST 1970 {quote}