[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439303#comment-13439303 ]

Hudson commented on HBASE-6632:
-------------------------------

Integrated in HBase-0.94 #414 (See [https://builds.apache.org/job/HBase-0.94/414/])
HBASE-6632 [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) (Revision 1375898)

Result = SUCCESS
stack :
Files :
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java

[0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
----------------------------------------------------------------------------------------------------------------------------------

Key: HBASE-6632
URL: https://issues.apache.org/jira/browse/HBASE-6632
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Fix For: 0.92.2, 0.94.2
Attachments: 6632-trunk.txt, 6632.txt

I see that in 0.92 #502 and #501 that TestAdmin.testHundredsOfTable fails because socket times out after 1500ms. I see in TestAdmin that before this test runs, testCreateTableRPCTimeOut sets the socket timeout to 1500 and then does not set it back. Maybe the obnoxious testHundredsOfTable will pass more often if it has the default rpc timeout.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
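The leak described in this issue — a test lowers a shared timeout and never puts it back — follows a general save-and-restore pattern. A minimal sketch of it, with no HBase dependency and an illustrative key name (not the actual TestAdmin code):

```java
import java.util.Properties;

// Sketch of the HBASE-6632 fix pattern: a test that tightens a shared
// rpc timeout must restore the old value in a finally block, otherwise
// later tests (like testHundredsOfTable) inherit the 1500ms setting.
public class TimeoutRestoreSketch {
    static final Properties CONF = new Properties();

    static void rpcTimeOutTest() {
        String previous = CONF.getProperty("hbase.rpc.timeout", "60000");
        CONF.setProperty("hbase.rpc.timeout", "1500"); // tighten for this test only
        try {
            // ... exercise the short-timeout behaviour here ...
        } finally {
            CONF.setProperty("hbase.rpc.timeout", previous); // undo the change
        }
    }

    public static void main(String[] args) {
        CONF.setProperty("hbase.rpc.timeout", "60000");
        rpcTimeOutTest();
        System.out.println(CONF.getProperty("hbase.rpc.timeout")); // 60000 again
    }
}
```

The same effect can be had with a JUnit `@After` method that resets the configuration, which is more robust when several tests mutate it.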
[jira] [Commented] (HBASE-3271) Allow .META. table to be exported
[ https://issues.apache.org/jira/browse/HBASE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439305#comment-13439305 ]

stack commented on HBASE-3271:
------------------------------

Patch looks fine to me. Will integrate in next day or so unless objection.

Allow .META. table to be exported
---------------------------------

Key: HBASE-3271
URL: https://issues.apache.org/jira/browse/HBASE-3271
Project: HBase
Issue Type: Improvement
Components: util
Affects Versions: 0.20.6
Reporter: Ted Yu
Attachments: HBASE-3271.patch

I tried to export .META. table in 0.20.6 and got:

[hadoop@us01-ciqps1-name01 hbase]$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export .META. h-meta 1 0 0
10/11/23 20:59:05 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2010-11-23 20:59:05.255::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2010-11-23 20:59:05.255::INFO: verisons=1, starttime=0, endtime=9223372036854775807
10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.2.2-888565, built on 12/08/2009 21:51 GMT
10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:host.name=us01-ciqps1-name01.carrieriq.com
10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
...
10/11/23 20:59:05 INFO zookeeper.ClientCnxn: Server connection successful
10/11/23 20:59:05 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.112:60020
10/11/23 20:59:05 DEBUG client.HConnectionManager$TableServers: Found ROOT at 10.202.50.112:60020
10/11/23 20:59:05 DEBUG client.HConnectionManager$TableServers: Cached location for .META.,,1 is us01-ciqps1-grid02.carrieriq.com:60020
Exception in thread "main" java.io.IOException: Expecting at least one region.
	at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:281)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
	at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:146)

Related code is:

if (keys == null || keys.getFirst() == null || keys.getFirst().length == 0) {
  throw new IOException("Expecting at least one region.");
}

My intention was to save the dangling rows in .META. (for future investigation) which prevented a table from being created.
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439306#comment-13439306 ]

Hudson commented on HBASE-6632:
-------------------------------

Integrated in HBase-TRUNK #3253 (See [https://builds.apache.org/job/HBase-TRUNK/3253/])
HBASE-6632 [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) (Revision 1375897)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
[jira] [Commented] (HBASE-6618) Implement FuzzyRowFilter with ranges support
[ https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439311#comment-13439311 ]

Anil Gupta commented on HBASE-6618:
-----------------------------------

Hi Alex, I agree with your idea of a range-based fuzzy filter. However, I would like to take a phased approach in developing this. In your proposal, the user can provide multiple fuzzy ranges in a single scan, i.e. <any 4 bytes><any 6 bytes value between _0001 and _0099><any 3 bytes><any 4 bytes value between _001 and _099>. Instead of the above, IMO let's try to make a filter for <any 4 bytes><any 6 bytes value between _0001 and _0099><any 3 bytes> or <any 4 bytes><any 6 bytes value between _0001 and _0099>. Once we develop this, we can then enhance it to use multiple fuzzy ranges. This is just my thought/approach for developing this. Let me know your opinion. From this week, at work I had to shift focus from HBase to Hive and HCatalog for another POC, so I'll be squeezing time for this JIRA out of my work schedule. I'll start looking into the current implementation of FuzzyRowFilter to get an idea about the implementation.
Thanks, Anil Gupta
Software Engineer II, Intuit, Inc

Implement FuzzyRowFilter with ranges support
--------------------------------------------

Key: HBASE-6618
URL: https://issues.apache.org/jira/browse/HBASE-6618
Project: HBase
Issue Type: New Feature
Components: filters
Reporter: Alex Baranau
Priority: Minor

Apart from the current ability to specify a fuzzy row filter, e.g. for the <userId>_<actionId> format as ????_0004 (where 0004 is the actionId), it would be great to also have the ability to specify a fuzzy range, e.g. ????_0004, ..., ????_0099. See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65

Note: currently it is possible to provide multiple fuzzy row rules to the existing FuzzyRowFilter, but when the range is big (contains thousands of values) it is not efficient. The filter should perform efficient fast-forwarding during the scan (this is what distinguishes it from a regex row filter).

While such functionality may seem like a proper fit for a custom filter (i.e. not included in the standard filter set), it looks like the filter may be very re-usable. We may judge based on the implementation that will hopefully be added.
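For readers unfamiliar with how fuzzy row matching works, here is a minimal, self-contained sketch of the mask idea (not HBase's actual FuzzyRowFilter code; a real range-aware filter would also fast-forward the scanner rather than test rows one by one):

```java
// Toy version of the fuzzy-row idea: mask[i] == 0 means row[i] must equal
// pattern[i]; mask[i] == 1 means "any byte". The range extension proposed
// in this issue would additionally bound the wildcard positions (e.g.
// between 0001 and 0099) instead of accepting every byte.
public class FuzzyMatchSketch {
    static boolean matches(byte[] row, byte[] pattern, byte[] mask) {
        if (row.length < pattern.length) {
            return false; // row too short to satisfy the pattern
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) {
                return false; // fixed position differs
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pattern = "_____0004".getBytes();
        byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0}; // first 4 bytes: fuzzy userId
        System.out.println(matches("ab12_0004".getBytes(), pattern, mask)); // true
        System.out.println(matches("ab12_0005".getBytes(), pattern, mask)); // false
    }
}
```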
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439314#comment-13439314 ]

nkeywal commented on HBASE-6632:
--------------------------------

fwiw, I removed TestAdmin#testHundredsOfTable in 0.94 in HBASE-5114: "The patch includes as well the removal of TestAdmin#testHundredsOfTable because this test takes 2 minutes to execute; but proves nothing on success as there is no check on resources used."
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439316#comment-13439316 ]

stack commented on HBASE-6632:
------------------------------

@nkeywal Hmm... yeah... I was tempted to plain remove it because its kinda dumb test -- tables have no data and one region only does same task 100 times. If this doesn't fix it, I will remove it if it fails again.
[jira] [Updated] (HBASE-6584) Test flappers due to port 60000 already in use.
[ https://issues.apache.org/jira/browse/HBASE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6584:
------------------------------

Attachment: HBASE-6584_trunk_3.patch

Test flappers due to port 60000 already in use.
-----------------------------------------------

Key: HBASE-6584
URL: https://issues.apache.org/jira/browse/HBASE-6584
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.94.0
Reporter: Sameer Vaishampayan
Priority: Critical
Attachments: HBASE-6584_trunk_2.patch, HBASE-6584_trunk_3.patch, HBASE-6584_trunk.patch, HBASE-6584_trunk.patch, patch2.diff
[jira] [Commented] (HBASE-6584) Test flappers due to port 60000 already in use.
[ https://issues.apache.org/jira/browse/HBASE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439321#comment-13439321 ]

rajeshbabu commented on HBASE-6584:
-----------------------------------

@Stack, it's better to set it to 0 instead of a fixed port. Even when tests run in parallel it will get an unused port. Updated the same in the latest patch.
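Why port 0 works: the OS picks a free ephemeral port at bind time, so parallel test JVMs cannot collide on a hard-coded 60000. A minimal stdlib illustration of the idea (not the HBase patch itself):

```java
import java.io.IOException;
import java.net.ServerSocket;

// Bind to port 0 and let the kernel choose a free port; read back the
// concrete port with getLocalPort(). This is the standard trick for
// avoiding "address already in use" flakiness in parallel test runs.
public class EphemeralPortSketch {
    static int grabFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort(); // real port assigned by the OS
        } catch (IOException e) {
            return -1; // no port available (should not happen in practice)
        }
    }

    public static void main(String[] args) {
        System.out.println(grabFreePort() > 0); // true
    }
}
```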
[jira] [Updated] (HBASE-6584) Test flappers due to port 60000 already in use.
[ https://issues.apache.org/jira/browse/HBASE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6584:
------------------------------

Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6584) Test flappers due to port 60000 already in use.
[ https://issues.apache.org/jira/browse/HBASE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6584:
------------------------------

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6584) Test flappers due to port 60000 already in use.
[ https://issues.apache.org/jira/browse/HBASE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439343#comment-13439343 ]

Hadoop QA commented on HBASE-6584:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12541873/HBASE-6584_trunk_3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
-1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2644//console

This message is automatically generated.
[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439347#comment-13439347 ]

Eric Charles commented on HBASE-6581:
-------------------------------------

Just tried it quickly, and I now have:

[ERROR] symbol : method setHostName(java.lang.String)
[ERROR] location: class org.apache.hadoop.hdfs.protocol.DatanodeInfo

Ivan, I will fix it in the coming days, if you don't do it before me :)

Build with hadoop.profile=3.0
-----------------------------

Key: HBASE-6581
URL: https://issues.apache.org/jira/browse/HBASE-6581
Project: HBase
Issue Type: Bug
Reporter: Eric Charles
Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch

Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to a change in the hadoop maven module naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of the hadoop dependencies into their respective profiles and define the correct hadoop deps in the 3.0 profile. Please tell me if it's ok to go this way. Thx, Eric

[1] $ mvn clean install -Dhadoop.profile=3.0
[INFO] Scanning for projects...
[ERROR] The build could not read 3 projects - [Help 1]
[ERROR]
[ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
[ERROR]
[ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
[ERROR]
[ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
[ERROR]
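A sketch of the direction Eric describes: give the 3.0 profile its own hadoop dependency versions so the 'version is missing' errors go away. The profile id and version strings below are illustrative assumptions, not the committed patch:

```xml
<!-- Hypothetical hbase-server/pom.xml fragment: the 3.0 profile carries its
     own hadoop version and dependencies instead of inheriting unversioned ones. -->
<profile>
  <id>hadoop-3.0</id>
  <activation>
    <property>
      <name>hadoop.profile</name>
      <value>3.0</value>
    </property>
  </activation>
  <properties>
    <hadoop.version>3.0.0-SNAPSHOT</hadoop.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-annotations</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-minicluster</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</profile>
```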
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439361#comment-13439361 ]

Hudson commented on HBASE-6632:
-------------------------------

Integrated in HBase-0.92 #513 (See [https://builds.apache.org/job/HBase-0.92/513/])
HBASE-6632 [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) (Revision 1375899)

Result = FAILURE
stack :
Files :
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
[jira] [Created] (HBASE-6633) Adding new hooks to the split flow - For roll backs and one final hook after split is completed either successfully or failed
ramkrishna.s.vasudevan created HBASE-6633:
-----------------------------------------

Summary: Adding new hooks to the split flow - For roll backs and one final hook after split is completed either successfully or failed
Key: HBASE-6633
URL: https://issues.apache.org/jira/browse/HBASE-6633
Project: HBase
Issue Type: Improvement
Components: coprocessors
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

Currently we have two hooks in the split flow of a region: preSplit() and postSplit(). But these are not always helpful: if I have a problem in preSplit() or postSplit(), I need to do a rollback of the current region or the region that I am handling through the hooks. So it would be better to have a hook in the rollback code and also one final hook, say postCompleteSplit(), so that the CP can take any corrective action. Pls do suggest if I can provide a patch for this.
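The proposed hook set can be pictured as an observer with two extra callbacks. The shape below is a hypothetical sketch based only on the description above (names like preRollBack are assumptions, not the real HBase coprocessor API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape of the HBASE-6633 proposal: in addition to the existing
// preSplit()/postSplit(), a pair of rollback hooks plus a terminal
// postCompleteSplit() that fires whether the split succeeded or was rolled
// back. Illustrative only; records calls so the flow can be inspected.
public class SplitHooksSketch {
    final List<String> calls = new ArrayList<>();

    void preSplit()     { calls.add("preSplit"); }
    void postSplit()    { calls.add("postSplit"); }
    void preRollBack()  { calls.add("preRollBack"); }   // proposed
    void postRollBack() { calls.add("postRollBack"); }  // proposed
    void postCompleteSplit(boolean success) {           // proposed: always last
        calls.add("postCompleteSplit:" + success);
    }

    // Simulated split flow: on failure the rollback hooks run, and the
    // terminal hook runs in both cases so the CP can take corrective action.
    void runSplit(boolean fail) {
        preSplit();
        if (fail) {
            preRollBack();
            postRollBack();
            postCompleteSplit(false);
        } else {
            postSplit();
            postCompleteSplit(true);
        }
    }
}
```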
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439423#comment-13439423 ]

Hudson commented on HBASE-6632:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #142 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/142/])
HBASE-6632 [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) (Revision 1375897)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
[jira] [Commented] (HBASE-6631) TestHMasterRPCException in 0.92 failed twice on socket timeout
[ https://issues.apache.org/jira/browse/HBASE-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439424#comment-13439424 ]

Hudson commented on HBASE-6631:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #142 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/142/])
HBASE-6631 TestHMasterRPCException in 0.92 failed twice on socket timeout (Revision 1375849)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java

TestHMasterRPCException in 0.92 failed twice on socket timeout
--------------------------------------------------------------

Key: HBASE-6631
URL: https://issues.apache.org/jira/browse/HBASE-6631
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Fix For: 0.92.2, 0.94.2
Attachments: 6631-trunk.txt, 6631-trunk.txt, 6631.txt

#502 and #498 0.92 builds have TestHMasterRPCException failing because of socket timeout when servernotrunning is expected. Socket timeout is 100ms only.
[jira] [Updated] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6364:
---------------------------

Attachment: 6364.94.v2.nolargetest.patch

6364.94.v2.nolargetest.patch contains the patch for 0.94. My own test depends on a class that does not exist in 0.94; so I didn't test it on 0.95. Unit tests ok, except testClientPoolRoundRobin(org.apache.hadoop.hbase.client.TestFromClientSide): "The number of versions of '[B@4c9cde9a:[B@4eda77c1 did not match 4 expected:4 but was:3" failed once, second try ok. Committed.

Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
-----------------------------------------------------------------------------------------------------------------------------------------------------

Key: HBASE-6364
URL: https://issues.apache.org/jira/browse/HBASE-6364
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Suraj Varma
Assignee: nkeywal
Labels: client
Fix For: 0.96.0, 0.94.2
Attachments: 6364.94.v2.nolargetest.patch, 6364-host-serving-META.v1.patch, 6364.v11.nolargetest.patch, 6364.v1.patch, 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, 6364.v5.patch, 6364.v5.withtests.patch, 6364.v6.patch, 6364.v6.withtests.patch, 6364.v7.withtests.patch, 6364.v8.withtests.patch, 6364.v9.patch, stacktrace.txt

When a server host with a Region Server holding the .META. table is powered down on a live cluster, while the HBase cluster itself detects and reassigns the .META. table, connected HBase Clients take an excessively long time to detect this and re-discover the reassigned .META.

Workaround: Decrease the ipc.socket.timeout on the HBase Client side to a low value (the default is 20s, leading to a 35 minute recovery time; we were able to get acceptable results with 100ms, getting a 3 minute recovery).

This was found during some hardware failure testing scenarios.

Test Case:
1) Apply load via client app on HBase cluster for several minutes
2) Power down the region server holding the .META. server (i.e. power off ... and keep it off)
3) Measure how long it takes for the cluster to reassign the META table and for client threads to re-lookup and re-orient to the lesser cluster (minus the RS and DN on that host).

Observation:
1) Client threads spike up to maxThreads size ... and take over 35 mins to recover (i.e. for the thread count to go back to normal) - no client calls are serviced - they just back up on a synchronized method (see #2 below)
2) All the client app threads queue up behind the oahh.ipc.HBaseClient#setupIOStreams method http://tinyurl.com/7js53dj

After taking several thread dumps we found that the thread within this synchronized method was blocked on NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf)); The client thread that gets the synchronized lock would try to connect to the dead RS (till the socket times out after 20s), retries, and then the next thread gets in, and so forth in a serial manner.

Workaround:
-----------
Default ipc.socket.timeout is set to 20s. We dropped this to a low number (1000 ms, 100 ms, etc) on the client side hbase-site.xml. With this setting, the client threads recovered in a couple of minutes by failing fast and re-discovering the .META. table on a reassigned RS.

Assumption: This ipc.socket.timeout is only ever used during the initial HConnection setup via NetUtils.connect and should only ever be used when connectivity to a region server is lost and needs to be re-established, i.e. it does not affect normal RPC activity as this is just the connect timeout. During RS GC periods, any _new_ clients trying to connect will fail and will require .META. table re-lookups.

This above timeout workaround is only for the HBase client side.
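The client-side workaround described above amounts to a one-property change in the client's hbase-site.xml; 100 is the millisecond value the reporter found acceptable, so treat it as a starting point to tune rather than a recommendation:

```xml
<!-- Client-side hbase-site.xml: lower the connect timeout so threads fail
     fast against a dead host instead of each serially waiting 20s. -->
<property>
  <name>ipc.socket.timeout</name>
  <value>100</value>
</property>
```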
[jira] [Commented] (HBASE-6627) TestMultiVersions.testGetRowVersions is flaky
[ https://issues.apache.org/jira/browse/HBASE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439453#comment-13439453 ]

nkeywal commented on HBASE-6627:
--------------------------------

hadoop-qa didn't run the patch. And I don't reproduce the issue locally now, with or without the patch. I keep the jira open for now.

TestMultiVersions.testGetRowVersions is flaky
---------------------------------------------

Key: HBASE-6627
URL: https://issues.apache.org/jira/browse/HBASE-6627
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.96.0
Environment: hadoop-qa mainly; seems to happen when tests run in parallel; difficult to reproduce on a single test.
Reporter: nkeywal
Assignee: nkeywal
Attachments: 6627.v1.patch

org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions

Shutting down

Stacktrace:
java.io.IOException: Shutting down
	at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:229)
	at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:92)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:688)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:661)
	at org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions(TestMultiVersions.java:143)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
	at org.junit.runners.ParentRunner$3.run
[jira] [Commented] (HBASE-6593) TestAdmin times out sometimes
[ https://issues.apache.org/jira/browse/HBASE-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439495#comment-13439495 ] Hudson commented on HBASE-6593: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6593 TestAdmin times out sometimes (Revision 1374063) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java TestAdmin times out sometimes - Key: HBASE-6593 URL: https://issues.apache.org/jira/browse/HBASE-6593 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: trunk-6593.patch In TestAdmin#splitTest, individual put is used to prepare the test data. We can group them together so as to avoid possible timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6478) TestClassLoading.testClassLoadingFromLibDirInJar occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439497#comment-13439497 ] Hudson commented on HBASE-6478: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6478. TestClassLoading.testClassLoadingFromLibDirInJar occasionally fails (Revision 1373521) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java TestClassLoading.testClassLoadingFromLibDirInJar occasionally fails --- Key: HBASE-6478 URL: https://issues.apache.org/jira/browse/HBASE-6478 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6478-trunk.patch, HBASE-6478-trunk-v2.patch, HBASE-6478-trunk-v3.patch, HBASE-6478-trunk-v4.patch When hudson ran for HBASE-6459, it encountered a failed testcase in org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar. The link is https://builds.apache.org/job/PreCommit-HBASE-Build/2455/testReport/org.apache.hadoop.hbase.coprocessor/TestClassLoading/testClassLoadingFromLibDirInJar/ I checked the log and found that the function waitTableAvailable only checks the meta table; when the rs opens the region and updates the meta location in meta, the region may not yet be added to the online regions in the rs. for (HRegion region: hbase.getRegionServer(0).getOnlineRegionsLocalContext()) { This loop will skip it, and found1 will remain false. That's why the testcase failed. So maybe we can have some stricter check when a table is created.
[jira] [Commented] (HBASE-6562) Fake KVs are sometimes passed to filters
[ https://issues.apache.org/jira/browse/HBASE-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439496#comment-13439496 ] Hudson commented on HBASE-6562: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6562 Fake KVs are sometimes passed to filters (Revision 1373679) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFakeKeyInFilter.java Fake KVs are sometimes passed to filters Key: HBASE-6562 URL: https://issues.apache.org/jira/browse/HBASE-6562 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: 6562.txt, 6562-v2.txt, 6562-v3.txt In internal tests at Salesforce we found that fake row keys sometimes are passed to filters (Filter.filterRowKey(...) specifically). The KVs are eventually filtered by the StoreScanner/ScanQueryMatcher, but the row key is passed to filterRowKey in RegionScannerImpl *before* that happens.
[jira] [Commented] (HBASE-6263) Use default mode for HBase Thrift gateway if not specified
[ https://issues.apache.org/jira/browse/HBASE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439499#comment-13439499 ] Hudson commented on HBASE-6263: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6263. Use default mode for Thrift gateway if not specified (Revision 1373593) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java Use default mode for HBase Thrift gateway if not specified -- Key: HBASE-6263 URL: https://issues.apache.org/jira/browse/HBASE-6263 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.94.0, 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Labels: noob Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6263-0.94.patch, HBASE-6263.patch The Thrift gateway should start with a default mode if one is not selected. Currently, instead we see: {noformat} Exception in thread "main" java.lang.AssertionError: Exactly one option out of [-hsha, -nonblocking, -threadpool, -threadedselector] has to be specified at org.apache.hadoop.hbase.thrift.ThriftServerRunner$ImplType.setServerImpl(ThriftServerRunner.java:201) at org.apache.hadoop.hbase.thrift.ThriftServer.processOptions(ThriftServer.java:169) at org.apache.hadoop.hbase.thrift.ThriftServer.doMain(ThriftServer.java:85) at org.apache.hadoop.hbase.thrift.ThriftServer.main(ThriftServer.java:192) {noformat} See also BIGTOP-648.
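The defaulting fix can be sketched generically: rather than asserting that exactly one server-mode flag was given, fall back to a default when none is set. A minimal illustration — the enum values echo the flags in the error message, but the `choose` helper and `THREAD_POOL` default are assumptions for this sketch, not the actual ThriftServerRunner patch:

```java
import java.util.Collections;
import java.util.List;

public class DefaultModeDemo {
    enum ImplType { HSHA, NONBLOCKING, THREAD_POOL, THREADED_SELECTOR }

    static final ImplType DEFAULT = ImplType.THREAD_POOL; // assumed default for the sketch

    // Picks the single selected mode, or the default when none was given,
    // instead of throwing an AssertionError on zero flags.
    static ImplType choose(List<ImplType> selected) {
        if (selected.isEmpty()) return DEFAULT;           // the new, forgiving behavior
        if (selected.size() > 1) {
            throw new IllegalArgumentException("Exactly one option may be specified");
        }
        return selected.get(0);
    }

    public static void main(String[] args) {
        System.out.println(choose(Collections.emptyList())); // falls back to the default
    }
}
```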
[jira] [Commented] (HBASE-6621) Reduce calls to Bytes.toInt
[ https://issues.apache.org/jira/browse/HBASE-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439500#comment-13439500 ] Hudson commented on HBASE-6621: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6621 Reduce calls to Bytes.toInt (Revision 1375665) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java Reduce calls to Bytes.toInt --- Key: HBASE-6621 URL: https://issues.apache.org/jira/browse/HBASE-6621 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: 6621-0.96.txt, 6621-0.96-v2.txt, 6621-0.96-v3.txt, 6621-0.96-v4.txt Bytes.toInt shows up quite often in a profiler run. It turns out that one source is HFileReaderV2$ScannerV2.getKeyValue(). Notice that we call the KeyValue(byte[], int) constructor, which forces the constructor to determine its size by reading some of the header information and calculating the size. In this case, however, we already know the size (from the call to readKeyValueLen), so we could just use that. In the extreme case of 1000's of columns this noticeably reduces CPU.
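The observation generalizes beyond KeyValue: when a caller has already computed a length (as `readKeyValueLen` has here), re-decoding that length from the serialized header is pure overhead. A toy sketch with an assumed `[4-byte length][payload]` cell layout — not the real KeyValue wire format:

```java
public class KnownLengthDemo {
    // Same wire logic as Bytes.toInt: 4 big-endian bytes -> int.
    static int toInt(byte[] bytes, int offset) {
        int n = 0;
        for (int i = offset; i < offset + 4; i++) {
            n = (n << 8) | (bytes[i] & 0xFF);
        }
        return n;
    }

    // The "slow" path: decode the length from the header on every access.
    static int lengthFromHeader(byte[] cell, int offset) {
        return toInt(cell, offset);
    }

    public static void main(String[] args) {
        byte[] cell = {0, 0, 0, 3, 'a', 'b', 'c'}; // toy cell: length header + payload
        int knownLen = 3; // e.g. already computed by a readKeyValueLen-style step
        // Re-decoding the header only recovers what the caller already had:
        System.out.println(lengthFromHeader(cell, 0) == knownLen);
    }
}
```

The fix in the issue is exactly this: pass the already-known length along instead of forcing the constructor to call `toInt` again per cell.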
[jira] [Commented] (HBASE-6561) Gets/Puts with many columns send the RegionServer into an endless loop
[ https://issues.apache.org/jira/browse/HBASE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439498#comment-13439498 ] Hudson commented on HBASE-6561: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6561 Gets/Puts with many columns send the RegionServer into an 'endless' loop (Revision 1373951) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java Gets/Puts with many columns send the RegionServer into an endless loop Key: HBASE-6561 URL: https://issues.apache.org/jira/browse/HBASE-6561 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 6561-0.94.txt, 6561-0.96.txt, 6561-0.96-v2.txt, 6561-0.96-v3.txt, 6561-0.96-v4.txt, 6561-0.96-v4.txt, 6561-0.96-v5.txt This came from the mailing list: We were able to replicate this behavior in a pseudo-distributed hbase (hbase-0.94.1) environment. We wrote a test program that creates a test table MyTestTable and populates it with random rows, then it creates a row with 60,000 columns and repeatedly updates it. Each column has a 18 byte qualifier and a 50 byte value. In our tests, when we ran the program, we usually never got beyond 15 updates before it would flush for a really long time. The rows that are being updated are about 4MB each (minus any hbase metadata). It doesn't seem like it's caused by GC. I turned on gc logging, and didn't see any long pauses. This is the gc log during the flush. http://pastebin.com/vJKKXDx5 This is the regionserver log with debug on during the same flush http://pastebin.com/Fh5213mg This is the test program we wrote.
http://pastebin.com/aZ0k5tx2 You should be able to just compile it, and run it against a running HBase cluster. $ java TestTable Carlos
[jira] [Commented] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately
[ https://issues.apache.org/jira/browse/HBASE-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439494#comment-13439494 ] Hudson commented on HBASE-6378: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6378 the javadoc of setEnabledTable maybe not describe accurately (Revision 1375202) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java the javadoc of setEnabledTable maybe not describe accurately -- Key: HBASE-6378 URL: https://issues.apache.org/jira/browse/HBASE-6378 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: zhou wenjian Assignee: David S. Wang Fix For: 0.94.2 Attachments: 6378.patch, HBASE-6378.patch, HBASE-6378-trunk.patch /** * Sets the ENABLED state in the cache and deletes the zookeeper node. Fails * silently if the node is not in enabled in zookeeper * * @param tableName * @throws KeeperException */ public void setEnabledTable(final String tableName) throws KeeperException { setTableState(tableName, TableState.ENABLED); } When setEnabledTable occurs, it updates the cache and the zookeeper node rather than deleting the zk node.
[jira] [Commented] (HBASE-6444) Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-6444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439502#comment-13439502 ] Hudson commented on HBASE-6444: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6444 Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable (Revision 1370645) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/client/Client.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable -- Key: HBASE-6444 URL: https://issues.apache.org/jira/browse/HBASE-6444 Project: HBase Issue Type: Improvement Components: rest Reporter: Erich Hochmuth Assignee: Jimmy Xiang Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6444-0.94.patch, HBASE-6444.patch, trunk-6444.patch, trunk-6444_v2.patch Original Estimate: 48h Remaining Estimate: 48h My corporate security office (ISO) requires that all http traffic get routed through a Web Access Management layer (http://en.wikipedia.org/wiki/Web_access_management) Our Hadoop cluster has been segmented by a virtual network with all access to HBase from outside clients being managed through HBase Stargate rest server. The corporate WAM system requires that all http clients authenticate with it first before making any http request to any http service in the corporate network. After the http client authenticates with the WAM system the WAM system returns the client a set of values that must be inserted into a http cookie and request header of all future http requests to other http clients. This would mean that all requests through the RemoteHTable interface would require that this cookie and request header be set as part of the http request. 
org.apache.hadoop.hbase.rest.client.Client looks like the appropriate place that this functionality would need to be plugged into. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6160) META entries from daughters can be deleted before parent entries
[ https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439501#comment-13439501 ] Hudson commented on HBASE-6160: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6608 Fix for HBASE-6160, META entries from daughters can be deleted before parent entries, shouldn't compare HRegionInfo's (Enis) (Revision 1375158) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java META entries from daughters can be deleted before parent entries Key: HBASE-6160 URL: https://issues.apache.org/jira/browse/HBASE-6160 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.92.2, 0.94.0, 0.96.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.92.2, 0.94.1 Attachments: HBASE-6160_v1.patch, HBASE-6160v2092.txt, HBASE-6160_v2.patch, HBASE-6160_v2.patch HBASE-5986 fixed an issue where the client sees the META entry for the parent, but not the children. However, after the fix, we have seen the following issue in tests: Region A is split to - B, C Region B is split to - D, E After some time, the META entry for B is deleted since it is not needed anymore, but the META entry for Region A stays in META (C still refers to it). In this case, the client throws RegionOfflineException for B.
[jira] [Commented] (HBASE-6522) Expose locks and leases to Coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439503#comment-13439503 ] Hudson commented on HBASE-6522: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6522 Expose locks and leases to Coprocessors (Revision 1371026) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java Expose locks and leases to Coprocessors --- Key: HBASE-6522 URL: https://issues.apache.org/jira/browse/HBASE-6522 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 6522.txt, 6522-v2.txt Currently it is not possible for a CP to implement any checkAndMutate-type operation, because coprocessors have no way to create a lock: getLock is private in HRegion (interestingly, ReleaseLock is public). In addition, it would be nice if a Coprocessor could hook into the RegionServers' lease management. Here I propose two trivial changes: # Make HRegion.getLock public # Add {code}Leases getLeases(){code} to RegionServerServices (and hence to HRegionServer)
[jira] [Commented] (HBASE-6503) HBase Shell Documentation For DROP Is Outdated
[ https://issues.apache.org/jira/browse/HBASE-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439504#comment-13439504 ] Hudson commented on HBASE-6503: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6503 HBase Shell Documentation For DROP Is Outdated (Revision 1375206) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/ruby/shell/commands/drop.rb HBase Shell Documentation For DROP Is Outdated -- Key: HBASE-6503 URL: https://issues.apache.org/jira/browse/HBASE-6503 Project: HBase Issue Type: Bug Reporter: Paul Cavallaro Assignee: Paul Cavallaro Priority: Trivial Fix For: 0.92.2, 0.94.2 Attachments: HBASE-6503-example.patch, HBASE-6503.patch HBase Shell help documentation for the drop command says: If table has more than one region, run a major compaction on .META. According to JD this is old news: jdcryans: back in the days when hadoop didn't support durability it was possible to lose .META. data so we were force flushing .META. and major compacting it all the time also we used to have consistency issues that major compacting was solving ahhh the good old days -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6520) MSLab May cause the Bytes.toLong not work correctly for increment
[ https://issues.apache.org/jira/browse/HBASE-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439505#comment-13439505 ] Hudson commented on HBASE-6520: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6520 MSLab May cause the Bytes.toLong not work correctly for increment (ShiXing) (Revision 1371045) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java MSLab May cause the Bytes.toLong not work correctly for increment - Key: HBASE-6520 URL: https://issues.apache.org/jira/browse/HBASE-6520 Project: HBase Issue Type: Bug Reporter: ShiXing Assignee: ShiXing Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6520-0.94-v1.patch, HBASE-6520-trunk-v1.patch When using MemStoreLAB, the KeyValues will share the byte array allocated by the MemStoreLAB; all the KeyValues' bytes attributes are the same byte array. When using functions such as Bytes.toLong(byte[] bytes, int offset): {code} public static long toLong(byte[] bytes, int offset) { return toLong(bytes, offset, SIZEOF_LONG); } public static long toLong(byte[] bytes, int offset, final int length) { if (length != SIZEOF_LONG || offset + length > bytes.length) { throw explainWrongLengthOrOffset(bytes, offset, length, SIZEOF_LONG); } long l = 0; for (int i = offset; i < offset + length; i++) { l <<= 8; l ^= bytes[i] & 0xFF; } return l; } {code} If we do not put a long value to the KeyValue, and read it as a long value in HRegion.increment(), the check {code} offset + length > bytes.length {code} will have no effect, because bytes.length is not equal to keyLength + valueLength; indeed it is the MemStoreLAB chunkSize, which defaults to 2048 * 1024. I will paste the patch later.
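The failure mode is reproducible without HBase: once a short value lives inside a large shared chunk, the `offset + length > bytes.length` guard can no longer catch it, and eight bytes are read regardless. A minimal sketch — the `toLong` mirrors the snippet quoted in the report, while the chunk size and layout are illustrative, not the real MemStoreLAB:

```java
public class MslabToLongDemo {
    static final int SIZEOF_LONG = 8;

    // Same logic as the Bytes.toLong snippet quoted in the report.
    static long toLong(byte[] bytes, int offset, int length) {
        if (length != SIZEOF_LONG || offset + length > bytes.length) {
            throw new IllegalArgumentException("wrong length or offset");
        }
        long l = 0;
        for (int i = offset; i < offset + length; i++) {
            l <<= 8;
            l ^= bytes[i] & 0xFF;
        }
        return l;
    }

    public static void main(String[] args) {
        // A 4-byte value copied into a large shared chunk, as MSLAB would do.
        byte[] chunk = new byte[2048 * 1024];   // illustrative chunk size
        byte[] intValue = {0, 0, 0, 42};        // only 4 meaningful bytes
        System.arraycopy(intValue, 0, chunk, 0, 4);

        // With a dedicated array the bounds check catches the short value...
        boolean threw = false;
        try {
            toLong(intValue, 0, SIZEOF_LONG);
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        // ...but inside the chunk it silently reads 4 bytes past the value.
        long misread = toLong(chunk, 0, SIZEOF_LONG);
        System.out.println(threw + " " + misread);
    }
}
```

This is why the patch has to validate against the KeyValue's own value length rather than the backing array's length.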
[jira] [Commented] (HBASE-6525) bin/replication/copy_tables_desc.rb references non-existent class
[ https://issues.apache.org/jira/browse/HBASE-6525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439507#comment-13439507 ] Hudson commented on HBASE-6525: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6525 bin/replication/copy_tables_desc.rb references non-existent class (David S. Wang) (Revision 1371515) Result = FAILURE jmhsieh : Files : * /hbase/branches/0.94/bin/replication/copy_tables_desc.rb bin/replication/copy_tables_desc.rb references non-existent class - Key: HBASE-6525 URL: https://issues.apache.org/jira/browse/HBASE-6525 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Labels: noob Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6525.patch $ hbase org.jruby.Main copy_tables_desc.rb NameError: cannot load Java class org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper get_proxy_or_package_under_package at org/jruby/javasupport/JavaUtilities.java:54 method_missing at file:/mnt/data/hbase/lib/jruby-complete-1.6.5.jar!/builtin/javasupport/java.rb:51 (root) at copy_tables_desc.rb:35 Removing the line that references the non-existent class seems to make the script work without any visible side-effects. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5728) Methods Missing in HTableInterface
[ https://issues.apache.org/jira/browse/HBASE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439506#comment-13439506 ] Hudson commented on HBASE-5728: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-5728 Methods Missing in HTableInterface (Revision 1373480) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java Methods Missing in HTableInterface -- Key: HBASE-5728 URL: https://issues.apache.org/jira/browse/HBASE-5728 Project: HBase Issue Type: Improvement Components: client Reporter: Bing Li Assignee: Jimmy Xiang Fix For: 0.96.0, 0.94.2 Attachments: trunk-5728.patch, trunk-5728_v2.patch, trunk-5728_v3.patch, trunk-5728_v4.patch Dear all, I found that some methods that exist in HTable are not in HTableInterface. setAutoFlush setWriteBufferSize ... In most cases, I manipulate HBase through HTableInterface from HTablePool. If I need to use the above methods, how do I do that? I am considering writing my own table pool if there is no proper way. Is that fine? Thanks so much! Best regards, Bing
[jira] [Commented] (HBASE-6602) Region Server Dynamic Metrics can cause high cpu usage.
[ https://issues.apache.org/jira/browse/HBASE-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439508#comment-13439508 ] Hudson commented on HBASE-6602: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6602 Region Server Dynamic Metrics can cause high cpu usage. (Revision 1374398) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerDynamicMetrics.java Region Server Dynamic Metrics can cause high cpu usage. --- Key: HBASE-6602 URL: https://issues.apache.org/jira/browse/HBASE-6602 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 0.94.2 Attachments: HBASE-6602-0.94-0.patch, HBASE-6602-trunk-0.patch When regions are getting added and removed lots of cpu time can be used by jmx. This is caused by sending jmx messages for every new metric that is added or removed. 
Seeing jstacks like this: RMI TCP Connection(3)-10.4.19.33 daemon prio=10 tid=0x7f9d64b1d000 nid=0x353 runnable [0x7f9d598d6000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.put(HashMap.java:374) at org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase.createMBeanInfo(MetricsDynamicMBeanBase.java:103) at org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase.updateMbeanInfoIfMetricsListChanged(MetricsDynamicMBeanBase.java:75) at org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase.getAttribute(MetricsDynamicMBeanBase.java:133) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666) at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404) -- at java.util.TimerThread.run(Timer.java:462) Timer thread for monitoring hbase daemon prio=10 tid=0x7f9d648fe000 nid=0x2b5 runnable [0x7f9d624c7000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.put(HashMap.java:374) at org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase.createMBeanInfo(MetricsDynamicMBeanBase.java:103) at org.apache.hadoop.metrics.util.MetricsDynamicMBeanBase.updateMbeanInfoIfMetricsListChanged(MetricsDynamicMBeanBase.java:75) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.regionserver.metrics.RegionServerDynamicMetrics.setNumericMetric(RegionServerDynamicMetrics.java:105) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6523) HConnectionImplementation still does not recover from all ZK issues.
[ https://issues.apache.org/jira/browse/HBASE-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439509#comment-13439509 ] Hudson commented on HBASE-6523: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6523 Revert (Revision 1370633) HBASE-6523 HConnectionImplementation still does not recover from all ZK issues. (Revision 1370493) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java HConnectionImplementation still does not recover from all ZK issues. Key: HBASE-6523 URL: https://issues.apache.org/jira/browse/HBASE-6523 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 6523.txt During some testing here at Salesforce.com we found another scenario where an HConnectionImplementation would never recover from a lost ZK connection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6565) Coprocessor exec result Map is not thread safe
[ https://issues.apache.org/jira/browse/HBASE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439511#comment-13439511 ] Hudson commented on HBASE-6565: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6565. Coprocessor exec result Map is not thread safe (Yuan Kang) (Revision 1373975) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTable.java Coprocessor exec result Map is not thread safe -- Key: HBASE-6565 URL: https://issues.apache.org/jira/browse/HBASE-6565 Project: HBase Issue Type: Bug Components: client, coprocessors Affects Versions: 0.92.2, 0.94.0, 0.96.0 Environment: hadoop1.0.2,hbase0.94,jdk1.6 Reporter: Yuan Kang Assignee: Yuan Kang Labels: coprocessors, patch Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: Coprocessor-result-thread unsafe-bug-fix.patch Original Estimate: 168h Remaining Estimate: 168h I developed a coprocessor program, but found different results in repeated tests. For example, normally the result's size is 10, but sometimes it comes back as 9. I read the HTable.java code and found a TreeMap (thread-unsafe) being used in a multithreaded environment. That is what causes the bug.
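The usual remedy for this pattern is a sorted, thread-safe map such as `ConcurrentSkipListMap`, which keeps `TreeMap`'s ordering guarantees under concurrent writers. A hedged sketch of the scenario — class and method names are invented for illustration, and the actual HBASE-6565 patch may differ in detail:

```java
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SafeResultMapDemo {
    // Many worker threads write results into one shared map, as in
    // HTable's coprocessor exec path; a plain TreeMap can lose entries here.
    static int fillConcurrently(ConcurrentMap<Integer, String> results,
                                int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    results.put(base + i, "result-" + (base + i)); // concurrent writers
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results.size();
    }

    public static void main(String[] args) {
        // A sorted, thread-safe drop-in for TreeMap in this scenario.
        int size = fillConcurrently(new ConcurrentSkipListMap<>(), 8, 1000);
        System.out.println(size); // all 8000 entries survive
    }
}
```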
[jira] [Commented] (HBASE-6603) RegionMetricsStorage.incrNumericMetric is called too often
[ https://issues.apache.org/jira/browse/HBASE-6603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439510#comment-13439510 ] Hudson commented on HBASE-6603: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6603 RegionMetricsStorage.incrNumericMetric is called too often (M. Chen and Lars H) (Revision 1375318) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java RegionMetricsStorage.incrNumericMetric is called too often -- Key: HBASE-6603 URL: https://issues.apache.org/jira/browse/HBASE-6603 Project: HBase Issue Type: Bug Components: performance Reporter: Lars Hofhansl Assignee: M. Chen Fix For: 0.96.0, 0.94.2 Attachments: 6503-0.96.txt, 6603-0.94.txt Running an HBase scan load through the profiler revealed that RegionMetricsStorage.incrNumericMetric is called way too often. It turns out that we make this call for *each* KV in StoreScanner.next(...). Incrementing AtomicLong requires expensive memory barriers. The observation here is that StoreScanner.next(...) can maintain a simple long in its internal loop and only update the metric upon exit. Thus the AtomicLong is not updated nearly as often. That cuts about 10% runtime from scan only load (I'll quantify this better soon). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
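The optimization described above, keeping a plain local long inside the scan loop and folding it into the shared AtomicLong only on exit, can be sketched generically (this is an illustration of the pattern, not StoreScanner's actual code):

```java
import java.util.concurrent.atomic.AtomicLong;

public class BatchedMetric {
    private final AtomicLong metric = new AtomicLong();

    // Per-item increment: one memory barrier for every KV.
    public void scanNaive(int kvCount) {
        for (int i = 0; i < kvCount; i++) {
            metric.incrementAndGet();
        }
    }

    // Batched: accumulate in a local variable, publish once on exit.
    public void scanBatched(int kvCount) {
        long local = 0;
        for (int i = 0; i < kvCount; i++) {
            local++;
        }
        metric.addAndGet(local);
    }

    public long value() {
        return metric.get();
    }
}
```

Both variants report the same total; the batched one just touches the contended AtomicLong once per next() call instead of once per KV.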
[jira] [Commented] (HBASE-4054) Usability improvement to HTablePool
[ https://issues.apache.org/jira/browse/HBASE-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439512#comment-13439512 ] Hudson commented on HBASE-4054: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6471 Performance regression caused by HBASE-4054 (Revision 1373916) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestHTablePool.java Usability improvement to HTablePool --- Key: HBASE-4054 URL: https://issues.apache.org/jira/browse/HBASE-4054 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.90.3 Reporter: Daniel Iancu Assignee: Daniel Iancu Priority: Minor Fix For: 0.92.0 Attachments: 4054.addendum, HBASE-4054_Usability_improvement_to_HTablePool.patch, HBASE-4054_Usability_improvement_to_HTablePool-trunk.patch, TestHTablePool-trunk.patch To improve the usability of the HTablePool the implementation should not rely on the user returning the connection to the pool but rather do that transparently when user closes the HTableImplementation it got. To do that a HTableImplementation proxy implementation should be returned that wraps a HTable object and holds a reference to the pool. When the client close the proxy it will actually automatically return the wrapped HTable back in pool to be reused. In this case the method HTablePool.putTable don't need to be public -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
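The usability improvement is the classic pool-proxy pattern: close() on the handle returns the resource to the pool instead of requiring an explicit putTable() call. A self-contained sketch of the idea (TinyPool and Resource are stand-ins, not HTablePool's real types):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyPool {
    // Stand-in for HTable/HTableInterface.
    public static class Resource { }

    // Proxy handed to callers; closing it returns the underlying
    // resource to the pool rather than destroying it.
    public class PooledResource extends Resource {
        private final Resource wrapped;
        PooledResource(Resource wrapped) { this.wrapped = wrapped; }
        public void close() { idle.push(wrapped); }
    }

    private final Deque<Resource> idle = new ArrayDeque<>();

    public PooledResource get() {
        Resource r = idle.isEmpty() ? new Resource() : idle.pop();
        return new PooledResource(r);
    }

    public int idleCount() { return idle.size(); }
}
```

With this shape, the pool's "return" method never needs to be public, which is exactly the point the issue makes about HTablePool.putTable.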
[jira] [Commented] (HBASE-5022) Optimize HBaseConfiguration#create
[ https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439514#comment-13439514 ] Hudson commented on HBASE-5022: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6596 Revert HBASE-5022; it undoes HBC.create; ADDENDUM 2 (Revision 1374098) HBASE-6596 Revert HBASE-5022; it undoes HBC.create -- ADDENDUM (Revision 1374013) HBASE-6596 Revert HBASE-5022; it undoes HBC.create (Revision 1373935) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java Optimize HBaseConfiguration#create -- Key: HBASE-5022 URL: https://issues.apache.org/jira/browse/HBASE-5022 Project: HBase Issue Type: Improvement Components: performance Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5022.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row
[ https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439513#comment-13439513 ] Hudson commented on HBASE-6577: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6577 RegionScannerImpl.nextRow() should seek to next row (Revision 1373533) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java RegionScannerImpl.nextRow() should seek to next row --- Key: HBASE-6577 URL: https://issues.apache.org/jira/browse/HBASE-6577 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt RegionScannerImpl.nextRow() is called when a filter filters the entire row. In that case we should seek to the next row rather than iterating over all versions of all columns to get there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6308) Coprocessors should be loaded in a custom ClassLoader to prevent dependency conflicts with HBase
[ https://issues.apache.org/jira/browse/HBASE-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439522#comment-13439522 ] Hudson commented on HBASE-6308: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6308. Coprocessors should be loaded in a custom ClassLoader (James Baldassari) (Revision 1372561) Result = FAILURE apurtell : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java Coprocessors should be loaded in a custom ClassLoader to prevent dependency conflicts with HBase Key: HBASE-6308 URL: https://issues.apache.org/jira/browse/HBASE-6308 Project: HBase Issue Type: Improvement Components: coprocessors Affects Versions: 0.92.1, 0.94.0 Reporter: James Baldassari Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: 6308-v2.txt, HBASE-6308-0.92.patch, HBASE-6308-0.92-with-test.patch, HBASE-6308-0.94-with-test.patch, HBASE-6308-trunk.patch, HBASE-6308-trunk-with-test.patch Currently each coprocessor is loaded with a URLClassLoader that puts the coprocessor's jar at the beginning of the classpath. The URLClassLoader always tries to load classes from the parent ClassLoader first and only attempts to load from its own configured URLs if the class was not found by the parent. This class loading behavior can be problematic for coprocessors that have common dependencies with HBase but whose versions are incompatible. For example, I have a coprocessor that depends on a different version of Avro than the version used by HBase. 
The current class loading behavior results in NoSuchMethodErrors in my coprocessor because some Avro classes have already been loaded by HBase, and the ClassLoader for my coprocessor picks up HBase's loaded classes first. My proposed solution to this problem is to use a custom ClassLoader when instantiating coprocessor instances. This custom ClassLoader would always attempt to load classes from the coprocessor's jar first and would only delegate to the parent ClassLoader if the class were not found in the coprocessor jar. However, certain classes would need to be exempt from this behavior. As an example, if the Coprocessor interface were loaded by both the region server's ClassLoader and the coprocessor's custom ClassLoader, then the region server would get a ClassCastException when attempting to cast the coprocessor instance to the Coprocessor interface. This problem can be avoided by defining a set of class name prefixes that would be exempt from loading by the custom ClassLoader. When loading a class, if the class starts with any of these prefixes (e.g. org.apache.hadoop), then the ClassLoader would delegate immediately to the parent ClassLoader. I've already implemented a patch to provide this functionality, which I'll attach shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
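The child-first delegation with exempt prefixes that the reporter proposes can be sketched as a small URLClassLoader subclass (the prefix list here is illustrative, not the one shipped in the patch):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ChildFirstClassLoader extends URLClassLoader {
    // Classes with these prefixes are always delegated to the parent,
    // so shared interfaces like Coprocessor are loaded exactly once
    // and casts across the boundary keep working.
    private static final String[] EXEMPT = { "java.", "org.apache.hadoop." };

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        for (String prefix : EXEMPT) {
            if (name.startsWith(prefix)) {
                return super.loadClass(name, resolve); // parent-first
            }
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    c = findClass(name); // coprocessor jar first
                } catch (ClassNotFoundException e) {
                    c = super.loadClass(name, resolve); // fall back to parent
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

This inverts the default parent-first delegation of URLClassLoader, which is precisely what lets the coprocessor's bundled Avro win over the version HBase already loaded.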
[jira] [Commented] (HBASE-6576) HBaseAdmin.createTable should wait until the table is enabled
[ https://issues.apache.org/jira/browse/HBASE-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439515#comment-13439515 ] Hudson commented on HBASE-6576: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6576 HBaseAdmin.createTable should wait until the table is enabled (Gregory) (Revision 1373832) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java HBaseAdmin.createTable should wait until the table is enabled - Key: HBASE-6576 URL: https://issues.apache.org/jira/browse/HBASE-6576 Project: HBase Issue Type: Bug Components: client, test Reporter: Gregory Chanan Assignee: Gregory Chanan Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: HBASE-6576-92.patch, HBASE-6576-94.patch, HBASE-6576-trunk.patch The function: {code} public void createTable(final HTableDescriptor desc, byte [][] splitKeys) {code} in HBaseAdmin is synchronous and returns once all the regions of the table are online, but does not wait for the table to be enabled, which is the last step of table creation (see CreateTableHandler). This is confusing and leads to racy code because users do not realize that this is the case. 
For example, I saw the following test failure in 0.92 when I ran: mvn test -Dtest=org.apache.hadoop.hbase.client.TestAdmin#testEnableDisableAddColumnDeleteColumn {code} Error Message org.apache.hadoop.hbase.TableNotEnabledException: testMasterAdmin at org.apache.hadoop.hbase.master.handler.DisableTableHandler.init(DisableTableHandler.java:75) at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336) Stacktrace org.apache.hadoop.hbase.TableNotEnabledException: org.apache.hadoop.hbase.TableNotEnabledException: testMasterAdmin at org.apache.hadoop.hbase.master.handler.DisableTableHandler.init(DisableTableHandler.java:75) at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336) {code} The issue is that code will create a table and immediately disable it in order to do some testing, for example, to test an operation that only works when the table is disabled. If the table has not been enabled yet, the caller will get back a TableNotEnabledException. 
The specific test above was fixed in HBASE-5206, but other examples exist in the code, for example the following: {code} hbase org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat newtable asdf14 {code} The code in question is: {code} byte[] tname = args[1].getBytes(); HTable table = util.createTable(tname, FAMILIES); HBaseAdmin admin = new HBaseAdmin(conf); admin.disableTable(tname); {code} It would be better if createTable just waited until the table was enabled, or threw a TableNotEnabledException if it exhausted the configured number of retries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
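The suggested behavior, blocking until the table is enabled and giving up after a configured number of retries, is a bounded polling loop. A generic sketch, where the BooleanSupplier stands in for a check such as HBaseAdmin.isTableEnabled (the helper name and signature are illustrative):

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    // Poll `condition` up to maxRetries times, pausing between attempts.
    // In the actual patch the condition would be the table-enabled check;
    // returning false corresponds to exhausting the configured retries.
    public static boolean await(BooleanSupplier condition, int maxRetries, long pauseMs) {
        for (int i = 0; i < maxRetries; i++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

A createTable built on this would throw TableNotEnabledException when await() returns false, instead of returning a table the caller cannot yet disable.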
[jira] [Commented] (HBASE-6505) Allow shared RegionObserver state
[ https://issues.apache.org/jira/browse/HBASE-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439520#comment-13439520 ] Hudson commented on HBASE-6505: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6505 Allow shared RegionObserver state (Revision 1369515) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java Allow shared RegionObserver state - Key: HBASE-6505 URL: https://issues.apache.org/jira/browse/HBASE-6505 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 6505-0.94.txt, 6505-trunk.txt, 6505.txt, 6505-v2.txt, 6505-v3.txt, 6505-v4.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6373) Add more context information to audit log messages
[ https://issues.apache.org/jira/browse/HBASE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439518#comment-13439518 ] Hudson commented on HBASE-6373: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6373 Add more context information to audit log messages (Revision 1370005) Result = FAILURE stack : Files : * /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java Add more context information to audit log messages -- Key: HBASE-6373 URL: https://issues.apache.org/jira/browse/HBASE-6373 Project: HBase Issue Type: Improvement Components: security Affects Versions: 0.96.0, 0.94.2 Reporter: Marcelo Vanzin Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: accesscontroller094.patch, accesscontroller.patch, accesscontroller.patch The attached patch adds more information to the audit log messages; namely, it includes the IP address where the request originated, if it's available. The patch is against trunk, but I've tested it against the 0.92 branch. I didn't find any unit test for this code, please let me know if I missed something. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6471) Performance regression caused by HBASE-4054
[ https://issues.apache.org/jira/browse/HBASE-6471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439516#comment-13439516 ] Hudson commented on HBASE-6471: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6471 Performance regression caused by HBASE-4054 (Revision 1373916) Result = FAILURE jxiang : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestHTablePool.java Performance regression caused by HBASE-4054 --- Key: HBASE-6471 URL: https://issues.apache.org/jira/browse/HBASE-6471 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0 Reporter: Lars George Assignee: Jimmy Xiang Priority: Critical Fix For: 0.96.0, 0.94.2 Attachments: trunk-6471.patch, trunk-6471.patch, trunk-6471_v2.patch The patch in HBASE-4054 switches the PooledHTable to extend HTable as opposed to implementing HTableInterface. Since HTable does not have an empty constructor, the patch added a call to the super() constructor, which triggers the ZooKeeper and META scan, causing a considerable delay. With multiple threads using the pool in parallel, the first thread holds up all the subsequent ones, in effect negating the whole reason we have an HTable pool. We should complete HBASE-5728, or alternatively add a protected, empty constructor to HTable. I am +1 for the former. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6579) Unnecessary KV order check in StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439519#comment-13439519 ] Hudson commented on HBASE-6579: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6579 Unnecessary KV order check in StoreScanner (Revision 1373540) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java Unnecessary KV order check in StoreScanner -- Key: HBASE-6579 URL: https://issues.apache.org/jira/browse/HBASE-6579 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: 6579.txt In StoreScanner.next(List<KeyValue>, int, String) I find this code: {code} // Check that the heap gives us KVs in an increasing order. if (prevKV != null && comparator != null && comparator.compare(prevKV, kv) > 0) { throw new IOException("Key " + prevKV + " followed by a " + "smaller key " + kv + " in cf " + store); } prevKV = kv; {code} So this checks for bugs in the HFiles or the scanner code, and to do so it must compare each KV with its predecessor. This seems unnecessary now; I propose that we remove it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6608) Fix for HBASE-6160, META entries from daughters can be deleted before parent entries, shouldn't compare HRegionInfo's
[ https://issues.apache.org/jira/browse/HBASE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439517#comment-13439517 ] Hudson commented on HBASE-6608: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6608 Fix for HBASE-6160, META entries from daughters can be deleted before parent entries, shouldn't compare HRegionInfo's (Enis) (Revision 1375158) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java Fix for HBASE-6160, META entries from daughters can be deleted before parent entries, shouldn't compare HRegionInfo's - Key: HBASE-6608 URL: https://issues.apache.org/jira/browse/HBASE-6608 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.92.1, 0.96.0, 0.94.2 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: 6608-v2.patch, hbase-6608_v1-0.92+0.94.patch, hbase-6608_v1.patch Our nightlies discovered that the patch for HBASE-6160 did not actually fix the issue of META entries for daughters being deleted before parent entries. Instead of reopening HBASE-6160, it is cleaner to track it here. The original issue is: {quote} HBASE-5986 fixed an issue where the client sees the META entry for the parent, but not the children. However, after the fix, we have seen the following issue in tests: Region A is split to - B, C Region B is split to - D, E After some time, META entry for B is deleted since it is not needed anymore, but META entry for Region A stays in META (C still refers it). In this case, the client throws RegionOfflineException for B. {quote} The problem with the fix seems to be that we keep and compare HRegionInfo's in the HashSet at CatalogJanitor.java#scan(), but the HRIs that are compared are not equal. 
{code} HashSet<HRegionInfo> parentNotCleaned = new HashSet<HRegionInfo>(); // regions whose parents are still around for (Map.Entry<HRegionInfo, Result> e : splitParents.entrySet()) { if (!parentNotCleaned.contains(e.getKey()) && cleanParent(e.getKey(), e.getValue())) { cleaned++; } else { ... {code} In the above case, Meta row for region A will contain a serialized version of B that is not offline. However Meta row for region B will contain a serialized version of B that is offline (MetaEditor.offlineParentInMeta() does that). So the deserialized version we put to HashSet and the deserialized version we query contains() from HashSet are different in the offline field, thus HRI.equals() fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6596) Revert HBASE-5022; it undoes HBC.create
[ https://issues.apache.org/jira/browse/HBASE-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439521#comment-13439521 ] Hudson commented on HBASE-6596: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6596 Revert HBASE-5022; it undoes HBC.create; ADDENDUM 2 (Revision 1374098) HBASE-6596 Revert HBASE-5022; it undoes HBC.create -- ADDENDUM (Revision 1374013) HBASE-6596 Revert HBASE-5022; it undoes HBC.create (Revision 1373935) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java Revert HBASE-5022; it undoes HBC.create --- Key: HBASE-6596 URL: https://issues.apache.org/jira/browse/HBASE-6596 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.2 Attachments: 6596_094.txt, 6596_trunk.txt, addendum2.txt, addendum.txt See this thread: http://search-hadoop.com/m/WAXXV1OQ1QY/What+is+HBaseConfiguration.create%2528Configuration%2529+good+for%253Fsubj=What+is+HBaseConfiguration+create+Configuration+good+for+ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6632) [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout)
[ https://issues.apache.org/jira/browse/HBASE-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439523#comment-13439523 ] Hudson commented on HBASE-6632: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6632 [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) (Revision 1375898) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java [0.92 UNIT TESTS] testCreateTableRPCTimeOut sets rpc timeout to 1500ms and leaves it (testHundredsOfTable fails w/ 1500ms timeout) -- Key: HBASE-6632 URL: https://issues.apache.org/jira/browse/HBASE-6632 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.92.2, 0.94.2 Attachments: 6632-trunk.txt, 6632.txt I see that in 0.92 #502 and #501 that TestAdmin.testHundredsOfTable fails because socket times out after 1500ms. I see in TestAdmin that before this test runs, testCreateTableRPCTimeOut sets the socket timeout to 1500 and then does not set it back. Maybe the obnoxious testHundredsOfTable will pass more often if it has the default rpc timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5714) Add write permissions check before any hbck run that modifies hdfs.
[ https://issues.apache.org/jira/browse/HBASE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439529#comment-13439529 ] Hudson commented on HBASE-5714: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-5714 Add write permissions check before any hbck run that modifies hdfs (Liang Xie) (Revision 1375228) Result = FAILURE jmhsieh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java Add write permissions check before any hbck run that modifies hdfs. --- Key: HBASE-5714 URL: https://issues.apache.org/jira/browse/HBASE-5714 Project: HBase Issue Type: Improvement Components: hbck Affects Versions: 0.90.6, 0.92.2, 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: liang xie Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2 Attachments: HBASE-5628.patch, HBASE-5628.patch.v2, hbase-5714-90.patch, hbase-5714-92.patch, hbase-5714-94.patch We encountered a situation where hbck was run by an under-privileged user that was unable to write/modify/merge regions due to hdfs perms. Unfortunately, this user was alerted to this only after several minutes of read-only operations. hbck should fail early by having a write perm check and providing actionable advice to the hbase admin. Maybe something like: Current user yy does not have write perms to hbase home. Please run hbck as hdfs user xxx -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
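The fail-fast behavior the issue asks for is a cheap pre-flight check on the root directory before any repair work starts. A sketch of the shape of such a check, using local-filesystem calls as stand-ins for the HDFS permission checks the real patch adds to FSUtils (the class and method names are hypothetical):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class PreflightCheck {
    // Returns an actionable error message if the current user cannot
    // write to the root directory, or empty if the check passes.
    // hbck would print the message and exit before attempting repairs.
    public static Optional<String> checkWritable(Path root) {
        if (!Files.isDirectory(root)) {
            return Optional.of(root + " is not a directory");
        }
        if (!Files.isWritable(root)) {
            return Optional.of("Current user does not have write permission to " + root
                + "; re-run hbck as a user with write access.");
        }
        return Optional.empty();
    }
}
```

The key design point is doing this before the read-only scan phase, so an under-privileged user learns about the problem in seconds rather than minutes.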
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439525#comment-13439525 ] Hudson commented on HBASE-6364: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6364 Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table (Revision 1376013) Result = FAILURE nkeywal : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/EnvironmentEdgeManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/ManualEnvironmentEdge.java Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table - Key: HBASE-6364 URL: https://issues.apache.org/jira/browse/HBASE-6364 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Suraj Varma Assignee: nkeywal Labels: client Fix For: 0.96.0, 0.94.2 Attachments: 6364.94.v2.nolargetest.patch, 6364-host-serving-META.v1.patch, 6364.v11.nolargetest.patch, 6364.v1.patch, 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, 6364.v5.patch, 6364.v5.withtests.patch, 6364.v6.patch, 6364.v6.withtests.patch, 6364.v7.withtests.patch, 6364.v8.withtests.patch, 6364.v9.patch, stacktrace.txt When a server host with a Region Server holding the .META. table is powered down on a live cluster, while the HBase cluster itself detects and reassigns the .META. table, connected HBase Client's take an excessively long time to detect this and re-discover the reassigned .META. 
Workaround: Decrease the ipc.socket.timeout on HBase Client side to a low value (default is 20s leading to 35 minute recovery time; we were able to get acceptable results with 100ms getting a 3 minute recovery) This was found during some hardware failure testing scenarios. Test Case: 1) Apply load via client app on HBase cluster for several minutes 2) Power down the region server holding the .META. server (i.e. power off ... and keep it off) 3) Measure how long it takes for cluster to reassign META table and for client threads to re-lookup and re-orient to the lesser cluster (minus the RS and DN on that host). Observation: 1) Client threads spike up to maxThreads size ... and take over 35 mins to recover (i.e. for the thread count to go back to normal) - no client calls are serviced - they just back up on a synchronized method (see #2 below) 2) All the client app threads queue up behind the oahh.ipc.HBaseClient#setupIOStreams method http://tinyurl.com/7js53dj After taking several thread dumps we found that the thread within this synchronized method was blocked on NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf)); The client thread that gets the synchronized lock would try to connect to the dead RS (till socket times out after 20s), retries, and then the next thread gets in and so forth in a serial manner. Workaround: --- Default ipc.socket.timeout is set to 20s. We dropped this to a low number (1000 ms, 100 ms, etc) on the client side hbase-site.xml. With this setting, the client threads recovered in a couple of minutes by failing fast and re-discovering the .META. table on a reassigned RS. Assumption: This ipc.socket.timeout is only ever used during the initial HConnection setup via the NetUtils.connect and should only ever be used when connectivity to a region server is lost and needs to be re-established. i.e. it does not affect the normal RPC activity as this is just the connect timeout. 
During RS GC periods, any _new_ clients trying to connect will fail and will require .META. table re-lookups. The above timeout workaround applies only to the HBase client side. 
[jira] [Commented] (HBASE-6538) Remove copy_table.rb script
[ https://issues.apache.org/jira/browse/HBASE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439526#comment-13439526 ] Hudson commented on HBASE-6538: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6538 Remove copy_table.rb script (Revision 1374372) Result = FAILURE stack : Files : * /hbase/branches/0.94/bin/copy_table.rb Remove copy_table.rb script --- Key: HBASE-6538 URL: https://issues.apache.org/jira/browse/HBASE-6538 Project: HBase Issue Type: Task Components: scripts Affects Versions: 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Minor Labels: noob Fix For: 0.92.2, 0.94.2 Attachments: hbase-6583-1.patch Remove copy_table.rb script as per mailing list discussion. It hasn't been maintained in a while and does not run against any recent HBase release. There is also an MR job to do the same thing that does work.
[jira] [Commented] (HBASE-6631) TestHMasterRPCException in 0.92 failed twice on socket timeout
[ https://issues.apache.org/jira/browse/HBASE-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439524#comment-13439524 ] Hudson commented on HBASE-6631: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6631 TestHMasterRPCException in 0.92 failed twice on socket timeout; ADDENDUM -- FIX COMPILATION ERROR (Revision 1375871) HBASE-6631 TestHMasterRPCException in 0.92 failed twice on socket timeout (Revision 1375845) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java stack : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java TestHMasterRPCException in 0.92 failed twice on socket timeout -- Key: HBASE-6631 URL: https://issues.apache.org/jira/browse/HBASE-6631 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.92.2, 0.94.2 Attachments: 6631-trunk.txt, 6631-trunk.txt, 6631.txt #502 and #498 0.92 builds have TestHMasterRPCException failing because of socket timeout when servernotrunning is expected. Socket timeout is 100ms only.
[jira] [Commented] (HBASE-6552) TestAcidGuarantees system test should flush more aggressively
[ https://issues.apache.org/jira/browse/HBASE-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439528#comment-13439528 ] Hudson commented on HBASE-6552: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6552 TestAcidGuarantees system test should flush more aggressively (Gregory Chanan) (Revision 1371502) Result = FAILURE jmhsieh : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java TestAcidGuarantees system test should flush more aggressively - Key: HBASE-6552 URL: https://issues.apache.org/jira/browse/HBASE-6552 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Gregory Chanan Assignee: Gregory Chanan Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: HBASE-6552-94-92.patch, HBASE-6552-trunk.patch HBASE-5887 allowed TestAcidGuarantees to be run as a system test by avoiding the call to util.flush(). It would be better to go through the HBaseAdmin interface to force flushes. This would unify the code path between the unit test and the system test, as well as forcing more frequent flushes, which have previously been the source of ACID guarantee problems, e.g. HBASE-2856.
[jira] [Commented] (HBASE-6512) Incorrect OfflineMetaRepair log class name
[ https://issues.apache.org/jira/browse/HBASE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439527#comment-13439527 ] Hudson commented on HBASE-6512: --- Integrated in HBase-0.94-security #48 (See [https://builds.apache.org/job/HBase-0.94-security/48/]) HBASE-6512 Incorrect OfflineMetaRepair log class name (Liang Xie) (Revision 1371521) Result = FAILURE jmhsieh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java Incorrect OfflineMetaRepair log class name -- Key: HBASE-6512 URL: https://issues.apache.org/jira/browse/HBASE-6512 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0, 0.94.1, 0.94.2 Reporter: liang xie Assignee: liang xie Fix For: 0.92.2, 0.96.0, 0.94.2 Attachments: HBASE-6512.diff At the beginning of OfflineMetaRepair.java, we can observe: private static final Log LOG = LogFactory.getLog(HBaseFsck.class.getName()); It would be better to change it to: private static final Log LOG = LogFactory.getLog(OfflineMetaRepair.class.getName());
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439545#comment-13439545 ] Hudson commented on HBASE-6364: --- Integrated in HBase-0.94 #415 (See [https://builds.apache.org/job/HBase-0.94/415/]) HBASE-6364 Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table (Revision 1376013) Result = SUCCESS nkeywal : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/EnvironmentEdgeManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/ManualEnvironmentEdge.java Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table - Key: HBASE-6364 URL: https://issues.apache.org/jira/browse/HBASE-6364 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Suraj Varma Assignee: nkeywal Labels: client Fix For: 0.96.0, 0.94.2 Attachments: 6364.94.v2.nolargetest.patch, 6364-host-serving-META.v1.patch, 6364.v11.nolargetest.patch, 6364.v1.patch, 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, 6364.v5.patch, 6364.v5.withtests.patch, 6364.v6.patch, 6364.v6.withtests.patch, 6364.v7.withtests.patch, 6364.v8.withtests.patch, 6364.v9.patch, stacktrace.txt When a server host with a Region Server holding the .META. table is powered down on a live cluster, while the HBase cluster itself detects and reassigns the .META. table, connected HBase Client's take an excessively long time to detect this and re-discover the reassigned .META. 
[jira] [Assigned] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
[ https://issues.apache.org/jira/browse/HBASE-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reassigned HBASE-6634: -- Assignee: Harsh J REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs.
[jira] [Created] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
Harsh J created HBASE-6634: -- Summary: REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs.
[jira] [Updated] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
[ https://issues.apache.org/jira/browse/HBASE-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HBASE-6634: --- Attachment: HBASE-6634.patch REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs.
[jira] [Updated] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
[ https://issues.apache.org/jira/browse/HBASE-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HBASE-6634: --- Status: Patch Available (was: Open) REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs.
[jira] [Updated] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HBASE-6581: -- Attachment: HBASE-6581.diff #setHostName() no longer exists. Patch dodges around this using reflection now. Build with hadoop.profile=3.0 - Key: HBASE-6581 URL: https://issues.apache.org/jira/browse/HBASE-6581 Project: HBase Issue Type: Bug Reporter: Eric Charles Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581.diff Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to change in the hadoop maven modules naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of hadoop dependencies in their respective profiles and will define the correct hadoop deps in the 3.0 profile. Please tell me if that's ok to go this way. Thx, Eric [1] $ mvn clean install -Dhadoop.profile=3.0 [INFO] Scanning for projects... [ERROR] The build could not read 3 projects - [Help 1] [ERROR] [ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. 
@ line 178, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21 [ERROR]
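The approach Eric proposes (moving the hadoop dependencies into their respective profiles so each profile pins its own versions) can be sketched roughly as follows; the profile id, the property activation, and the `hadoop.version` property are assumptions about how the build would be wired, not the actual patch:

```xml
<!-- hypothetical profile block in hbase-server/pom.xml -->
<profile>
  <id>hadoop-3.0</id>
  <activation>
    <property>
      <name>hadoop.profile</name>
      <value>3.0</value>
    </property>
  </activation>
  <properties>
    <!-- placeholder version; the report notes 3.0-SNAPSHOT vs 3.0.0-SNAPSHOT confusion -->
    <hadoop.version>3.0.0-SNAPSHOT</hadoop.version>
  </properties>
  <dependencies>
    <!-- the three artifacts flagged as missing versions in the error output above -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-annotations</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-minicluster</artifactId>
      <version>${hadoop.version}</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</profile>
```

Declaring the versions inside the active profile is what makes the `'dependencies.dependency.version' ... is missing` errors go away, since the default profile's version management no longer applies to the renamed hadoop 3.x modules.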
[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439591#comment-13439591 ] Ivan Kelly commented on HBASE-6581: --- Builds for 3.0 now. Some tests fail, though I guess it's to be expected. With default profile, it works fine. Build with hadoop.profile=3.0 - Key: HBASE-6581 URL: https://issues.apache.org/jira/browse/HBASE-6581 Project: HBase Issue Type: Bug Reporter: Eric Charles Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581.diff Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to change in the hadoop maven modules naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of hadoop dependencies in their respective profiles and will define the correct hadoop deps in the 3.0 profile. Please tell me if that's ok to go this way. Thx, Eric [1] $ mvn clean install -Dhadoop.profile=3.0 [INFO] Scanning for projects... [ERROR] The build could not read 3 projects - [Help 1] [ERROR] [ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. 
@ line 174, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21 [ERROR]
[jira] [Updated] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HBASE-6581: -- Status: Patch Available (was: Open) Build with hadoop.profile=3.0 - Key: HBASE-6581 URL: https://issues.apache.org/jira/browse/HBASE-6581 Project: HBase Issue Type: Bug Reporter: Eric Charles Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581.diff Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to change in the hadoop maven modules naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of hadoop dependencies in their respective profiles and will define the correct hadoop deps in the 3.0 profile. Please tell me if that's ok to go this way. Thx, Eric [1] $ mvn clean install -Dhadoop.profile=3.0 [INFO] Scanning for projects... [ERROR] The build could not read 3 projects - [Help 1] [ERROR] [ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. 
@ line 178, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21 [ERROR]
[jira] [Updated] (HBASE-2155) Add the option to bind to a specific IP address to the Nonblocking Thrift servers
[ https://issues.apache.org/jira/browse/HBASE-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liang xie updated HBASE-2155: - Attachment: HBASE-2155.patch If I'm not wrong, binding to a specific IP has been supported for the nonblocking servers since Thrift 0.3. The attached patch passed the test case and was verified on my real cluster environment as well. Add the option to bind to a specific IP address to the Nonblocking Thrift servers - Key: HBASE-2155 URL: https://issues.apache.org/jira/browse/HBASE-2155 Project: HBase Issue Type: Improvement Components: thrift Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Attachments: HBASE-2155.patch This is not possible in Thrift 0.2.0 so we'll have to wait until the next version is released (which includes THRIFT-684). After that is released this is an easy and quick fix. For a few more details see HBASE-1373 and HBASE-65.
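For context, the capability THRIFT-684 added can be sketched like this; this is not the attached patch, just a hedged illustration assuming Thrift 0.3+ (where TNonblockingServerSocket accepts an InetSocketAddress) and a later-style THsHaServer.Args builder. The bind address and the `processor` variable are placeholders:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

import org.apache.thrift.server.THsHaServer;
import org.apache.thrift.transport.TNonblockingServerSocket;

// Sketch: bind the nonblocking Thrift transport to a specific interface
// instead of the wildcard address. "processor" stands in for the
// Hbase.Processor the thrift server would normally build.
InetAddress bindAddress = InetAddress.getByName("10.0.0.5"); // hypothetical address
int port = 9090;

TNonblockingServerSocket serverTransport =
    new TNonblockingServerSocket(new InetSocketAddress(bindAddress, port));
THsHaServer server =
    new THsHaServer(new THsHaServer.Args(serverTransport).processor(processor));
server.serve();
```

The key point is simply that the transport constructor takes an InetSocketAddress rather than a bare port, which is what was impossible in Thrift 0.2.0.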
[jira] [Commented] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
[ https://issues.apache.org/jira/browse/HBASE-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439621#comment-13439621 ] Hadoop QA commented on HBASE-6634: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12541975/HBASE-6634.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2646//console This message is automatically generated. REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs.
[jira] [Commented] (HBASE-6633) Adding new hooks to the split flow - For roll backs and one final hook after split is completed either successfully or failed
[ https://issues.apache.org/jira/browse/HBASE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439626#comment-13439626 ] Andrew Purtell commented on HBASE-6633: --- bq. Pls do suggest if i can provide a patch for this. Sounds good to me. Thanks Ram. Please keep but deprecate postSplit. Adding new hooks to the split flow - For roll backs and one final hook after split is completed either successfully or failed - Key: HBASE-6633 URL: https://issues.apache.org/jira/browse/HBASE-6633 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Labels: coprocessors Currently we have two hooks in the split flow of a region: preSplit() and postSplit(). But these are not always helpful: if I have a problem in preSplit() or postSplit(), I need to roll back the current region or the region that I am handling through the hooks. So it would be better if we had a hook in the rollback code and also one final hook, say postCompleteSplit(), so that a CP can take any corrective action. Pls do suggest if i can provide a patch for this.
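The proposal above amounts to extending the RegionObserver coprocessor interface with something like the following; the rollback hook names are assumptions extrapolated from the discussion (only postCompleteSplit is named in the issue), not the API that was eventually committed:

```java
// Hypothetical additions to the RegionObserver interface, mirroring the
// existing preSplit()/postSplit() pair described in the issue.
public interface RegionObserver {
  // existing hook, invoked before a split starts
  void preSplit(ObserverContext<RegionCoprocessorEnvironment> ctx) throws IOException;

  // existing hook; kept but deprecated, per Andrew's comment
  @Deprecated
  void postSplit(ObserverContext<RegionCoprocessorEnvironment> ctx,
      HRegion left, HRegion right) throws IOException;

  // proposed (name assumed): invoked while a failed split is rolled back,
  // so a CP can undo side effects of its own preSplit() work
  void preRollBackSplit(ObserverContext<RegionCoprocessorEnvironment> ctx) throws IOException;

  // proposed: invoked once the split attempt is finished,
  // whether it succeeded or was rolled back
  void postCompleteSplit(ObserverContext<RegionCoprocessorEnvironment> ctx) throws IOException;
}
```

The design point is that postSplit alone gives a coprocessor no signal on the rollback path, so corrective action after a failed split currently has nowhere to hook in.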
[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439628#comment-13439628 ] Zhihong Ted Yu commented on HBASE-6581: --- Meanwhile, I think we should maintain no more than 3 test failures for hadoop-2.0 profile. Build with hadoop.profile=3.0 - Key: HBASE-6581 URL: https://issues.apache.org/jira/browse/HBASE-6581 Project: HBase Issue Type: Bug Reporter: Eric Charles Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581.diff Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to change in the hadoop maven modules naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of hadoop dependencies in their respective profiles and will define the correct hadoop deps in the 3.0 profile. Please tell me if that's ok to go this way. Thx, Eric [1] $ mvn clean install -Dhadoop.profile=3.0 [INFO] Scanning for projects... [ERROR] The build could not read 3 projects - [Help 1] [ERROR] [ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. 
@ line 178, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21 [ERROR]
[jira] [Commented] (HBASE-2155) Add the option to bind to a specific IP address to the Nonblocking Thrift servers
[ https://issues.apache.org/jira/browse/HBASE-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439632#comment-13439632 ] Zhihong Ted Yu commented on HBASE-2155: --- What about this server type : {code} THREADED_SELECTOR( threadedselector, true, TThreadedSelectorServer.class, false); {code} Does the change apply to the above type ? Add the option to bind to a specific IP address to the Nonblocking Thrift servers - Key: HBASE-2155 URL: https://issues.apache.org/jira/browse/HBASE-2155 Project: HBase Issue Type: Improvement Components: thrift Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Attachments: HBASE-2155.patch This is not possible in Thrift 0.2.0 so we'll have to wait until the next version is released (which includes THRIFT-684). After that is released this is an easy and quick fix. For a few more details see HBASE-1373 and HBASE-65.
[jira] [Commented] (HBASE-6524) Hooks for hbase tracing
[ https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439633#comment-13439633 ] Zhihong Ted Yu commented on HBASE-6524: --- There were 90 more failed tests introduced under the hadoop 2.0 profile. I suggest we temporarily roll back the patch and give Jonathan enough time to iron out the 90+ test failures. Hooks for hbase tracing --- Key: HBASE-6524 URL: https://issues.apache.org/jira/browse/HBASE-6524 Project: HBase Issue Type: Sub-task Reporter: Jonathan Leavitt Fix For: 0.96.0 Attachments: createTableTrace.png, hbase-6524.diff Includes the hooks that use the [htrace|http://www.github.com/cloudera/htrace] library to add Dapper-like tracing to hbase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6634) REST API ScannerModel's protobuf converter code duplicates the setBatch call
[ https://issues.apache.org/jira/browse/HBASE-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439640#comment-13439640 ] Zhihong Ted Yu commented on HBASE-6634: --- +1 on patch. REST API ScannerModel's protobuf converter code duplicates the setBatch call Key: HBASE-6634 URL: https://issues.apache.org/jira/browse/HBASE-6634 Project: HBase Issue Type: Bug Components: rest Affects Versions: 0.94.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: HBASE-6634.patch There's a dupe call to setBatch when a scanner model object is created for protobuf outputs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6364: --- Attachment: 6364.94.v2.nolargetest.security-addendum.patch Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table - Key: HBASE-6364 URL: https://issues.apache.org/jira/browse/HBASE-6364 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Suraj Varma Assignee: nkeywal Labels: client Fix For: 0.96.0, 0.94.2 Attachments: 6364.94.v2.nolargetest.patch, 6364.94.v2.nolargetest.security-addendum.patch, 6364-host-serving-META.v1.patch, 6364.v11.nolargetest.patch, 6364.v1.patch, 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, 6364.v5.patch, 6364.v5.withtests.patch, 6364.v6.patch, 6364.v6.withtests.patch, 6364.v7.withtests.patch, 6364.v8.withtests.patch, 6364.v9.patch, stacktrace.txt When a server host with a Region Server holding the .META. table is powered down on a live cluster, while the HBase cluster itself detects and reassigns the .META. table, connected HBase Client's take an excessively long time to detect this and re-discover the reassigned .META. Workaround: Decrease the ipc.socket.timeout on HBase Client side to a low value (default is 20s leading to 35 minute recovery time; we were able to get acceptable results with 100ms getting a 3 minute recovery) This was found during some hardware failure testing scenarios. Test Case: 1) Apply load via client app on HBase cluster for several minutes 2) Power down the region server holding the .META. server (i.e. power off ... and keep it off) 3) Measure how long it takes for cluster to reassign META table and for client threads to re-lookup and re-orient to the lesser cluster (minus the RS and DN on that host). Observation: 1) Client threads spike up to maxThreads size ... and take over 35 mins to recover (i.e. 
for the thread count to go back to normal) - no client calls are serviced - they just back up on a synchronized method (see #2 below) 2) All the client app threads queue up behind the oahh.ipc.HBaseClient#setupIOStreams method http://tinyurl.com/7js53dj After taking several thread dumps we found that the thread within this synchronized method was blocked on NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf)); The client thread that gets the synchronized lock would try to connect to the dead RS (till socket times out after 20s), retries, and then the next thread gets in and so forth in a serial manner. Workaround: --- Default ipc.socket.timeout is set to 20s. We dropped this to a low number (1000 ms, 100 ms, etc) in the client-side hbase-site.xml. With this setting, the client threads recovered in a couple of minutes by failing fast and re-discovering the .META. table on a reassigned RS. Assumption: This ipc.socket.timeout is only ever used during the initial HConnection setup via the NetUtils.connect and should only ever be used when connectivity to a region server is lost and needs to be re-established. i.e. it does not affect the normal RPC activity as this is just the connect timeout. During RS GC periods, any _new_ clients trying to connect will fail and will require .META. table re-lookups. This above timeout workaround is only for the HBase client side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
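The workaround above amounts to a single client-side property change. A minimal sketch of the hbase-site.xml entry (the 100 ms value is the one the reporter found acceptable; tune it for your environment):

```xml
<!-- client-side hbase-site.xml: fail fast when a region server host is dead -->
<property>
  <name>ipc.socket.timeout</name>
  <!-- default is 20000 (20s); the reporter saw ~3 minute recovery with 100 -->
  <value>100</value>
</property>
```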
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439655#comment-13439655 ] Zhihong Ted Yu commented on HBASE-6364: --- Addendum looks good to me. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439659#comment-13439659 ] Hadoop QA commented on HBASE-6364: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12541992/6364.94.v2.nolargetest.security-addendum.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2648//console This message is automatically generated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439664#comment-13439664 ] Ivan Kelly commented on HBASE-6581: --- Actually, the original patch missed src/assembly/hadoop-three-compat.xml also. Build with hadoop.profile=3.0 - Key: HBASE-6581 URL: https://issues.apache.org/jira/browse/HBASE-6581 Project: HBase Issue Type: Bug Reporter: Eric Charles Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581.diff Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to change in the hadoop maven modules naming (and also usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that would move most of hadoop dependencies in their respective profiles and will define the correct hadoop deps in the 3.0 profile. Please tell me if that's ok to go this way. Thx, Eric [1] $ mvn clean install -Dhadoop.profile=3.0 [INFO] Scanning for projects... [ERROR] The build could not read 3 projects - [Help 1] [ERROR] [ERROR] The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. 
@ line 178, column 21 [ERROR] [ERROR] The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21 [ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21 [ERROR] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
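Each of the Maven errors above points at a <dependency> declared without a <version> once the hadoop 3.0 profile renames the artifacts. A hedged sketch of the kind of pom.xml fix involved (the profile id and property name are illustrative, not taken from the actual patch):

```xml
<profile>
  <id>hadoop-3.0</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <!-- supplying the version inside the profile resolves
           "'dependencies.dependency.version' ... is missing" -->
      <version>${hadoop.version}</version>
    </dependency>
    <!-- likewise for hadoop-annotations and hadoop-minicluster -->
  </dependencies>
</profile>
```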
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439672#comment-13439672 ] nkeywal commented on HBASE-6364: No failure on the unit tests small medium with the security profile. Committed revision 1376136. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6516) hbck cannot detect any IOException while .tableinfo file is missing
[ https://issues.apache.org/jira/browse/HBASE-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439680#comment-13439680 ] Jonathan Hsieh commented on HBASE-6516: --- -ROOT- and .META. have always been special cases, so if that is the case with those tables, special case them. hbck cannot detect any IOException while .tableinfo file is missing - Key: HBASE-6516 URL: https://issues.apache.org/jira/browse/HBASE-6516 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Attachments: hbase-6516.patch, hbase-6516-v2.patch HBaseFsck checks for missing .tableinfo files in the loadHdfsRegionInfos() function. However, no IOException will be caught when .tableinfo is missing, since FSTableDescriptors.getTableDescriptor doesn't throw any IOException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
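For illustration only (this is not the HBASE-6516 patch, which works against the Hadoop FileSystem API): the missing-file condition hbck needs to surface can be sketched with plain java.nio, where a table directory either does or does not contain a .tableinfo file, and the check reports the result instead of silently swallowing it:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class TableInfoCheckSketch {

    // Returns true if the table directory contains a file whose name
    // starts with ".tableinfo" (HBase suffixes it with a sequence number).
    public static boolean hasTableInfo(Path tableDir) {
        try (Stream<Path> files = Files.list(tableDir)) {
            return files.anyMatch(p -> p.getFileName().toString().startsWith(".tableinfo"));
        } catch (IOException e) {
            // An unreadable or missing directory also means no usable .tableinfo.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("table");
        System.out.println("before: " + hasTableInfo(dir)); // no .tableinfo yet
        Files.createFile(dir.resolve(".tableinfo.0000000001"));
        System.out.println("after: " + hasTableInfo(dir));  // now present
    }
}
```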
[jira] [Updated] (HBASE-3271) Allow .META. table to be exported
[ https://issues.apache.org/jira/browse/HBASE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-3271: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Patch Available (was: Open) nit: the following line is too long: {code} + InputSplit split = new TableSplit(table.getTableName(), HConstants.EMPTY_BYTE_ARRAY, HConstants.EMPTY_BYTE_ARRAY, {code} Allow .META. table to be exported - Key: HBASE-3271 URL: https://issues.apache.org/jira/browse/HBASE-3271 Project: HBase Issue Type: Improvement Components: util Affects Versions: 0.20.6 Reporter: Ted Yu Fix For: 0.96.0 Attachments: HBASE-3271.patch I tried to export .META. table in 0.20.6 and got: [hadoop@us01-ciqps1-name01 hbase]$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export .META. h-meta 1 0 0 10/11/23 20:59:05 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 2010-11-23 20:59:05.255::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2010-11-23 20:59:05.255::INFO: verisons=1, starttime=0, endtime=9223372036854775807 10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.2.2-888565, built on 12/08/2009 21:51 GMT 10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:host.name=us01-ciqps1-name01.carrieriq.com 10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21 10/11/23 20:59:05 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc. ... 10/11/23 20:59:05 INFO zookeeper.ClientCnxn: Server connection successful 10/11/23 20:59:05 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.112:60020 10/11/23 20:59:05 DEBUG client.HConnectionManager$TableServers: Found ROOT at 10.202.50.112:60020 10/11/23 20:59:05 DEBUG client.HConnectionManager$TableServers: Cached location for .META.,,1 is us01-ciqps1-grid02.carrieriq.com:60020 Exception in thread main java.io.IOException: Expecting at least one region. 
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:281) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779) at org.apache.hadoop.mapreduce.Job.submit(Job.java:432) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447) at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:146) Related code is: if (keys == null || keys.getFirst() == null || keys.getFirst().length == 0) { throw new IOException("Expecting at least one region."); } My intention was to save the dangling rows in .META. (for future investigation) which prevented a table from being created. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-3271) Allow .META. table to be exported
[ https://issues.apache.org/jira/browse/HBASE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439684#comment-13439684 ] Zhihong Ted Yu edited comment on HBASE-3271 at 8/23/12 4:07 AM: nit: the following two lines are too long: {code} +GenericOptionsParser opts = new GenericOptionsParser(new Configuration(cluster.getConfiguration()), args); ... + InputSplit split = new TableSplit(table.getTableName(), HConstants.EMPTY_BYTE_ARRAY, HConstants.EMPTY_BYTE_ARRAY, {code} was (Author: zhi...@ebaysf.com): nit: the following line is too long: {code} + InputSplit split = new TableSplit(table.getTableName(), HConstants.EMPTY_BYTE_ARRAY, HConstants.EMPTY_BYTE_ARRAY, {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6610) HFileLink: Hardlink alternative for snapshot restore
[ https://issues.apache.org/jira/browse/HBASE-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439709#comment-13439709 ] Zhihong Ted Yu commented on HBASE-6610: --- TestStore#testDeleteExpiredStoreFiles failure seems to be related to the patch. HFileLink: Hardlink alternative for snapshot restore Key: HBASE-6610 URL: https://issues.apache.org/jira/browse/HBASE-6610 Project: HBase Issue Type: Sub-task Components: io Affects Versions: 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: snapshot Fix For: 0.96.0 Attachments: HBASE-6610-v1.patch To avoid copying data during snapshot restore we need to introduce an HFile Link that allows referencing a file that can be in the original path (/hbase/table/region/cf/hfile) or, if the file is archived, in the archive directory (/hbase/.archive/table/region/cf/hfile). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
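A hedged sketch of the lookup the description implies (class and method names here are invented for illustration, not the HBASE-6610 API): given the components of a referenced store file, there are exactly two candidate locations to probe, the live path first and the archive path second:

```java
// Illustrative only: resolves the two locations an HFileLink may point at,
// using the path layout quoted in the issue description.
public class HFileLinkSketch {

    // Candidate locations for a referenced store file, in the order a
    // reader would probe them: original location, then the archive.
    public static String[] candidatePaths(String table, String region, String cf, String hfile) {
        return new String[] {
            "/hbase/" + table + "/" + region + "/" + cf + "/" + hfile,
            "/hbase/.archive/" + table + "/" + region + "/" + cf + "/" + hfile
        };
    }

    public static void main(String[] args) {
        for (String p : candidatePaths("t1", "region1", "cf1", "hfile1")) {
            System.out.println(p);
        }
    }
}
```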
[jira] [Commented] (HBASE-5169) Group of Region Server, a subtask of issue 4120
[ https://issues.apache.org/jira/browse/HBASE-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439717#comment-13439717 ] Vandana Ayyalasomayajula commented on HBASE-5169: - Ted, We are taking a look at the current design of this feature and maybe in a couple of days we can come up with a proposal. Group of Region Server, a subtask of issue 4120 - Key: HBASE-5169 URL: https://issues.apache.org/jira/browse/HBASE-5169 Project: HBase Issue Type: Sub-task Components: master Reporter: Liu Jia Assignee: Liu Jia Fix For: 0.96.0 Attachments: GroupOfRegionServer_v1.patch, GroupOfRegionServer_v2.patch This is a subtask of issue 4120; this patch provides the region server group feature of HBase. With this patch, region servers can be divided into groups: one table can belong to one or more groups, while a region server can only belong to one group. Workload in different groups will not affect each other. This patch provides table-level and group-level load balancing; the default load balancer and region assignment will consider the group configuration and assign regions to their corresponding groups. For more information, please check out the documents of issue 4120. There is a web tool for this patch providing group management operations such as add/delete group, move servers in/out, change a table's group attribute, balance groups, and balance tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439718#comment-13439718 ] Hudson commented on HBASE-6364: --- Integrated in HBase-0.94 #416 (See [https://builds.apache.org/job/HBase-0.94/416/]) HBASE-6364 Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table - security addendum (Revision 1376136) Result = FAILURE nkeywal : Files : * /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table - Key: HBASE-6364 URL: https://issues.apache.org/jira/browse/HBASE-6364 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Suraj Varma Assignee: nkeywal Labels: client Fix For: 0.96.0, 0.94.2 Attachments: 6364.94.v2.nolargetest.patch, 6364.94.v2.nolargetest.security-addendum.patch, 6364-host-serving-META.v1.patch, 6364.v11.nolargetest.patch, 6364.v1.patch, 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, 6364.v5.patch, 6364.v5.withtests.patch, 6364.v6.patch, 6364.v6.withtests.patch, 6364.v7.withtests.patch, 6364.v8.withtests.patch, 6364.v9.patch, stacktrace.txt When a server host with a Region Server holding the .META. table is powered down on a live cluster, while the HBase cluster itself detects and reassigns the .META. table, connected HBase Client's take an excessively long time to detect this and re-discover the reassigned .META. Workaround: Decrease the ipc.socket.timeout on HBase Client side to a low value (default is 20s leading to 35 minute recovery time; we were able to get acceptable results with 100ms getting a 3 minute recovery) This was found during some hardware failure testing scenarios. 
Test Case: 1) Apply load via client app on HBase cluster for several minutes 2) Power down the region server holding the .META. server (i.e. power off ... and keep it off) 3) Measure how long it takes for cluster to reassign META table and for client threads to re-lookup and re-orient to the lesser cluster (minus the RS and DN on that host). Observation: 1) Client threads spike up to maxThreads size ... and take over 35 mins to recover (i.e. for the thread count to go back to normal) - no client calls are serviced - they just back up on a synchronized method (see #2 below) 2) All the client app threads queue up behind the oahh.ipc.HBaseClient#setupIOStreams method http://tinyurl.com/7js53dj After taking several thread dumps we found that the thread within this synchronized method was blocked on NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf)); The client thread that gets the synchronized lock would try to connect to the dead RS (till the socket times out after 20s), retries, and then the next thread gets in, and so forth in a serial manner. Workaround: --- Default ipc.socket.timeout is set to 20s. We dropped this to a low number (1000 ms, 100 ms, etc) in the client-side hbase-site.xml. With this setting, the client threads recovered in a couple of minutes by failing fast and re-discovering the .META. table on a reassigned RS. Assumption: This ipc.socket.timeout is only ever used during the initial HConnection setup via the NetUtils.connect and should only ever be used when connectivity to a region server is lost and needs to be re-established. i.e. it does not affect the normal RPC activity as this is just the connect timeout. During RS GC periods, any _new_ clients trying to connect will fail and will require .META. table re-lookups. This above timeout workaround is only for the HBase client side. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
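The workaround described in the issue can be expressed as a client-side hbase-site.xml fragment. This is an illustrative sketch; the 100 ms value is the one reported in the issue, and the right value depends on your environment.

```xml
<!-- Client-side hbase-site.xml fragment (illustrative). Lowering
     ipc.socket.timeout only shortens the TCP connect timeout used when
     (re)establishing a connection; per the issue's assumption, it does
     not change normal RPC call timeouts. -->
<property>
  <name>ipc.socket.timeout</name>
  <value>100</value> <!-- milliseconds; the default is 20000 (20s) -->
</property>
```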
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Status: Patch Available (was: Open) Quarantine Corrupted HFiles --- Key: HBASE-6586 URL: https://issues.apache.org/jira/browse/HBASE-6586 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-hbase-6568-hbck-quarantine-v6.patch, hbase-6586-92-v3.patch, hbase-6586-94-v3.patch, hbase-6586.patch, hbase-6586-trunk-v3.patch We've encountered a few upgrades from 0.90 hbases + 20.2/1.x hdfs to 0.92 hbases + hdfs 2.x that get stuck. I haven't been able to duplicate the problem in my dev environment, but we suspect this may be related to HDFS-3731. On the HBase side, it seems reasonable to quarantine what are most likely truncated hfiles, so that they could later be recovered. Here's an example of the exception we've encountered: {code} 2012-07-18 05:55:01,152 ERROR handler.OpenRegionHandler (OpenRegionHandler.java:openRegion(346)) - Failed open of region=user_mappings,080112102AA76EF98197605D341B9E6C5824D2BC|1001,1317824890618.eaed0e7abc6d27d28ff0e5a9b49c4c 0d. 
java.io.IOException: java.lang.IllegalArgumentException: Invalid HFile version: 842220600 (expected to be between 1 and 2)
	at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:306)
	at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:371)
	at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:387)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1026)
	at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:485)
	at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:566)
	at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:286)
	at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:223)
	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2534)
	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:454)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3282)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3230)
	at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:331)
	at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 842220600 (expected to be between 1 and 2)
	at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:515)
	at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:303)
	... 17 more
{code} Specifically -- the FixedFileTrailer is incorrect, and seemingly missing. 
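The failure mode above can be illustrated with a small sketch. This is not the actual FixedFileTrailer code (the real trailer serializes several fields and its layout differs); it only shows why truncation surfaces as "Invalid HFile version": whatever bytes happen to sit where the trailer's version field should be get decoded as the version int, and a garbage value such as 842220600 falls outside the supported range.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only (hypothetical, simplified trailer layout).
public class TrailerVersionCheck {
    static final int MIN_FORMAT_VERSION = 1; // per the error message: 1..2
    static final int MAX_FORMAT_VERSION = 2;

    // Decode the last four bytes of the file as a big-endian int "version".
    static int readVersion(byte[] fileBytes) {
        if (fileBytes.length < 4) {
            throw new IllegalArgumentException("file too short to hold a trailer");
        }
        return ByteBuffer.wrap(fileBytes, fileBytes.length - 4, 4).getInt();
    }

    static boolean isValidVersion(int version) {
        return version >= MIN_FORMAT_VERSION && version <= MAX_FORMAT_VERSION;
    }

    public static void main(String[] args) {
        byte[] wellFormed = {0, 0, 0, 2};               // version field = 2
        byte[] truncated = "arbitrary data".getBytes(); // data bytes where the trailer should be
        System.out.println(isValidVersion(readVersion(wellFormed))); // true
        System.out.println(isValidVersion(readVersion(truncated)));  // false
    }
}
```

A quarantine pass (as the patch proposes) would catch this validation failure on open and move the file aside instead of failing the region open.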
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Attachment: hbase-6586-94-v3.patch hbase-6586-92-v3.patch hbase-6586-trunk-v3.patch v3, including backports to 92/94. Minor changes due to HBASE-5714 perms checks. 
[jira] [Commented] (HBASE-6524) Hooks for hbase tracing
[ https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439723#comment-13439723 ] Jonathan Leavitt commented on HBASE-6524: - {quote}There were 90 more failed tests introduced for hadoop 2.0 profile. I suggest we temporarily rollback the patch and give Jonathan enough time to iron out the 90+ test failures.{quote} I think I have the fix. Stay tuned. Hooks for hbase tracing --- Key: HBASE-6524 URL: https://issues.apache.org/jira/browse/HBASE-6524 Project: HBase Issue Type: Sub-task Reporter: Jonathan Leavitt Fix For: 0.96.0 Attachments: createTableTrace.png, hbase-6524.diff Includes the hooks that use the [htrace|http://www.github.com/cloudera/htrace] library to add dapper-like tracing to hbase. 
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439726#comment-13439726 ] nkeywal commented on HBASE-6364: This time it's the usual culprit, so I think we're ok. 
[jira] [Commented] (HBASE-6364) Powering down the server host holding the .META. table causes HBase Client to take excessively long to recover and connect to reassigned .META. table
[ https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439725#comment-13439725 ] stack commented on HBASE-6364: -- +1 on addendum 
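The serial queueing described in the HBASE-6364 observation (all client threads backed up behind one synchronized connection-setup method) can be sketched as follows. The class and method names here are hypothetical stand-ins, not the actual HBaseClient code; the sleep simulates NetUtils.connect blocking for the full connect timeout against a dead host.

```java
// Minimal sketch: with a synchronized connect and a dead peer, N threads
// each wait their turn to time out, so total recovery is ~N * timeout
// rather than ~timeout. Lowering the timeout shrinks every step of the queue.
public class SerializedConnectDemo {
    private final Object lock = new Object();
    private final long connectTimeoutMs;

    SerializedConnectDemo(long connectTimeoutMs) {
        this.connectTimeoutMs = connectTimeoutMs;
    }

    // Stand-in for the synchronized setupIOStreams: blocks for the full
    // timeout when the remote host is unreachable.
    void setupIOStreams() throws InterruptedException {
        synchronized (lock) {
            Thread.sleep(connectTimeoutMs); // simulate the connect timing out
        }
    }

    public static void main(String[] args) throws Exception {
        SerializedConnectDemo demo = new SerializedConnectDemo(100); // 100 ms "timeout"
        int threads = 5;
        long start = System.nanoTime();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                try { demo.setupIOStreams(); } catch (InterruptedException ignored) {}
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // The five attempts run one after another: roughly 5 * 100 ms total.
        System.out.println("elapsed ~" + elapsedMs + " ms for " + threads + " threads");
    }
}
```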
[jira] [Commented] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439730#comment-13439730 ] Hadoop QA commented on HBASE-6586: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12542006/hbase-6586-94-v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2650//console This message is automatically generated. 
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Attachment: (was: hbase-6586-trunk-v3.patch) 
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Status: Open (was: Patch Available) 
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Attachment: hbase-6586-trunk-v3.patch Quarantine Corrupted HFiles --- Key: HBASE-6586 URL: https://issues.apache.org/jira/browse/HBASE-6586 Project: HBase Issue Type: Improvement Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-hbase-6568-hbck-quarantine-v6.patch, hbase-6586-92-v3.patch, hbase-6586-94-v3.patch, hbase-6586.patch, hbase-6586-trunk-v3.patch We've encountered a few upgrades from 0.90 hbases + 20.2/1.x hdfs to 0.92 hbases + hdfs 2.x that get stuck. I haven't been able to duplicate the problem in my dev environment but we suspect this may be related to HDFS-3731. On the HBase side, it seems reasonable to quarantine what are most likely truncated hfiles, so that can could later be recovered. Here's an example of the exception we've encountered: {code} 2012-07-18 05:55:01,152 ERROR handler.OpenRegionHandler (OpenRegionHandler.java:openRegion(346)) - Failed open of region=user_mappings,080112102AA76EF98197605D341B9E6C5824D2BC|1001,1317824890618.eaed0e7abc6d27d28ff0e5a9b49c4c 0d. 
java.io.IOException: java.lang.IllegalArgumentException: Invalid HFile version: 842220600 (expected to be between 1 and 2) at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:306) at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:371) at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:387) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1026) at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:485) at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:566) at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:286) at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:223) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2534) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:454) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3282) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3230) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:331) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 842220600 (expected to be between 1 and 2) at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:515) at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:303) ... 17 more {code} Specifically -- the FixedFileTrailer are incorrect, and seemingly missing. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
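The quarantine idea amounts to a trailer sanity check: a version field outside the supported 1..2 range means the file is almost certainly truncated or overwritten, so it should be set aside for later recovery rather than aborting the region open. Below is a minimal sketch of that decision, assuming a simplified layout where the version is the file's last big-endian int; the real FixedFileTrailer has more fields, and the class and method names here are illustrative, not HBase's.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: the real FixedFileTrailer carries more fields than
// a bare trailing version int, but the quarantine decision has the same shape.
class TrailerCheck {
    static final int MIN_VERSION = 1;
    static final int MAX_VERSION = 2;

    // Interpret the last 4 bytes of the file image as a big-endian version field.
    static int trailerVersion(byte[] file) {
        return ByteBuffer.wrap(file, file.length - 4, 4).getInt();
    }

    // A file too short to hold a trailer, or with an out-of-range version,
    // is treated as corrupt and quarantined instead of failing the open.
    static boolean shouldQuarantine(byte[] file) {
        if (file.length < 4) {
            return true; // truncated past the trailer itself
        }
        int v = trailerVersion(file);
        return v < MIN_VERSION || v > MAX_VERSION;
    }
}
```

As a plausibility check on the truncation theory: the bogus version 842220600 from the log is exactly the big-endian encoding of the ASCII bytes "23D8", i.e. ordinary file content sitting where a trailer should be.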
[jira] [Updated] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6586: -- Status: Patch Available (was: Open) trying again
[jira] [Created] (HBASE-6635) Refactor HFile version selection and exception handling.
Jonathan Hsieh created HBASE-6635: - Summary: Refactor HFile version selection and exception handling. Key: HBASE-6635 URL: https://issues.apache.org/jira/browse/HBASE-6635 Project: HBase Issue Type: Bug Reporter: Jonathan Hsieh Trunk and 0.94's HFile code has some fairly convoluted code for bypassing checksums and mixed usage of runtime and IO exceptions when error conditions arise. This jira would clean up the code for better encapsulation and make it more explicit about what kinds of exceptions are thrown and what they mean. (This was partially spurred by comments in reviews of HBASE-6586.)
[jira] [Commented] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439742#comment-13439742 ] Hadoop QA commented on HBASE-6586: --
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12542008/hbase-6586-trunk-v3.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 7 warning messages.
-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
-1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2651//console
This message is automatically
generated.
[jira] [Commented] (HBASE-5320) Create client API to handle HBase maintenance gracefully
[ https://issues.apache.org/jira/browse/HBASE-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439748#comment-13439748 ] Ian Varley commented on HBASE-5320: --- Seems like something along the lines of the circuit breaker pattern would be useful in the HBase client: http://techblog.netflix.com/2012/02/fault-tolerance-in-high-volume.html In other words, whether it's planned or unplanned, inability to reach the cluster should be something the client can detect and pass along to callers. The simplest way is to throw immediate exceptions when the client detects the cluster is unreachable (while, in the background, the client continues to attempt to reconnect). A more sophisticated way might expose an API that callers can check for up-ness; but, of course, they would then have to check it on every call, so it's not really all that different from throwing a specific kind of exception and having them handle it at whatever level they like. The key part is maintaining that state in some central place in the client's JVM. This is tricky to get right (you don't want to be throwing exceptions if the cluster is really up and it's a false alarm!). Create client API to handle HBase maintenance gracefully Key: HBASE-5320 URL: https://issues.apache.org/jira/browse/HBASE-5320 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor When we do HBase cluster maintenance, we typically have to manually stop or disable the client temporarily. It would be nice to have a way for the client to find out, through an appropriate API, that HBase is undergoing maintenance, and gracefully handle it on its own.
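The Netflix-style circuit breaker Ian mentions can be sketched in a few lines, independent of any HBase API (all names below are hypothetical): after a threshold of consecutive failures the breaker opens and the client fails fast; once a cooldown elapses, a trial call is allowed through, and a success closes it again.

```java
// Hypothetical minimal circuit breaker: fail fast while the cluster looks down,
// instead of hanging every caller on retries.
class CircuitBreaker {
    private final int failureThreshold;   // consecutive failures before opening
    private final long cooldownMillis;    // how long to reject calls once open
    private int consecutiveFailures;
    private long openedAt = -1;           // -1 means the breaker is closed

    CircuitBreaker(int failureThreshold, long cooldownMillis) {
        this.failureThreshold = failureThreshold;
        this.cooldownMillis = cooldownMillis;
    }

    // True when calls should be rejected immediately instead of attempted.
    // After the cooldown elapses, one trial call is allowed through (half-open).
    synchronized boolean isOpen(long now) {
        return openedAt >= 0 && now - openedAt < cooldownMillis;
    }

    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;                    // close the breaker again
    }

    synchronized void recordFailure(long now) {
        if (++consecutiveFailures >= failureThreshold) {
            openedAt = now;               // open: reject calls until cooldown passes
        }
    }
}
```

The hard part, as the comment notes, is deciding when to trip it without false alarms; the threshold and cooldown would need tuning against real failure modes.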
[jira] [Commented] (HBASE-5549) Master can fail if ZooKeeper session expires
[ https://issues.apache.org/jira/browse/HBASE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439753#comment-13439753 ] stack commented on HBASE-5549: -- @Himanshu Which JIRA did you upload your fix to? Master can fail if ZooKeeper session expires Key: HBASE-5549 URL: https://issues.apache.org/jira/browse/HBASE-5549 Project: HBase Issue Type: Bug Components: master, zookeeper Affects Versions: 0.96.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5549.v10.patch, 5549.v11.patch, 5549.v6.patch, 5549.v7.patch, 5549.v8.patch, 5549.v9.patch, nochange.patch There is a retry mechanism in RecoverableZooKeeper, but when the session expires, the whole ZooKeeperWatcher is recreated, hence the retry mechanism does not work in this case. This is why a sleep is needed in TestZooKeeper#testMasterSessionExpired: we need to wait for the ZooKeeperWatcher to be recreated before using the connection. This can happen in real life, for example when:
- master zookeeper starts
- the zookeeper connection is cut
- the master enters the retry loop
- in the meantime the session expires
- the network comes back, the session is recreated
- the retries continue, but on the wrong object, and hence fail.
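The failure mode described, retrying against a stale object after the ZooKeeperWatcher is recreated, can be illustrated without any ZooKeeper dependency. This is a hypothetical sketch, not RecoverableZooKeeper's actual code: the fix it demonstrates is re-resolving the current handle on every attempt instead of capturing it once before the loop.

```java
import java.util.function.Supplier;

// Illustrative only -- not HBase's RecoverableZooKeeper. The bug pattern in the
// issue is a retry loop holding a reference to the old watcher; the fix sketched
// here is to re-resolve the current handle on every attempt.
class ZkRetry {
    static class SessionExpired extends RuntimeException {}

    interface Handle {
        String read(String path);
    }

    static String readWithRetry(Supplier<Handle> current, String path, int maxRetries) {
        SessionExpired last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                // Fetch the freshest handle each time: if the session expired and
                // the watcher was recreated, the retry uses the new object.
                return current.get().read(path);
            } catch (SessionExpired e) {
                last = e;
            }
        }
        throw last; // retries exhausted
    }
}
```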
[jira] [Updated] (HBASE-6372) Add scanner batching to Export job
[ https://issues.apache.org/jira/browse/HBASE-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Alten-Lorenz updated HBASE-6372: -- Attachment: HBASE-6372.5.patch Add scanner batching to Export job -- Key: HBASE-6372 URL: https://issues.apache.org/jira/browse/HBASE-6372 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 0.96.0, 0.94.2 Reporter: Lars George Assignee: Shengsheng Huang Priority: Minor Labels: newbie Attachments: HBASE-6372.2.patch, HBASE-6372.3.patch, HBASE-6372.4.patch, HBASE-6372.5.patch, HBASE-6372.patch When a single row is too large for the RS heap, an OOME can take out the entire RS. Setting scanner batching in custom scans helps avoid this scenario, but for the supplied Export job it is not set. Similar to HBASE-3421 we can set the batching to a low number - or, if needed, make it a command line option.
[jira] [Updated] (HBASE-6372) Add scanner batching to Export job
[ https://issues.apache.org/jira/browse/HBASE-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Alten-Lorenz updated HBASE-6372: -- Attachment: (was: HBASE-6372.5.patch)
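Scanner batching caps how many cells of one row come back per call, so a single very wide row never has to be materialized whole on the region server. Scan.setBatch is the real HBase knob; the scanner below is a dependency-free illustration of the mechanism, not HBase code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: how a scanner "batch" limit chunks one very wide row so
// that no single call materializes the whole row (the real knob is Scan.setBatch).
class BatchedRowScanner {
    private final List<String> cells; // one wide row's cells, in column order
    private final int batch;          // max cells returned per next() call
    private int pos = 0;

    BatchedRowScanner(List<String> cells, int batch) {
        this.cells = cells;
        this.batch = batch;
    }

    // Returns up to `batch` cells, or an empty list when the row is exhausted.
    List<String> next() {
        List<String> out = new ArrayList<>();
        while (pos < cells.size() && out.size() < batch) {
            out.add(cells.get(pos++));
        }
        return out;
    }
}
```

With a low batch value, peak memory per call is bounded by the batch size rather than by the row's width, which is exactly why the Export job benefits from setting it.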
[jira] [Created] (HBASE-6636) [0.92 UNIT TESTS] TestMasterZKSessionRecovery.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry failed in 0.92 #513 and #496
stack created HBASE-6636: Summary: [0.92 UNIT TESTS] TestMasterZKSessionRecovery.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry failed in 0.92 #513 and #496 Key: HBASE-6636 URL: https://issues.apache.org/jira/browse/HBASE-6636 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: stack Assignee: stack Priority: Critical The test class TestMasterZKSessionRecovery has been removed in trunk. Its master tests were moved elsewhere or removed because they were useless (see nkeywal's reasoning over in HBASE-5572 "KeeperException.SessionExpiredException management could be improved in Master"; it was actually removed by HBASE-5549 "Master can fail if ZooKeeper session expires"). TestMasterZKSessionRecovery in 0.92 and 0.94 has an extra test that was not in trunk, the sporadically failing testRegionAssignmentAfterMasterRecoveryDueToZKExpiry. This was added by HBASE-6046 "Master retry on ZK session expiry causes inconsistent region assignments".
[jira] [Commented] (HBASE-6610) HFileLink: Hardlink alternative for snapshot restore
[ https://issues.apache.org/jira/browse/HBASE-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439775#comment-13439775 ] Matteo Bertozzi commented on HBASE-6610: {quote}TestStore#testDeleteExpiredStoreFiles failure seems to be related to the patch.{quote} Not sure if it is related, since TestStore doesn't follow any code path for the FileLink. It seems more like a flaky test that uses sleep to verify the result... but I'll take a deeper look at that. HFileLink: Hardlink alternative for snapshot restore Key: HBASE-6610 URL: https://issues.apache.org/jira/browse/HBASE-6610 Project: HBase Issue Type: Sub-task Components: io Affects Versions: 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Labels: snapshot Fix For: 0.96.0 Attachments: HBASE-6610-v1.patch To avoid copying data during restore snapshot we need to introduce an HFileLink that allows referencing a file that can be in the original path (/hbase/table/region/cf/hfile) or, if the file is archived, in the archive directory (/hbase/.archive/table/region/cf/hfile).
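The link-resolution rule in the description, prefer the live location and fall back to the archive, can be sketched without HDFS. The path shapes come straight from the JIRA text; the class name and the set-based existence check are stand-ins, not the actual HFileLink implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative resolution of an HFileLink-style reference: try the original
// location first, fall back to the archive directory. The Set stands in for a
// filesystem exists() check.
class LinkResolver {
    static Optional<String> resolve(String table, String region, String cf,
                                    String hfile, Set<String> existing) {
        List<String> candidates = Arrays.asList(
            "/hbase/" + table + "/" + region + "/" + cf + "/" + hfile,
            "/hbase/.archive/" + table + "/" + region + "/" + cf + "/" + hfile);
        // First candidate that exists wins; empty means the file is gone entirely.
        return candidates.stream().filter(existing::contains).findFirst();
    }
}
```

Resolving at read time, rather than copying, is what lets a restored snapshot reference a file that a later compaction may have moved into the archive.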
[jira] [Commented] (HBASE-6636) [0.92 UNIT TESTS] TestMasterZKSessionRecovery.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry failed in 0.92 #513 and #496
[ https://issues.apache.org/jira/browse/HBASE-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439781#comment-13439781 ] stack commented on HBASE-6636: -- The failing test is in TestZooKeeper in trunk, not in the removed TestMasterZKSessionRecovery.
[jira] [Commented] (HBASE-6586) Quarantine Corrupted HFiles
[ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439788#comment-13439788 ] Jonathan Hsieh commented on HBASE-6586: --- interesting -- something on my internal builds didn't run this test to report failure.