[jira] [Commented] (HBASE-13403) Make waitOnSafeMode configurable in MasterFileSystem
[ https://issues.apache.org/jira/browse/HBASE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481458#comment-14481458 ]

Hadoop QA commented on HBASE-13403:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12723360/0001-HBASE-13403-Make-waitOnSafeMode-configurable-in-Mast.patch
against master branch at commit 057499474c346b28ad5ac3ab7da420814eba547d.

ATTACHMENT ID: 12723360

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1, 2.5.2, 2.6.0).
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests.

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13581//testReport/
Findbugs warnings (version 2.0.3): https://builds.apache.org/job/PreCommit-HBASE-Build/13581//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13581//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13581//console

This message is automatically generated.

Make waitOnSafeMode configurable in MasterFileSystem
----------------------------------------------------

Key: HBASE-13403
URL: https://issues.apache.org/jira/browse/HBASE-13403
Project: HBase
Issue Type: Bug
Components: master
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor
Attachments: 0001-HBASE-13403-Make-waitOnSafeMode-configurable-in-Mast.patch, 0001-HBASE-13403-Make-waitOnSafeMode-configurable-in-Mast.patch

We currently wait for whatever is the configured value of hbase.server.thread.wakefrequency, or the default 10 seconds. We should have a dedicated configuration to control how long we wait until HDFS is no longer in safe mode, since tuning the existing hbase.server.thread.wakefrequency property for that purpose can have adverse side effects. My proposal is to add a new property called hbase.master.waitonsafemode and start with the current default.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
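The proposal above is small enough to sketch: poll until the filesystem leaves safe mode, using a dedicated wait interval instead of reusing the general-purpose hbase.server.thread.wakefrequency. The property name is the one proposed in this issue; the class and polling loop below are a hypothetical illustration, not the actual MasterFileSystem code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class SafeModeWait {
    // Sketch of the proposed behavior. In real code the interval would come
    // from the configuration, e.g. (assumed property name per this issue):
    //   conf.getLong("hbase.master.waitonsafemode",
    //       conf.getLong("hbase.server.thread.wakefrequency", 10 * 1000));
    static int waitOnSafeMode(BooleanSupplier inSafeMode, long waitMillis)
            throws InterruptedException {
        int polls = 0;
        // Keep checking until safe mode is exited; count polls for visibility.
        while (inSafeMode.getAsBoolean()) {
            polls++;
            TimeUnit.MILLISECONDS.sleep(waitMillis);
        }
        return polls;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a filesystem that reports safe mode for the first 3 checks.
        int[] checks = {0};
        int polls = waitOnSafeMode(() -> ++checks[0] <= 3, 1L);
        System.out.println("polled " + polls + " times"); // polled 3 times
    }
}
```

Decoupling the interval this way lets operators shorten the safe-mode poll without also speeding up every other chore that keys off the wake frequency.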
[jira] [Commented] (HBASE-13395) Remove HTableInterface
[ https://issues.apache.org/jira/browse/HBASE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481486#comment-14481486 ] Sean Busbey commented on HBASE-13395: - [~lars_francke] has a thread going on dev@ where they're starting to prune these off. The way I read the ref guide, deprecated in 1.0.0 would mean that it's deprecated for 1.y as one major release, which would mean removal in 2.0 is fine. This is what I told Lars F on the thread, so if we want to be more conservative someone should chime in. Remove HTableInterface -- Key: HBASE-13395 URL: https://issues.apache.org/jira/browse/HBASE-13395 Project: HBase Issue Type: Sub-task Components: API Affects Versions: 2.0.0 Reporter: Mikhail Antonov Fix For: 2.0.0 This class is marked as deprecated, probably can remove it, and if any methods from this specific class are in active use, need to decide what to do on callers' side. Should be able to replace with just Table interface usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13413) Create an integration test for Replication
Rajesh Nishtala created HBASE-13413: --- Summary: Create an integration test for Replication Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read data from the other. The test should be capable of running for a long time and remain resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481509#comment-14481509 ] stack commented on HBASE-13409: --- +1 Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481520#comment-14481520 ] Dima Spivak commented on HBASE-13413: - Hey Rajesh, I'd love to work on this with you. How are you envisioning an implementation with regard to two clusters? Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read data from the other. The test should be capable of running for a long time and remain resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481529#comment-14481529 ] Rajesh Nishtala commented on HBASE-13413: - I've put up an initial version of the test here: https://reviews.facebook.net/D36423 Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read data from the other. The test should be capable of running for a long time and remain resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Nishtala updated HBASE-13413: Attachment: HBASE-13413.patch Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413.patch We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read data from the other. The test should be capable of running for a long time and remain resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13391) TestRegionObserverInterface frequently failing on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481492#comment-14481492 ] Sean Busbey commented on HBASE-13391: - If there's still recovery happening then that would explain the "saw 0" failure. Does the test actually attempt to verify that replay is done? Is the failure log with distributed log replay off? I'll have to dig in some to figure out what would lead to seeing 3x the recovery; nothing obvious comes to mind. The part I don't understand is why we'd get different numbers for the non-legacy and the legacy version, given that they get called in the same loop. TestRegionObserverInterface frequently failing on branch-1 --- Key: HBASE-13391 URL: https://issues.apache.org/jira/browse/HBASE-13391 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: test.log.fail.txt, test.log.pass.txt TestRegionObserverInterface is frequently failing on branch-1. Example: {noformat} java.lang.AssertionError: Result of org.apache.hadoop.hbase.coprocessor.SimpleRegionObserver$Legacy.getCtPreWALRestore is expected to be 1, while we get 0 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.verifyMethodResult(TestRegionObserverInterface.java:751) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testLegacyRecovery(TestRegionObserverInterface.java:685) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13391) TestRegionObserverInterface frequently failing on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481543#comment-14481543 ] Sean Busbey commented on HBASE-13391: - {quote} bq. Is the failure log with distributed log replay off? I'll have to dig in some to figure out what would lead to seeing 3x the recovery. I was thinking that we could turn off distributed replay, to achieve the same aim as adding code to the test to wait for replay to finish. However then the test fails with got 3 expected 1. {quote} I think with DLR off it would just expand the window of the race condition, unless the test tries to scan to check that the region is available or something like that. Does the got 3 expect 1 thing happen on master as well? Sounds like we should break it off into a different jira. If the failure isn't pressing I'll take that new jira (barring a shift in severity, I don't expect to be able to look at it until next week). TestRegionObserverInterface frequently failing on branch-1 --- Key: HBASE-13391 URL: https://issues.apache.org/jira/browse/HBASE-13391 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: test.log.fail.txt, test.log.pass.txt TestRegionObserverInterface is frequently failing on branch-1 . Example: {noformat} java.lang.AssertionError: Result of org.apache.hadoop.hbase.coprocessor.SimpleRegionObserver$Legacy.getCtPreWALRestore is expected to be 1, while we get 0 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.verifyMethodResult(TestRegionObserverInterface.java:751) at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testLegacyRecovery(TestRegionObserverInterface.java:685) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481499#comment-14481499 ] stack commented on HBASE-13291: --- [~apurtell] The perf symbols get regenerated on each invocation. I'll run it after the rig has been running a while. HotSpot compilation should have settled by then. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second using 450% of a possible 1600% of CPU when 4 clients, each with 10 threads, scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of a possible 1600% but do 8k ops a second. Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it... Filing issue in meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481573#comment-14481573 ]

Hadoop QA commented on HBASE-13412:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12723391/HBASE-13412.patch
against master branch at commit 057499474c346b28ad5ac3ab7da420814eba547d.

ATTACHMENT ID: 12723391

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1, 2.5.2, 2.6.0).
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13583//testReport/
Findbugs warnings (version 2.0.3): https://builds.apache.org/job/PreCommit-HBASE-Build/13583//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13583//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13583//console

This message is automatically generated.

Region split decisions should have jitter
-----------------------------------------

Key: HBASE-13412
URL: https://issues.apache.org/jira/browse/HBASE-13412
Project: HBase
Issue Type: New Feature
Components: regionserver
Affects Versions: 1.0.0, 2.0.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Fix For: 2.0.0, 1.1.0
Attachments: HBASE-13412.patch

Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
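The "same as our compaction jitter" idea above can be sketched as perturbing the configured max file size by a small random factor, so equally loaded regions cross their split thresholds at different times. The class, method, and the 25% jitter ratio below are illustrative assumptions, not the patch's actual code.

```java
import java.util.Random;

public class SplitJitter {
    // Hypothetical sketch: return the configured split size perturbed by a
    // uniform random factor in [-jitterRatio, +jitterRatio), so a table whose
    // regions fill at the same rate does not split them all simultaneously.
    static long jitteredSplitSize(long desiredMaxFileSize, double jitterRatio, Random rng) {
        double jitter = (rng.nextDouble() - 0.5) * 2 * jitterRatio;
        return (long) (desiredMaxFileSize * (1.0 + jitter));
    }

    public static void main(String[] args) {
        long base = 10L * 1024 * 1024 * 1024; // e.g. a 10 GB configured max file size
        long size = jitteredSplitSize(base, 0.25, new Random());
        // The jittered threshold always stays within +/-25% of the configured size.
        System.out.println(size >= base * 3 / 4 && size <= base * 5 / 4); // true
    }
}
```

As with compaction jitter, the point is only to decorrelate the decisions; each region still splits near the configured size.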
[jira] [Updated] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13412: -- Status: Patch Available (was: Open) Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13397) Purge duplicate rpc request thread local
[ https://issues.apache.org/jira/browse/HBASE-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13397: --- Attachment: HBASE-13397-0.98.patch This cleanup applies to 0.98 also. It's a nice idea and I pushed it there too so we aren't wasting cycles maintaining an unneeded threadlocal and so we won't take a blind turn down RequestContext at some point given it's been removed from later versions. (Sometimes contributors start with a patch against 0.98.) The changes are to private interfaces except one change to RpcServer.Call, where RpcServer is a LimitedPrivate/Evolving interface with coprocessor and Phoenix audiences. Over in Phoenix there is only one place (a test, PhoenixIndexRpcSchedulerTest) that uses the RpcServer.Call constructor so I've left the old constructor in place with a deprecation tag and Javadoc indicating it shouldn't be used. Therefore there are no source or binary compatibility issues resulting from the change. The test is just mocking Calls so is not affected. 0.98 tests pass with the change applied. Purge duplicate rpc request thread local Key: HBASE-13397 URL: https://issues.apache.org/jira/browse/HBASE-13397 Project: HBase Issue Type: Bug Components: rpc Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 13397.txt, HBASE-13397-0.98.patch Serverside, in a few locations, code wants access to RPC context to get user and remote client address. A thread local makes it so this info is accessible anywhere on the processing chain. Turns out we have this mechanism twice (noticed by our Matteo). This patch purges one of the thread locals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
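As background for readers following along, the mechanism this issue de-duplicates works like a single thread-local carrying per-request context (user, remote address) down the call chain. The sketch below is an illustrative stand-in with hypothetical names, not HBase's actual RpcServer internals.

```java
import java.util.Optional;

public class RpcCallContext {
    // One thread-local per server thread: set when a request starts being
    // processed, readable anywhere on that thread's call chain, cleared when
    // the request finishes. This issue removes a second, redundant copy of
    // exactly this pattern.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static void setCurrentUser(String user) { CURRENT_USER.set(user); }

    static Optional<String> currentUser() {
        return Optional.ofNullable(CURRENT_USER.get());
    }

    // Always clear after the request to avoid leaking context into the next
    // request handled by this pooled thread.
    static void clear() { CURRENT_USER.remove(); }

    public static void main(String[] args) {
        setCurrentUser("alice");
        System.out.println(currentUser().orElse("none")); // alice
        clear();
        System.out.println(currentUser().orElse("none")); // none
    }
}
```

Maintaining two such thread-locals means two set/clear sites per request for the same information, which is the wasted work the patch removes.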
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: HBASE-13275-0.98.patch Here's a rebased 0.98 patch that will fix the compilation issue. bq. org.apache.hadoop.hbase.TestCheckTestClasses Any further failures of this are not related to my patch, see HBASE-13409 Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13412: -- Attachment: HBASE-13412.patch Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13397) Purge duplicate rpc request thread local
[ https://issues.apache.org/jira/browse/HBASE-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13397: --- Fix Version/s: 0.98.13 Purge duplicate rpc request thread local Key: HBASE-13397 URL: https://issues.apache.org/jira/browse/HBASE-13397 Project: HBase Issue Type: Bug Components: rpc Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 13397.txt, HBASE-13397-0.98.patch Serverside, in a few locations, code wants access to RPC context to get user and remote client address. A thread local makes it so this info is accessible anywhere on the processing chain. Turns out we have this mechanism twice (noticed by our Matteo). This patch purges one of the thread locals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13397) Purge duplicate rpc request thread local
[ https://issues.apache.org/jira/browse/HBASE-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481501#comment-14481501 ] stack commented on HBASE-13397: --- +1 [~apurtell] for 0.98 then... Purge duplicate rpc request thread local Key: HBASE-13397 URL: https://issues.apache.org/jira/browse/HBASE-13397 Project: HBase Issue Type: Bug Components: rpc Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 13397.txt, HBASE-13397-0.98.patch Serverside, in a few locations, code wants access to RPC context to get user and remote client address. A thread local makes it so this info is accessible anywhere on the processing chain. Turns out we have this mechanism twice (noticed by our Matteo). This patch purges one of the thread locals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481479#comment-14481479 ] Andrew Purtell commented on HBASE-13291: bq. interesting how different the picture perf top and flight recorder present... same basic cast of characters but attribution is wildly different Guessing: perf top isn't aware, like flight recorder is, of how HotSpot can de- and re-optimize methods. (It's a lovely hack though.) If you look in the perf map file in /tmp you might find several addresses for the same JIT-ed Java method. perf doesn't know the invocation count for the method should be aggregated from more than one emitted location, but I suspect FR does. Let me make a note to research whether this is true or not. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second using 450% of a possible 1600% of CPU when 4 clients, each with 10 threads, scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of a possible 1600% but do 8k ops a second. Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it... Filing issue in meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
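One way to spot-check the several-addresses-per-method guess above: perf reads JIT symbols from /tmp/perf-&lt;pid&gt;.map, one "startAddr size symbolName" entry per line, so methods that appear under more than one address can be counted directly. The pid in the path below is a placeholder for the JVM's actual pid.

```shell
# List JIT-ed methods that perf sees at more than one address. A method with
# multiple entries would have its samples split across locations, matching the
# attribution difference discussed above. Adjust the pid to your JVM process.
MAP=/tmp/perf-12345.map
cut -d' ' -f3- "$MAP" | sort | uniq -c | sort -rn | awk '$1 > 1 {print}'
```

If this prints nothing, every method maps to a single address and the de-/re-optimization theory would need another look.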
[jira] [Commented] (HBASE-13395) Remove HTableInterface
[ https://issues.apache.org/jira/browse/HBASE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481481#comment-14481481 ] Nick Dimiduk commented on HBASE-13395: -- I think our language in the [book|http://hbase.apache.org/book.html#hbase.versioning.post10] is such that we need to have something deprecated for an entire release. Meaning, if it was first marked deprecated in 1.0.0, it must be present for all 1.x releases and all 2.x releases. [~apurtell] and [~busbey] may have some interpretation on this as well. Remove HTableInterface -- Key: HBASE-13395 URL: https://issues.apache.org/jira/browse/HBASE-13395 Project: HBase Issue Type: Sub-task Components: API Affects Versions: 2.0.0 Reporter: Mikhail Antonov Fix For: 2.0.0 This class is marked as deprecated, so we can probably remove it; if any methods from this specific class are in active use, we need to decide what to do on the callers' side. Should be able to replace with just Table interface usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13376) Improvements to Stochastic load balancer
[ https://issues.apache.org/jira/browse/HBASE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vandana Ayyalasomayajula updated HBASE-13376: - Attachment: HBASE-13376_0.98.txt Renamed the patch file. [~stack] I will try to provide a patch for the master branch today. Sorry for the late replies, I was out of office last week. Improvements to Stochastic load balancer Key: HBASE-13376 URL: https://issues.apache.org/jira/browse/HBASE-13376 Project: HBase Issue Type: Improvement Components: Balancer Affects Versions: 1.0.0, 0.98.12 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Attachments: HBASE-13376_0.98.txt, HBASE-13376_98.patch There are two things this jira tries to address: 1. The locality picker in the stochastic balancer does not pick regions with least locality as candidates for swap/move. So when any user configures locality cost in the configs, the balancer does not always seems to move regions with bad locality. 2. When a cluster has equal number of loaded regions, it always picks the first one. It should pick a random region on one of the equally loaded servers. This improves a chance of finding a good candidate, when load picker is invoked several times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
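The second improvement described in this issue (pick a random region on one of the equally loaded servers rather than always the first) can be sketched as a max-scan that collects ties and chooses among them at random. The class name and data model below are illustrative, not the stochastic balancer's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class LoadPicker {
    // Sketch: find the most-loaded server, but when several servers tie for
    // the maximum, return a random one of them. Repeated invocations then
    // explore different candidates instead of hammering the first index.
    static int pickMostLoaded(int[] regionCounts, Random rng) {
        int max = Integer.MIN_VALUE;
        List<Integer> ties = new ArrayList<>();
        for (int i = 0; i < regionCounts.length; i++) {
            if (regionCounts[i] > max) {
                max = regionCounts[i];
                ties.clear();
                ties.add(i);
            } else if (regionCounts[i] == max) {
                ties.add(i);
            }
        }
        return ties.get(rng.nextInt(ties.size()));
    }

    public static void main(String[] args) {
        int[] load = {5, 9, 9, 3}; // servers 1 and 2 tie for most loaded
        int picked = pickMostLoaded(load, new Random());
        System.out.println(picked == 1 || picked == 2); // true
    }
}
```

Since the stochastic balancer calls its pickers many times per balancing run, randomizing the tie-break improves the chance of eventually proposing a good move.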
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481462#comment-14481462 ] Andrew Purtell commented on HBASE-13409: Any concerns with this patch? Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481531#comment-14481531 ] Rajesh Nishtala commented on HBASE-13413: - Hi Dima, Thanks for the support! I'd love to get your comments on the diff thats up. Right now its a simple extension of the IntegrationTestBigLinkedList. Thanks! Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor We want to have an end-to-end test for replication. it can write data into one cluster (with replication setup) and then read data from the other. The test should be capable of running for a long time and be reliant even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13373) Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
[ https://issues.apache.org/jira/browse/HBASE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-13373. --- Resolution: Fixed Fix Version/s: (was: 1.1.0) Ok. Let this patch only apply to 2.0 for now. The migration unit tests -- e.g. TestMetaMigrationConvertingToPB -- want us to be able to migrate versions of hfile that are pre-protobuf (minor version 3), which only happens in 0.98. Let's see how this goes. If having HFileReaderV2/V3 in branch-1 and not in master is a pain, I'll argue we should backport this. Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc. -- Key: HBASE-13373 URL: https://issues.apache.org/jira/browse/HBASE-13373 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 2.0.0 Attachments: 0001-HBASE-13373-Squash-HFileReaderV3-together-with-HFile.patch, 13373.txt, 13373.v3.txt, 13373.v3.txt, 13373.v5.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.wip.txt Profiling, I actually ran into a case complaining that a method could not be inlined because of MaxInlineLevel ("maximum number of nested calls that are inlined", intx, default 9), i.e. the method was more than 9 levels deep. The HFileReaderV? hierarchy with Abstracts is not needed anymore now that we are in the clear with V3 enabled since hbase 1.0.0; we can have just an Interface and an implementation. If we need to support a new hfile type, we can hopefully do it in a backward compatible way now that we have the Cell Interface, etc. Squashing all this stuff together actually makes it easier to figure out what is going on when reading the code. I can also get rid of a bunch of duplication too. Attached is a WIP. Doesn't fully compile yet but you get the idea. I'll keep on unless objection. Will try it against data written with the old classes as soon as I have something working. I don't believe we write classnames into our data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13391) TestRegionObserverInterface frequently failing on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481502#comment-14481502 ] Andrew Purtell commented on HBASE-13391: Just to clarify, there are two test failure cases I've discussed above:
# An intermittent failure where the timing of a bunch of asynchronous activity is slightly different, so recovery is still happening, and so we see "got 0, expected 1".
# If distributed replay is turned off, then we see "got 3, expected 1".
bq. If there's still recovery happening then that would explain the saw 0 failure.
That is my take.
bq. Does the test actually attempt to verify that replay is done?
It does not appear to, and I think this would fix the problem as reported in this issue's description.
bq. Is the failure log with distributed log replay off?
I'll have to dig in some to figure out what would lead to seeing 3x the recovery. I was thinking that we could turn off distributed replay to achieve the same aim as adding code to the test to wait for replay to finish. However, then the test fails with "got 3, expected 1". TestRegionObserverInterface frequently failing on branch-1 --- Key: HBASE-13391 URL: https://issues.apache.org/jira/browse/HBASE-13391 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: test.log.fail.txt, test.log.pass.txt TestRegionObserverInterface is frequently failing on branch-1.
Example:
{noformat}
java.lang.AssertionError: Result of org.apache.hadoop.hbase.coprocessor.SimpleRegionObserver$Legacy.getCtPreWALRestore is expected to be 1, while we get 0
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.verifyMethodResult(TestRegionObserverInterface.java:751)
    at org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface.testLegacyRecovery(TestRegionObserverInterface.java:685)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481507#comment-14481507 ] stack commented on HBASE-13412: --- Use ThreadLocalRandom rather than make your own? Having regions split at different sizes is going to confuse operators? Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
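For illustration, split jitter in the spirit of compaction jitter can be sketched as below, using ThreadLocalRandom as suggested in the comment. This is a standalone sketch only; the class, method, and constant names are made up here and are not taken from the HBASE-13412 patch:

```java
import java.util.concurrent.ThreadLocalRandom;

public class SplitJitterDemo {
    // Illustrative jitter fraction: the effective threshold lands within
    // +/-25% of the configured desired size.
    static final double JITTER = 0.25;

    // Effective split threshold for one region: the configured size plus a
    // uniformly random offset in [-jitter, +jitter) of that size, so regions
    // of a well distributed table do not all split at exactly the same time.
    static long jitteredSplitSize(long desiredMaxFileSize, double jitter) {
        double offset = ThreadLocalRandom.current().nextDouble(-jitter, jitter);
        return (long) (desiredMaxFileSize * (1.0 + offset));
    }

    public static void main(String[] args) {
        long configured = 10L * 1024 * 1024 * 1024; // 10 GB
        for (int i = 0; i < 3; i++) {
            // Each region sees a slightly different split threshold.
            System.out.println(jitteredSplitSize(configured, JITTER));
        }
    }
}
```

ThreadLocalRandom avoids both the contention of a shared Random instance and the need to hand-roll a generator, which is the point of the review comment.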
[jira] [Resolved] (HBASE-10709) HTableClientScanner with background thread
[ https://issues.apache.org/jira/browse/HBASE-10709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-10709. --- Resolution: Duplicate Closing, marking as duplicate of HBASE-13071 HTableClientScanner with background thread -- Key: HBASE-10709 URL: https://issues.apache.org/jira/browse/HBASE-10709 Project: HBase Issue Type: New Feature Components: Client, Scanners Affects Versions: 0.89-fb Reporter: @deprecated Yi Deng
1. Extract ResultScanner from HTable as HTableClientScanner
2. A background thread will fetch the results from the server while the main thread is consuming the data
3. Fix a bug in the current scanner that loses some data if the region server switches during scanning
4. Extract Result's Iterator as ResultScannerIterator
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
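The design in item 2 (a fetcher thread filling a bounded queue that the caller drains) can be sketched roughly like this. The class and names below are illustrative, not the HBASE-10709 code, and String stands in for a scan Result:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackgroundScanner implements Iterator<String> {
    private static final String POISON = "\0EOF";  // end-of-scan marker, compared by identity
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    private String next;

    BackgroundScanner(Iterator<String> serverResults) {
        Thread fetcher = new Thread(() -> {
            try {
                while (serverResults.hasNext()) {
                    queue.put(serverResults.next()); // blocks if the consumer lags
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.setDaemon(true);
        fetcher.start();
        advance();
    }

    private void advance() {
        try {
            next = queue.take();                    // blocks until the fetcher produces
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            next = POISON;
        }
    }

    @Override public boolean hasNext() { return next != POISON; }

    @Override public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String r = next;
        advance();
        return r;
    }

    public static void main(String[] args) {
        BackgroundScanner s = new BackgroundScanner(List.of("row1", "row2", "row3").iterator());
        while (s.hasNext()) System.out.println(s.next()); // prints row1, row2, row3
    }
}
```

The bounded queue gives natural back-pressure: the fetcher blocks when it runs ahead of the consumer, which is the overlap this issue was after.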
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481908#comment-14481908 ] Hudson commented on HBASE-13409: FAILURE: Integrated in HBase-1.1 #369 (See [https://builds.apache.org/job/HBase-1.1/369/]) HBASE-13409 Add categories to uncategorized tests (apurtell: rev 935352c4d819b54625fe567747bec9d31e4f3dd6) * hbase-client/src/test/java/org/apache/hadoop/hbase/filter/TestLongComparator.java * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Amend HBASE-13409 Add categories to uncategorized tests; fix compliation error (apurtell: rev 29827681533613854269282aa61256127c492c45) * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13391) TestRegionObserverInterface frequently failing on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481636#comment-14481636 ] Andrew Purtell commented on HBASE-13391: bq. Does the got 3 expect 1 thing happen on master as well? Haven't tried it. bq. Sounds like we should break it off into a different jira. That's fine. For this issue, the test is only sleeping (according to the comment above the sleep, in order to let the kill soak in) and then waiting for all regions to be reassigned, but not waiting for replay to finish. We should add a helper to HBaseTestingUtility for that if there isn't one already and use it in the test. TestRegionObserverInterface frequently failing on branch-1 --- Key: HBASE-13391 URL: https://issues.apache.org/jira/browse/HBASE-13391 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: test.log.fail.txt, test.log.pass.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
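A helper of the kind suggested (poll a condition until it holds or a timeout expires) could look roughly like the following. This is a hypothetical sketch, not HBaseTestingUtility code; the class and parameter names are made up, and the condition in main merely stands in for "WAL replay has finished":

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true in time.
    static boolean waitFor(long timeoutMs, long pollMs, BooleanSupplier condition) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for "WAL replay finished": becomes true after a few polls.
        int[] polls = {0};
        boolean done = waitFor(5000, 10, () -> ++polls[0] >= 3);
        System.out.println(done); // prints true
    }
}
```

A test would then call such a helper after killing the server, instead of sleeping a fixed interval, removing the race this issue describes.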
[jira] [Updated] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13413: -- Affects Version/s: 2.0.0 1.0.0 Status: Patch Available (was: Open) Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Affects Versions: 1.0.0, 2.0.0 Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413.patch We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read data from the other. The test should be capable of running for a long time and be resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481916#comment-14481916 ] Lars Hofhansl commented on HBASE-13389: --- We do need to revisit the 6 days, right? Would 3 days be enough? Lemme try to understand the cases:
# When we replay data due to recovery we want it to fall into the right place w.r.t. existing data. Why do we need more than the maximum time to roll a log (1h)?
# Replication... Yeah, that's important. I'd say if you have a replication lag of more than a few hours you have a larger problem anyway.
# This too... Although I do not actually agree that this is an advantage. Mutations (including deletes) being idempotent in HBase is a feature and not a problem.
So with all this I don't see any reason to keep these for more than a few hours. It's very possible that I am missing something. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side-effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. {quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
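The shortcut those jiras added can be illustrated with a small sketch (names and structure are illustrative, not the actual HBase read path): when every cell in a file is known to be older than the smallest readpoint of all open scanners, the per-cell mvcc comparisons can be skipped entirely for that file.

```java
public class MvccSkipDemo {
    // Fast-path test: if the file's largest mvcc/sequenceid is at or below the
    // smallest readpoint, every cell in the file is visible to every scanner,
    // so no per-cell comparison is needed.
    static boolean canSkipMvccChecks(long fileMaxMvcc, long smallestReadPoint) {
        return fileMaxMvcc <= smallestReadPoint;
    }

    // Counts cells visible at the given readpoint, taking the fast path when possible.
    static long visibleCellCount(long[] cellMvccs, long smallestReadPoint, long fileMaxMvcc) {
        if (canSkipMvccChecks(fileMaxMvcc, smallestReadPoint)) {
            return cellMvccs.length;            // fast path: no per-cell comparisons
        }
        long n = 0;
        for (long mvcc : cellMvccs) {
            if (mvcc <= smallestReadPoint) n++; // slow path: compare each cell
        }
        return n;
    }

    public static void main(String[] args) {
        long[] cells = {1, 2, 5, 9};
        System.out.println(visibleCellCount(cells, 10, 9)); // fast path: prints 4
        System.out.println(visibleCellCount(cells, 4, 9));  // slow path: prints 2
    }
}
```

The regression described here is that the comparison (and the up-to-8-bytes-per-Cell storage) now always happens, i.e. the slow path is always taken.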
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481885#comment-14481885 ] Hadoop QA commented on HBASE-13275: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723399/HBASE-13275-0.98.patch against 0.98 branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723399 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 20 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 26 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13585//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13585//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13585//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13585//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13585//console This message is automatically generated. Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization, but simply setting hbase.security.authorization to false did not disable authorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481912#comment-14481912 ] Hudson commented on HBASE-13409: FAILURE: Integrated in HBase-1.0 #850 (See [https://builds.apache.org/job/HBase-1.0/850/]) HBASE-13409 Add categories to uncategorized tests (apurtell: rev 418e61ab0002bf8cd57d7643ff4f83759199534a) * hbase-client/src/test/java/org/apache/hadoop/hbase/filter/TestLongComparator.java * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10800) Use CellComparator instead of KVComparator
[ https://issues.apache.org/jira/browse/HBASE-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481654#comment-14481654 ] Hadoop QA commented on HBASE-10800: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723386/HBASE-10800_6.patch against master branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723386 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 150 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13582//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13582//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13582//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13582//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13582//console This message is automatically generated. Use CellComparator instead of KVComparator -- Key: HBASE-10800 URL: https://issues.apache.org/jira/browse/HBASE-10800 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 1.1.0 Attachments: HBASE-10800_1.patch, HBASE-10800_2.patch, HBASE-10800_3.patch, HBASE-10800_4.patch, HBASE-10800_4.patch, HBASE-10800_5.patch, HBASE-10800_6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12990) MetaScanner should be replaced by MetaTableAccessor
[ https://issues.apache.org/jira/browse/HBASE-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481653#comment-14481653 ] stack commented on HBASE-12990: --- [~octo47] Can we make progress on this now that unmanaged connections have been purged from master by [~mantonov]? MetaScanner should be replaced by MetaTableAccessor --- Key: HBASE-12990 URL: https://issues.apache.org/jira/browse/HBASE-12990 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 2.0.0, 1.1.0 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12990-branch-1.v1.patch, HBASE-12990.patch, HBASE-12990.v2.patch, HBASE-12990.v3.patch, HBASE-12990.v4.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v6.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v8.patch MetaScanner and MetaTableAccessor do similar things, but it seems they tend to diverge. Let's have only one way to query META. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13391) TestRegionObserverInterface frequently failing on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481662#comment-14481662 ] Andrew Purtell commented on HBASE-13391: Also, although a bisect implicated HBASE-12975, this could also be related to HBASE-12972. I will audit both diffs for branch-1 and master for any changes that touch replay or recovery state. TestRegionObserverInterface frequently failing on branch-1 --- Key: HBASE-13391 URL: https://issues.apache.org/jira/browse/HBASE-13391 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: test.log.fail.txt, test.log.pass.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-7868) HFile performance regression between 0.92 and 0.94
[ https://issues.apache.org/jira/browse/HBASE-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-7868. -- Resolution: Won't Fix Stale HFile performance regression between 0.92 and 0.94 -- Key: HBASE-7868 URL: https://issues.apache.org/jira/browse/HBASE-7868 Project: HBase Issue Type: Bug Components: io Affects Versions: 0.94.5 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: FilteredScan.png, HFilePerformanceEvaluation.txt, hfileperf-graphs.png, performances.pdf, performances.pdf By HFilePerformanceEvaluation it seems that 0.94 is slower than 0.92. Looking at the profiler for the Scan path, it seems that most of the time, compared to 0.92, is spent in the metrics dictionary lookup. [~eclark] pointed out the new per family/block metrics. By commenting out the metrics call in HFileReaderV2, the performance seems to get better, but maybe metrics is not the only problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12990) MetaScanner should be replaced by MetaTableAccessor
[ https://issues.apache.org/jira/browse/HBASE-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481660#comment-14481660 ] Andrey Stepachev commented on HBASE-12990: -- [~stack] this jira is for branch-1; MetaTableAccessor was already committed to master (there was no problem with managed connections).
{code}
commit 948746ce4ed3bd174927c41bd4884cad70d693ef
Author: Andrey Stepachev oct...@gmail.com
Date:   Mon Mar 9 10:39:59 2015 +

    HBASE-12990 MetaScanner should be replaced by MetaTableAccessor
{code}
Will try to fix those hanging issues upon return from my trip; sorry for the long delay there. MetaScanner should be replaced by MetaTableAccessor --- Key: HBASE-12990 URL: https://issues.apache.org/jira/browse/HBASE-12990 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 2.0.0, 1.1.0 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12990-branch-1.v1.patch, HBASE-12990.patch, HBASE-12990.v2.patch, HBASE-12990.v3.patch, HBASE-12990.v4.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v6.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v8.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-5194) Backport HBASE-4465 (Lazy-seek optimization for StoreFile scanners) to 0.92
[ https://issues.apache.org/jira/browse/HBASE-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5194. -- Resolution: Won't Fix Stale Backport HBASE-4465 (Lazy-seek optimization for StoreFile scanners) to 0.92 --- Key: HBASE-5194 URL: https://issues.apache.org/jira/browse/HBASE-5194 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: 4465-92.txt Lazy-seek optimization for StoreFile scanners is important feature This JIRA backports the feature to 0.92 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13297) 0.98 and 1.0: Remove client side result size calculation
[ https://issues.apache.org/jira/browse/HBASE-13297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-13297: -- Resolution: Won't Fix Fix Version/s: (was: 1.0.2) (was: 0.98.13) Assignee: (was: Lars Hofhansl) Status: Resolved (was: Patch Available) 0.98 and 1.0: Remove client side result size calculation Key: HBASE-13297 URL: https://issues.apache.org/jira/browse/HBASE-13297 Project: HBase Issue Type: Sub-task Components: Client Reporter: Lars Hofhansl Attachments: 13297-0.98.txt, 13297-v2-0.98.txt As described in parent, this can lead to missed rows when the client and server calculate different size values. The patch here proposes a backwards compatible patch for 0.98 and 1.0.x. Parent will do a patch for 1.1 and 2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13373) Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
[ https://issues.apache.org/jira/browse/HBASE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481602#comment-14481602 ] stack commented on HBASE-13373: --- Here is the exception I get when the patch is applied to branch-1:
{code}
... 3 more
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://localhost:60932/user/stack/test-data/7a7e1f35-8b1c-4f74-bc51-ca5d8697ca53/data/hbase/meta/1588230740/info/a75a20537bba41fd8d602917a08ca937
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:496)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:525)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1042)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:251)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:374)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:471)
    at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:660)
    at org.apache.hadoop.hbase.regionserver.HStore.access$8(HStore.java:655)
    at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:530)
    at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:1)
    ... 6 more
Caused by: java.lang.IllegalArgumentException: Invalid HFile version: major=2, minor=0: expected at least major=2 and minor=3
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.checkFileVersion(HFileReaderImpl.java:295)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.init(HFileReaderImpl.java:183)
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:486)
{code}
Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
-- Key: HBASE-13373 URL: https://issues.apache.org/jira/browse/HBASE-13373 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 2.0.0 Attachments: 0001-HBASE-13373-Squash-HFileReaderV3-together-with-HFile.patch, 13373.txt, 13373.v3.txt, 13373.v3.txt, 13373.v5.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.wip.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13409: --- Resolution: Fixed Fix Version/s: 1.0.2 0.98.13 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the review [~stack]. I pushed this to 0.98 and up. Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12990) MetaScanner should be replaced by MetaTableAccessor
[ https://issues.apache.org/jira/browse/HBASE-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481695#comment-14481695 ] Andrey Stepachev commented on HBASE-12990: -- thanks [~stack], I thought :) that it would be an addendum, but there were many more differences than seemed at first glance. MetaScanner should be replaced by MetaTableAccessor --- Key: HBASE-12990 URL: https://issues.apache.org/jira/browse/HBASE-12990 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 2.0.0, 1.1.0 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12990-branch-1.v1.patch, HBASE-12990.patch, HBASE-12990.v2.patch, HBASE-12990.v3.patch, HBASE-12990.v4.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v6.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v8.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481839#comment-14481839 ] Mikhail Antonov commented on HBASE-13375: - Got busy with other stuff last days.. Should be able to get to it today or tomorrow. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13376) Improvements to Stochastic load balancer
[ https://issues.apache.org/jira/browse/HBASE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481844#comment-14481844 ] Hadoop QA commented on HBASE-13376: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723393/HBASE-13376_0.98.txt against 0.98 branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723393 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 26 warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3846 checkstyle errors (more than the master's current 3840 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13584//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13584//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13584//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13584//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13584//console This message is automatically generated. Improvements to Stochastic load balancer Key: HBASE-13376 URL: https://issues.apache.org/jira/browse/HBASE-13376 Project: HBase Issue Type: Improvement Components: Balancer Affects Versions: 1.0.0, 0.98.12 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Attachments: HBASE-13376_0.98.txt, HBASE-13376_98.patch There are two things this jira tries to address: 1. The locality picker in the stochastic balancer does not pick regions with the least locality as candidates for swap/move. So when a user configures locality cost in the configs, the balancer does not always seem to move regions with bad locality. 2. When a cluster has servers with an equal number of regions, it always picks the first one. It should pick a random region on one of the equally loaded servers. This improves the chance of finding a good candidate when the load picker is invoked several times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
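Point 2 above (random tie-breaking among equally loaded servers) can be illustrated with a self-contained sketch; this is not the StochasticLoadBalancer code, and the names are made up:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrates HBASE-13376 point 2: when several servers are tied at the
// highest load, pick one of them at random instead of always the first,
// so repeated invocations of the load picker explore more candidates.
public class RandomLoadedPick {
    static int pickMostLoaded(List<Integer> loads, Random rng) {
        int max = Collections.max(loads);
        // Collect the indexes of every server tied at the maximum load...
        List<Integer> ties = new ArrayList<>();
        for (int i = 0; i < loads.size(); i++) {
            if (loads.get(i) == max) ties.add(i);
        }
        // ...and break the tie randomly rather than returning ties.get(0).
        return ties.get(rng.nextInt(ties.size()));
    }

    public static void main(String[] args) {
        List<Integer> loads = List.of(5, 9, 9, 3, 9);
        int idx = pickMostLoaded(loads, new Random());
        System.out.println(idx); // one of 1, 2, 4
    }
}
```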
[jira] [Resolved] (HBASE-6726) Port HBASE-4465 'Lazy-seek optimization for StoreFile scanners' to 0.92
[ https://issues.apache.org/jira/browse/HBASE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6726. -- Resolution: Won't Fix Stale Port HBASE-4465 'Lazy-seek optimization for StoreFile scanners' to 0.92 --- Key: HBASE-6726 URL: https://issues.apache.org/jira/browse/HBASE-6726 Project: HBase Issue Type: Task Reporter: Ted Yu This is expected to significantly reduce the amount of disk IO -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12990) MetaScanner should be replaced by MetaTableAccessor
[ https://issues.apache.org/jira/browse/HBASE-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12990: -- Resolution: Fixed Status: Resolved (was: Patch Available) My fault. Let me just resolve this then [~octo47] (usually addendums go in within a day or so... ). Let's open a new issue for backport (could also punt on backport given 1.1 is on its way). No hurry. Enjoy your trip. MetaScanner should be replaced by MetaTableAccessor --- Key: HBASE-12990 URL: https://issues.apache.org/jira/browse/HBASE-12990 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 2.0.0, 1.1.0 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12990-branch-1.v1.patch, HBASE-12990.patch, HBASE-12990.v2.patch, HBASE-12990.v3.patch, HBASE-12990.v4.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v5.patch, HBASE-12990.v6.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v7.patch, HBASE-12990.v8.patch MetaScanner and MetaTableAccessor do similar things, but it seems they tend to diverge. Let's have only one way to query META. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481820#comment-14481820 ] Hudson commented on HBASE-13409: FAILURE: Integrated in HBase-TRUNK #6349 (See [https://builds.apache.org/job/HBase-TRUNK/6349/]) HBASE-13409 Add categories to uncategorized tests (apurtell: rev 8c707499baa2a5472680f98b19356bc1e104bcd8) * hbase-client/src/test/java/org/apache/hadoop/hbase/filter/TestLongComparator.java * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13395) Remove HTableInterface
[ https://issues.apache.org/jira/browse/HBASE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481836#comment-14481836 ] Mikhail Antonov commented on HBASE-13395: - It seems I've read it the same way as [~busbey] - bq. MAJOR version when you make incompatible API changes, So I assumed that in current master we can do backward-incompatible changes in general, and removal of old APIs is fine. Let me look at the mail thread again, maybe I missed something. Remove HTableInterface -- Key: HBASE-13395 URL: https://issues.apache.org/jira/browse/HBASE-13395 Project: HBase Issue Type: Sub-task Components: API Affects Versions: 2.0.0 Reporter: Mikhail Antonov Fix For: 2.0.0 This class is marked as deprecated; we can probably remove it, and if any methods from this specific class are in active use, we need to decide what to do on the callers' side. We should be able to replace it with just Table interface usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13414) Test code uses old Netty internal JVM detection
Sean Busbey created HBASE-13414: --- Summary: Test code uses old Netty internal JVM detection Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 1.0.1 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 1.10, 2.0.0, 1.0.2 while doing some work for HADOOP-11804 I realized that TestHCM kept using an internal org.jboss.netty method post HBASE-10573. It's been working because the exclusion on netty for Hadoop doesn't cover the org.jboss version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
Enis Soztutar created HBASE-13415: - Summary: Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. With a 1.1 client, or if there is no contention for the table lock, the time window is pretty small, but it still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example, in the case of create table, the second one will throw TableExistsException). One idea is to use client-generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
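The nonce-cache behavior described in this issue can be sketched as follows; the class and method names are made up for illustration, and the real design would persist the nonce alongside the procedure in the procedure store:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the HBASE-13415 idea: a resubmitted request carrying the same
// client-generated nonce gets back the procId of the original submission
// instead of spawning a duplicate procedure. Not the ProcedureStore API.
public class NonceCacheSketch {
    private final Map<Long, Long> nonceToProcId = new HashMap<>();
    private long nextProcId = 1;

    synchronized long submit(long nonce) {
        Long existing = nonceToProcId.get(nonce);
        if (existing != null) {
            return existing; // double submit: return the original procId
        }
        long procId = nextProcId++;
        // In the real design this mapping would be written to the store
        // together with the procedure, so it survives a master failover.
        nonceToProcId.put(nonce, procId);
        return procId;
    }

    public static void main(String[] args) {
        NonceCacheSketch cache = new NonceCacheSketch();
        long first = cache.submit(0xCAFEL);
        long retry = cache.submit(0xCAFEL); // client retried after a failure
        System.out.println(first == retry); // true: no duplicate procedure
    }
}
```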
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482210#comment-14482210 ] Mikhail Antonov commented on HBASE-13103: - [~ndimiduk] any thoughts on the patch? :) [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Attachments: HBASE-13103-v0.patch Often enough, folks misjudge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482262#comment-14482262 ] Lars Hofhansl commented on HBASE-13389: --- Yeah, not related to log roll, sorry. I meant the max time before we force a memstore flush (1 hour by default)... HBASE-5930. I still have not heard a convincing reason why the time to keep the mvcc stuff around needs to be greater than an hour or two :) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. {quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
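The essence of the undone optimization (HBASE-9751 and friends) is a file-level shortcut: if a store file's largest sequenceid is at or below the smallest readpoint of any open scanner, per-cell mvcc comparisons can be skipped entirely. A standalone sketch with illustrative names, not the HBase scanner code:

```java
// Sketch of the read-path shortcut discussed in HBASE-13389: when every
// cell in a file is guaranteed visible to every open scanner, the mvcc
// need not be decoded or compared per cell at all.
public class MvccSkipSketch {
    static boolean canSkipMvccChecks(long fileMaxSeqId, long smallestReadPoint) {
        // All cells in the file were committed before the oldest scanner
        // started, so every cell is visible: skip the per-cell checks.
        return fileMaxSeqId <= smallestReadPoint;
    }

    public static void main(String[] args) {
        System.out.println(canSkipMvccChecks(90, 100));  // true: skip checks
        System.out.println(canSkipMvccChecks(120, 100)); // false: compare per cell
    }
}
```

The regression is that after HBASE-12600 the per-cell mvcc is always stored and parsed, even in the common case where this shortcut would apply.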
[jira] [Updated] (HBASE-13414) TestHCM no longer needs to test for JRE 6.
[ https://issues.apache.org/jira/browse/HBASE-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13414: Summary: TestHCM no longer needs to test for JRE 6. (was: Test code uses old Netty internal JVM detection) TestHCM no longer needs to test for JRE 6. -- Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 1.0.1 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 1.0.2, 1.10 while doing some work for HADOOP-11804 I realized that TestHCM kept using an internal org.jboss.netty method post HBASE-10573. It's been working because the exclusion on netty for Hadoop doesn't cover the org.jboss version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481969#comment-14481969 ] Lars Hofhansl commented on HBASE-13362: --- Turns out, trunk already does this, mostly. set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl With the recent problems we've been seeing client/server result size mismatch, I was thinking: Why was this not a problem with scanner caching? There are two reasons: # number of rows is easy to calculate (and we did it correctly) # caching is only controlled from the client, never set on the server alone We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following: * default the client sent max result size to 2mb * remove any server only result sizing * continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway). Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482057#comment-14482057 ] Hadoop QA commented on HBASE-13413: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723403/HBASE-13413.patch against master branch at commit 8c707499baa2a5472680f98b19356bc1e104bcd8. ATTACHMENT ID: 12723403 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the master's current 0 warnings). {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + * This creates a new ClusterID wrapper that will automatically build connections and configurations + * The main runner loop for the test. It uses the {@link org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList} + * This tears down any tables that existed from before and rebuilds the tables and schemas on the source cluster. 
+ * It then sets up replication from the source to the sink cluster by using the {@link org.apache.hadoop.hbase.client.replication.ReplicationAdmin} + for (HRegionLocation rl : cluster.getConnection().getRegionLocator(tableName).getAllRegionLocations()) { + * Run the {@link org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.Generator} in the source cluster + * Run the {@link org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.Verify} in the sink cluster. + * If replication is working properly the data written at the source cluster should be available in the sink cluster +addRequiredOptWithArg("s", SOURCE_CLUSTER_OPT, "Cluster ID of the source cluster (e.g. localhost:2181:/hbase)"); +addRequiredOptWithArg("r", DEST_CLUSTER_OPT, "Cluster ID of the sink cluster (e.g. localhost:2182:/hbase)"); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13586//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13586//artifact/patchprocess/patchReleaseAuditWarnings.txt Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13586//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13586//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13586//console This message is automatically generated. Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Affects Versions: 1.0.0, 2.0.0 Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413.patch We want to have an end-to-end test for replication.
It can write data into one cluster (with replication set up) and then read the data from the other. The test should be capable of running for a long time and be resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13414) TestHCM no longer needs to test for JRE 6.
[ https://issues.apache.org/jira/browse/HBASE-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482056#comment-14482056 ] stack commented on HBASE-13414: --- +1 TestHCM no longer needs to test for JRE 6. -- Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 1.0.1 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 1.0.2, 1.10 Attachments: HBASE-13414.1.patch.txt while doing some work for HADOOP-11804 I realized that TestHCM kept using an internal org.jboss.netty method post HBASE-10573. It's been working because the exclusion on netty for Hadoop doesn't cover the org.jboss version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13411) Misleading error message when request size quota limit exceeds
[ https://issues.apache.org/jira/browse/HBASE-13411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13411: Resolution: Fixed Status: Resolved (was: Patch Available) Misleading error message when request size quota limit exceeds -- Key: HBASE-13411 URL: https://issues.apache.org/jira/browse/HBASE-13411 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Labels: quota Fix For: 2.0.0 Attachments: HBASE-13411.patch User will get the same error message when either number of requests exceeds or request size exceeds. So its better we differentiate them. Thanks to [~mbertozzi] for confirming the same offline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-13362: -- Attachment: 13362-0.98.txt 0.98 patch. Since the previous default was Long.MAX_VALUE, this is a backwards-compatible patch: the client now enforces a default limit, without forcing the same on the server. set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 13362-0.98.txt, 13362-master.txt With the recent problems we've been seeing client/server result size mismatch, I was thinking: Why was this not a problem with scanner caching? There are two reasons: # number of rows is easy to calculate (and we did it correctly) # caching is only controlled from the client, never set on the server alone We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following: * default the client sent max result size to 2mb * remove any server only result sizing * continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway). Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482070#comment-14482070 ] Jonathan Lawlor commented on HBASE-13362: - +1, looks good to me. Question, should we also add an entry for this new configuration to hbase-default.xml? I'm just thinking, as a user, how would I know about this new configuration value and the semantics behind it? set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 13362-0.98.txt, 13362-master.txt With the recent problems we've been seeing client/server result size mismatch, I was thinking: Why was this not a problem with scanner caching? There are two reasons: # number of rows is easy to calculate (and we did it correctly) # caching is only controlled from the client, never set on the server alone We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following: * default the client sent max result size to 2mb * remove any server only result sizing * continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway). Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
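If such an entry were added, it might look like the fragment below; the 2mb value and the wording come from this discussion and are not a shipped hbase-default.xml entry:

```xml
<!-- Hypothetical hbase-default.xml entry for the client-enforced limit;
     the value is the 2mb client default proposed in this thread. -->
<property>
  <name>hbase.client.scanner.max.result.size</name>
  <value>2097152</value>
  <description>Maximum number of bytes a client accumulates per scan RPC,
    enforced on the client side only, as the property name implies.
    2097152 bytes = 2MB.</description>
</property>
```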
[jira] [Created] (HBASE-13416) Deleted/Recreated tables causes replication of old WALS to get replicated incorrectly
Rajesh Nishtala created HBASE-13416: --- Summary: Deleted/Recreated tables causes replication of old WALS to get replicated incorrectly Key: HBASE-13416 URL: https://issues.apache.org/jira/browse/HBASE-13416 Project: HBase Issue Type: Bug Components: Replication Reporter: Rajesh Nishtala 1) Create a table and setup replication to another cluster 2) Write some data into the source table 3) Disable and delete the table from the source cluster and the sink cluster 4) Recreate the table with the same schema in the source and sink clusters 5) The source cluster is empty but the sink cluster has a copy of the old data that is not in the source cluster. To work around: 1) disable the table in the source cluster 2) Roll the WALs across all region servers 3) Delete the table in the source cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Nishtala updated HBASE-13413: Attachment: HBASE-13413-v1.patch next rev Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Affects Versions: 1.0.0, 2.0.0 Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413-v1.patch, HBASE-13413.patch We want to have an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read the data from the other. The test should be capable of running for a long time and be resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481950#comment-14481950 ] Lars Hofhansl commented on HBASE-13291: --- SQM.match() was changed in trunk to be expressed in terms of operations on Cells (before it was hand-optimized to deconstruct KVs in the most efficient way). I do not know how much that costs, but it might be worth investigating. So far our introduction of Cells has made a bunch of things slower, but since we haven't finished I am not aware of a single perf advantage (when finished we should be able to make block encoding and prefix tries much faster). Maybe it's time to make a concerted effort and get rid of KeyValue.getKey() and KeyValue.getBuffer() for real; it's those two methods that prevent a lot of cool optimizations. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second using 450% of a possible 1600% of CPUs when 4 clients each with 10 threads each scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of a possible 1600% but we do 8k ops a second. Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it... Filing issue in meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-13362: -- Attachment: 13362-master.txt extra trivial patch * a client will rarely request a chunk of 100mb or more even with 10ge networks (about 100ms transmission time with 10ge) * the client's default limit remains 2mb, which is nice for 1ge and acceptable for 10ge networks (15ms and 1.5ms transmission time, resp., both way larger than intra-DC latency) I'll make a 0.98 and 1.0 patch as well, to make that part of the logic the same. set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 13362-master.txt With the recent problems we've been seeing client/server result size mismatch, I was thinking: Why was this not a problem with scanner caching? There are two reasons: # number of rows is easy to calculate (and we did it correctly) # caching is only controlled from the client, never set on the server alone We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following: * default the client sent max result size to 2mb * remove any server only result sizing * continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway). Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
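Client-only enforcement, as proposed here, amounts to the client accumulating results until its own configured byte limit is reached; a standalone sketch with made-up names, not the ClientScanner code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the HBASE-13362 proposal: the client stops accumulating scan
// results once its own hbase.client.scanner.max.result.size is reached,
// with no server-side cap needed. Illustrative only.
public class ClientSizeLimitSketch {
    // The 2mb client default proposed in this thread.
    static final long MAX_RESULT_SIZE = 2L * 1024 * 1024;

    static List<byte[]> fetch(List<byte[]> serverRows) {
        List<byte[]> batch = new ArrayList<>();
        long accumulated = 0;
        for (byte[] row : serverRows) {
            batch.add(row);
            accumulated += row.length;
            if (accumulated >= MAX_RESULT_SIZE) {
                break; // limit enforced by the client, as the property name implies
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        List<byte[]> rows = new ArrayList<>();
        for (int i = 0; i < 5; i++) rows.add(new byte[1024 * 1024]); // five 1mb rows
        System.out.println(fetch(rows).size()); // 2: stops once 2mb accumulated
    }
}
```

Because the old server-side default was Long.MAX_VALUE, a client-only limit like this stays backwards compatible: old servers simply never cut a batch short themselves.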
[jira] [Updated] (HBASE-13414) TestHCM no longer needs to test for JRE 6.
[ https://issues.apache.org/jira/browse/HBASE-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13414: -- Fix Version/s: (was: 1.10) 1.1.0 TestHCM no longer needs to test for JRE 6. -- Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 1.0.1 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13414.1.patch.txt while doing some work for HADOOP-11804 I realized that TestHCM kept using an internal org.jboss.netty method post HBASE-10573. It's been working because the exclusion on netty for Hadoop doesn't cover the org.jboss version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13397) Purge duplicate rpc request thread local
[ https://issues.apache.org/jira/browse/HBASE-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481939#comment-14481939 ] Hudson commented on HBASE-13397: SUCCESS: Integrated in HBase-0.98 #937 (See [https://builds.apache.org/job/HBase-0.98/937/]) HBASE-13397 Purge duplicate rpc request thread local (apurtell: rev c780de0e4f93f6fe3f20efa402ced7f06bccb584) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestCallRunner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/token/TokenProvider.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestTokenAuthentication.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RequestContext.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/CallRunner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityUtils.java Purge duplicate rpc request thread local Key: HBASE-13397 URL: https://issues.apache.org/jira/browse/HBASE-13397 Project: HBase Issue Type: Bug Components: rpc Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 13397.txt, HBASE-13397-0.98.patch Serverside, in a few locations, code wants access to RPC context to get user and remote client address. A thread local makes it so this info is accessible anywhere on the processing chain. 
Turns out we have this mechanism twice (noticed by our Matteo). This patch purges one of the thread locals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481965#comment-14481965 ] Lars Hofhansl commented on HBASE-13362: --- Lemme make a trivial patch - seems to be my lot in life. :) set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl With the recent problems we've been seeing client/server result size mismatch, I was thinking: Why was this not a problem with scanner caching? There are two reasons: # number of rows is easy to calculate (and we did it correctly) # caching is only controlled from the client, never set on the server alone We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following: * default the client sent max result size to 2mb * remove any server only result sizing * continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway). Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13414) TestHCM no longer needs to test for JRE 6.
[ https://issues.apache.org/jira/browse/HBASE-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481967#comment-14481967 ] Sean Busbey commented on HBASE-13414: - updated subject. since it looks like this change is in hbase 1.0+ only and those versions don't support java 6, I'm just ripping it out. TestHCM no longer needs to test for JRE 6. -- Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0, 1.0.1 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 1.0.2, 1.10 while doing some work for HADOOP-11804 I realized that TestHCM kept using an internal org.jboss.netty method post HBASE-10573. It's been working because the exclusion on netty for Hadoop doesn't cover the org.jboss version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482254#comment-14482254 ] Lars Hofhansl commented on HBASE-13362: --- I'd be a bit hesitant to encode Long.MAX_VALUE in hbase-default.xml (for 0.98). In trunk we could, since it's set to 100mb there. set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Attachments: 13362-0.98.txt, 13362-master.txt
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481955#comment-14481955 ] Dima Spivak commented on HBASE-13413: - Would you mind putting the patch up on Review Board, [~rajesh0042]? I'll add comments. Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Affects Versions: 1.0.0, 2.0.0 Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413.patch We want an end-to-end test for replication. It can write data into one cluster (with replication set up) and then read the data from the other. The test should be capable of running for a long time and be resilient even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13414) TestHCM no longer needs to test for JRE 6.
[ https://issues.apache.org/jira/browse/HBASE-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13414: Attachment: HBASE-13414.1.patch.txt Ran through {{mvn -Dtest=TestHCM}} and it passed given: {code} $ mvn --version Apache Maven 3.0.3 (r1075438; 2011-02-28 11:31:09-0600) Maven home: /usr/share/maven Java version: 1.7.0_51, vendor: Oracle Corporation Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre Default locale: en_US, platform encoding: UTF-8 OS name: mac os x, version: 10.8.5, arch: x86_64, family: mac {code} TestHCM no longer needs to test for JRE 6. -- Key: HBASE-13414 URL: https://issues.apache.org/jira/browse/HBASE-13414 Attachments: HBASE-13414.1.patch.txt
[jira] [Commented] (HBASE-13397) Purge duplicate rpc request thread local
[ https://issues.apache.org/jira/browse/HBASE-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482153#comment-14482153 ] Hudson commented on HBASE-13397: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #890 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/890/]) HBASE-13397 Purge duplicate rpc request thread local (apurtell: rev c780de0e4f93f6fe3f20efa402ced7f06bccb584) * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/CreateTableHandler.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/CallRunner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestTokenAuthentication.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityUtils.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RequestContext.java * hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestCallRunner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/token/TokenProvider.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java Purge duplicate rpc request thread local Key: HBASE-13397 URL: https://issues.apache.org/jira/browse/HBASE-13397 Project: HBase Issue Type: Bug Components: rpc Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 13397.txt, HBASE-13397-0.98.patch Serverside, in a few locations, code wants access to RPC context to get user and remote client address. A thread local makes it so this info is accessible anywhere on the processing chain. 
Turns out we have this mechanism twice (noticed by our Matteo). This patch purges one of the thread locals. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
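The pattern the issue describes — a server-side thread local carrying the rpc caller's user and remote address down the processing chain — can be sketched standalone as below. Names are illustrative stand-ins, not HBase's actual RpcServer/RequestContext API; the bug here was simply that two copies of this mechanism existed and one was purged.

```java
// Sketch of a per-request ThreadLocal: the rpc layer installs the caller's
// context on the handler thread, and any code further down the call chain
// can read it without explicit plumbing. Illustrative names only.
public class RequestContextSketch {
    public static final class Ctx {
        public final String user, remoteAddress;
        public Ctx(String user, String remoteAddress) {
            this.user = user; this.remoteAddress = remoteAddress;
        }
    }

    private static final ThreadLocal<Ctx> CURRENT = new ThreadLocal<>();

    // Called by the rpc layer when a call starts/ends on this thread.
    public static void enter(Ctx ctx) { CURRENT.set(ctx); }
    public static void exit() { CURRENT.remove(); }

    // Callable from anywhere on the handler thread's processing chain.
    public static Ctx get() { return CURRENT.get(); }

    public static void main(String[] args) {
        enter(new Ctx("alice", "10.0.0.7"));
        try {
            // a deeply nested callee sees the caller's context
            System.out.println(get().user);
        } finally {
            exit(); // always clear, or pooled handler threads leak stale context
        }
    }
}
```

Clearing the thread local in a finally block matters because rpc handler threads are pooled; a stale context would leak one caller's identity into the next request.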
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482163#comment-14482163 ] Enis Soztutar commented on HBASE-13389: --- bq. when we replay data due to recovery we want it to fall into the right place w.r.t to existing data. Why do we need more than the maximum time to roll a log (1h)? I think the min time to keep is the max time an edit can live in the memstore without being flushed. This is not related to log roll (since we still replay edits from a previous log roll) but to how much further back an edit can be replayed through dist log replay, I think. Case 3, as Jeff puts it, is an issue with the comparison order. We compare entries in {{row column - ts - type - seqId}} order; however, we should compare entries in {{row column - ts - seqId - type}} order so that Put, Delete, Put with the same TS works. If we do better resolution for ts's, this is not needed though. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side-effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. {quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
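The comparison-order question can be illustrated with a small standalone sketch. This is not HBase's actual CellComparator: the type codes Put=4 and Delete=8 follow KeyValue conventions, but everything else is a simplified stand-in for cells within one row/column. Sorting {{ts - seqId - type}} lets the edit with the newest sequence id win a same-timestamp Put, Delete, Put sequence, while {{ts - type - seqId}} always surfaces the Delete first:

```java
import java.util.Arrays;
import java.util.Comparator;

// Simplified stand-in for cells within a single row/column: only the
// fields relevant to the ordering debate (timestamp, type, sequence id).
public class EditOrder {
    public static final int PUT = 4, DELETE = 8; // KeyValue-style type codes

    public static class Edit {
        public final long ts;    // client-visible timestamp
        public final int type;   // PUT or DELETE
        public final long seqId; // mvcc/sequence id, i.e. write order
        public Edit(long ts, int type, long seqId) {
            this.ts = ts; this.type = type; this.seqId = seqId;
        }
    }

    // Current order: ts desc, then type desc (deletes before puts), then seqId desc.
    public static final Comparator<Edit> TS_TYPE_SEQID =
        Comparator.comparingLong((Edit e) -> e.ts).reversed()
            .thenComparing(Comparator.comparingInt((Edit e) -> e.type).reversed())
            .thenComparing(Comparator.comparingLong((Edit e) -> e.seqId).reversed());

    // Proposed order: ts desc, then seqId desc, then type desc.
    public static final Comparator<Edit> TS_SEQID_TYPE =
        Comparator.comparingLong((Edit e) -> e.ts).reversed()
            .thenComparing(Comparator.comparingLong((Edit e) -> e.seqId).reversed())
            .thenComparing(Comparator.comparingInt((Edit e) -> e.type).reversed());

    // The edit a scanner would encounter first under the given order.
    public static Edit first(Comparator<Edit> cmp, Edit... edits) {
        Edit[] copy = edits.clone();
        Arrays.sort(copy, cmp);
        return copy[0];
    }

    public static void main(String[] args) {
        Edit p1 = new Edit(100, PUT, 1), d = new Edit(100, DELETE, 2), p2 = new Edit(100, PUT, 3);
        // type-first: the Delete sorts ahead of the newer Put and masks it
        System.out.println(first(TS_TYPE_SEQID, p1, d, p2).type == DELETE);
        // seqId-first: the newest write (the second Put) wins
        System.out.println(first(TS_SEQID_TYPE, p1, d, p2).seqId == 3);
    }
}
```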
[jira] [Commented] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481941#comment-14481941 ] Lars Hofhansl commented on HBASE-13291: --- SQM.match(), StoreFileScanner.next()... My old nemeses :) Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium-sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second, using 450% of a possible 1600% of CPU, when 4 clients each with 10 threads scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of the possible 1600% but do 8k ops a second. Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it... Filing this issue in the meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14481916#comment-14481916 ] Lars Hofhansl edited comment on HBASE-13389 at 4/6/15 9:06 PM: --- We do need to revisit the 6 days, right? Would 3 days be enough? Lemme try to understand the cases: # when we replay data due to recovery we want it to fall into the right place w.r.t to existing data. Why do we need more than the maximum time to roll a log (1h)? # replication... Yeah, that's important. I'd say if you have a replication lag of more than a few hours you have a larger problem anyway. # This too... Although I do not actually agree that this is an advantage. Mutations (including deletes) being idempotent in HBase is a feature and not a problem. So with all this I do not see any reason to keep these for more than a few hours. It's very possible that I am missing something. was (Author: lhofhansl): We do need to revisit the 6 days, right? Would 3 days be enough? Lemme try to understand the cases: # when we replay data due to recovery we want it to fall into the right place w.r.t to existing data. Why do we need more then the maximum time to roll a log (1h)? # replication... Yeah, that's important. I'd say if you have a replication lag of more than a few hours you have a larger problem anyway. # This too... Although I do not actually agree that this is an advantage. Mutations (including deletes) being idempotent in HBase is a feature and not a problem. So with all this I do need any reason to keep these for more than a few hours. It's very possible that I am missing something. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Attachments: 13389.txt
[jira] [Commented] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482093#comment-14482093 ] stack commented on HBASE-13291: --- bq. SQM.match() was changed in trunk to be expressed in terms of operations on Cells (before it was hand optimized to deconstruct KVs in the most efficient way). [~lhofhansl] The attached hack undoes the Cellification in SQM#match. See in https://issues.apache.org/jira/secure/attachment/12708268/13291.hacks.txt Argument for the 'dirty tricks' is: {code} 321 // Dirty tricks if cell is a KeyValue. Below uglyness is to save on our reparsing lengths of 322 // families, rows, and keys more than once. {code} bq. So far our introduction of Cells ... I am not aware of a single perf advantage Yeah bq. Maybe it's time to make a concerted effort and get rid of KeyValue.getKey() and KeyValue.getBuffer() for real, it's those two methods that prevent a lot of cool optimizations. [~anoop.hbase] and [~ram_krish] have done it in the read path. Need same on write path. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482106#comment-14482106 ] stack commented on HBASE-13389: --- I have been trying to write up the life of a sequenceid: https://docs.google.com/document/d/16beczDie-KU1uSpJvd0GoUlQbPtQBL93rOOPqnE5Ma0/edit# Let me pick it up again. Will add in the above notes. Would be sweet if we could backfill tests that verify our expectations align with the story we are telling. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482104#comment-14482104 ] Dima Spivak commented on HBASE-13413: - Looks very cool, [~rajesh0042]. Added comments up on Phabricator. Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Attachments: HBASE-13413.patch
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482322#comment-14482322 ] Lars Hofhansl commented on HBASE-13389: --- I think we had comment overlap. :) bq. ...you are not against changing sort order so that seqid prevails over type are you...? I would actually be against it, since it breaks the fact that all mutations in HBase are idempotent - when the client encounters any problem with a batch of updates, it can just do those again, and the outcome would be identical, within the limits of what HBase defines, i.e. with ms resolution. Now we would complicate that and have explaining to do. So with the discussion above in place, can we lower the default time to 3 days? So that we can be reasonably sure that major compactions would purge the mvcc cruft? [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389
[jira] [Commented] (HBASE-13205) [branch-1] Backport HBASE-11598 Add simple rpc throttling
[ https://issues.apache.org/jira/browse/HBASE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482450#comment-14482450 ] Elliott Clark commented on HBASE-13205: --- Patch looks good to me. Thoughts on this [~mbertozzi] or [~ndimiduk] [branch-1] Backport HBASE-11598 Add simple rpc throttling - Key: HBASE-13205 URL: https://issues.apache.org/jira/browse/HBASE-13205 Project: HBase Issue Type: Task Components: security Reporter: Ashish Singhi Assignee: Ashish Singhi Labels: multitenancy, quota Fix For: 1.1.0 Attachments: HBASE-13205-branch-1.patch, HBASE-13205-v1-branch-1.patch, HBASE-13205-v2-branch-1.patch, HBASE-13205-v2-branch-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482283#comment-14482283 ] stack commented on HBASE-13389: --- bq. So with all this I do see any reason to keep these for more than a few hours. It's not log rolling, as per Enis. It is when the memstore is flushed. Default is memstores are flushed at least once an hour: public static final int DEFAULT_CACHE_FLUSH_INTERVAL = 3600000; So if an old edit comes in during distributed log replay, an edit that has already been flushed to an hfile, we need to be able to put it in the appropriate slot (as you say). This can happen if we are over-replaying edits in the case where the Master does not have the last flush sequenceid on a region. If HFiles have all their seqids, it is easy. But if the mvcc has been purged from hfiles (optimization) and we get an edit that falls into the hfile time range, we are going to be confused. Somehow the optimization purging mvcc should not run until we are sure old WALs with seqids older than those in hfiles for all regions have been let go. For replication, yeah, it needs a few days. The root of the lag may take a few days to fix. On the put - delete - put, you are not against changing sort order so that seqid prevails over type, are you [~lhofhansl]? Would be a good change for 2.0. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389
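The flush-sequenceid bookkeeping discussed above reduces to a tiny predicate: an edit is replayed only if its sequence id is newer than the region's last flushed sequence id, and when that bookkeeping is missing (e.g. the mvcc/seqid was purged from the hfiles) the system must conservatively over-replay. The sketch below uses hypothetical names; HBase's real logic lives in the region server's replay path and is tracked per region and store.

```java
// Sketch of the replay-skip decision during distributed log replay.
// Hypothetical names; illustrates why purging seqids too early hurts.
public class ReplayFilter {
    public static final long UNKNOWN_FLUSHED_SEQ_ID = -1L;

    // Replay the edit only if it is newer than everything already flushed.
    // If the last flushed seqId is unknown (bookkeeping purged), we must
    // conservatively replay rather than risk losing the edit.
    public static boolean shouldReplay(long editSeqId, long lastFlushedSeqId) {
        if (lastFlushedSeqId == UNKNOWN_FLUSHED_SEQ_ID) {
            return true; // no bookkeeping: over-replay rather than lose data
        }
        return editSeqId > lastFlushedSeqId;
    }

    public static void main(String[] args) {
        System.out.println(shouldReplay(10, 42)); // false: already in an hfile
        System.out.println(shouldReplay(50, 42)); // true: newer than last flush
        System.out.println(shouldReplay(10, UNKNOWN_FLUSHED_SEQ_ID)); // true
    }
}
```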
[jira] [Updated] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13412: -- Attachment: HBASE-13412-v1.patch Patch without the embarrassing typo :-) Also fixed the test to account for jitter. Submitting to see if any other tests rely on exact number for splits. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412-v1.patch, HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13412: -- Attachment: HBASE-13412-v2.patch Hamcrest sometimes works and other times doesn't. Seems like an issue with ordering of classpath of test dependencies. So this doesn't use hamcrest at all. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Attachments: HBASE-13412-v1.patch, HBASE-13412-v2.patch, HBASE-13412.patch
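The jitter idea mirrors compaction jitter: perturb each region's effective split size by a small random fraction so that regions of a well-distributed table don't all cross the threshold at the same instant. A minimal sketch of the idea, with hypothetical names (this is not the actual HBASE-13412 patch, which lives in the region split policy):

```java
import java.util.Random;

// Sketch: jitter the effective split size so identically-loaded regions
// don't all decide to split (and queue compaction IO) at the same moment.
public class SplitJitter {
    // Returns base perturbed by up to +/- (jitterRate * base).
    public static long jitteredSplitSize(long base, double jitterRate, Random rnd) {
        double delta = base * jitterRate * (rnd.nextDouble() * 2.0 - 1.0);
        return base + (long) delta;
    }

    public static void main(String[] args) {
        long base = 10L * 1024 * 1024 * 1024; // nominal 10 GB split size
        Random rnd = new Random();
        for (int i = 0; i < 3; i++) {
            long size = jitteredSplitSize(base, 0.25, rnd);
            // each region's threshold stays within 25% of nominal
            System.out.println(size >= base * 0.75 && size <= base * 1.25);
        }
    }
}
```

Note this is also why the unit test had to stop asserting an exact split size and assert a bounded range instead.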
[jira] [Updated] (HBASE-13416) Recreating a deleted table causes replication of old WALS
[ https://issues.apache.org/jira/browse/HBASE-13416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13416: -- Priority: Critical (was: Major) Recreating a deleted table causes replication of old WALS - Key: HBASE-13416 URL: https://issues.apache.org/jira/browse/HBASE-13416 Project: HBase Issue Type: Bug Components: Replication Reporter: Rajesh Nishtala Priority: Critical
1) Create a table and set up replication to another cluster
2) Write some data into the source table
3) Disable and delete the table from the source cluster and the sink cluster
4) Recreate the table with the same schema in the source and sink clusters
5) The source cluster is empty, but the sink cluster has a copy of the old data that is not in the source cluster.
To work around:
1) Disable the table in the source cluster
2) Roll the WALs across all region servers
3) Delete the table in the source cluster
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482417#comment-14482417 ] Hadoop QA commented on HBASE-13412: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723503/HBASE-13412-v1.patch against master branch at commit 8c740f43093671cfd4cc2b1052d8556e0d492c13. ATTACHMENT ID: 12723503 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1922 checkstyle errors (more than the master's current 1921 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestRegionSplitPolicy Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13588//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13588//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13588//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13588//console This message is automatically generated. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412
[jira] [Commented] (HBASE-13403) Make waitOnSafeMode configurable in MasterFileSystem
[ https://issues.apache.org/jira/browse/HBASE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482541#comment-14482541 ] Anoop Sam John commented on HBASE-13403: +1 Pls add the new config details in the Release Notes. Any place in the book we can add this? This seems an important config. Make waitOnSafeMode configurable in MasterFileSystem Key: HBASE-13403 URL: https://issues.apache.org/jira/browse/HBASE-13403 Attachments: 0001-HBASE-13403-Make-waitOnSafeMode-configurable-in-Mast.patch, 0001-HBASE-13403-Make-waitOnSafeMode-configurable-in-Mast.patch
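For the release-notes text, a usage sketch might look like the following. Note that hbase.master.waitonsafemode is only the property name *proposed* in HBASE-13403; treat the name and the millisecond semantics as assumptions until the patch lands.

```xml
<!-- hbase-site.xml sketch, assuming the property proposed in HBASE-13403:
     how long the master waits between checks for HDFS leaving safe mode.
     10000 ms matches the current hbase.server.thread.wakefrequency default,
     which this property decouples from. -->
<property>
  <name>hbase.master.waitonsafemode</name>
  <value>10000</value>
</property>
```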
[jira] [Commented] (HBASE-13362) set max result size from client only (like caching)?
[ https://issues.apache.org/jira/browse/HBASE-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482342#comment-14482342 ] Enis Soztutar commented on HBASE-13362: --- +1 for 1.0 as well. Agreed that this passes as a way to protect the server. set max result size from client only (like caching)? Key: HBASE-13362 URL: https://issues.apache.org/jira/browse/HBASE-13362 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 13362-0.98.txt, 13362-master.txt With the recent problems we've been seeing with client/server result size mismatches, I was thinking: why was this not a problem with scanner caching? There are two reasons:
# number of rows is easy to calculate (and we did it correctly)
# caching is only controlled from the client, never set on the server alone
We did fix both #1 and #2 in HBASE-13262. Still, I'd like to discuss the following:
* default the client-sent max result size to 2mb
* remove any server-only result sizing
* continue to use hbase.client.scanner.max.result.size but enforce it via the client only (as the name implies anyway).
Comments? Concerns? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
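The first bullet of the proposal above (default the client-sent max result size to 2 MB) could be expressed as a client-side configuration fragment; this is only a sketch using the hbase.client.scanner.max.result.size property named in the discussion, with 2097152 bytes = 2 MB:

```xml
<!-- Client-side hbase-site.xml sketch: cap each scanner RPC response at ~2 MB. -->
<property>
  <name>hbase.client.scanner.max.result.size</name>
  <value>2097152</value>
</property>
```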
[jira] [Commented] (HBASE-13413) Create an integration test for Replication
[ https://issues.apache.org/jira/browse/HBASE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482386#comment-14482386 ] Hadoop QA commented on HBASE-13413: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723486/HBASE-13413-v1.patch against master branch at commit 8c740f43093671cfd4cc2b1052d8556e0d492c13. ATTACHMENT ID: 12723486 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13587//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13587//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13587//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13587//console This message is automatically generated. Create an integration test for Replication -- Key: HBASE-13413 URL: https://issues.apache.org/jira/browse/HBASE-13413 Project: HBase Issue Type: Test Components: integration tests Affects Versions: 1.0.0, 2.0.0 Reporter: Rajesh Nishtala Assignee: Rajesh Nishtala Priority: Minor Attachments: HBASE-13413-v1.patch, HBASE-13413.patch We want to have an end-to-end test for replication. it can write data into one cluster (with replication setup) and then read data from the other. The test should be capable of running for a long time and be reliant even under chaos monkey testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482514#comment-14482514 ] Hadoop QA commented on HBASE-13412: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723505/HBASE-13412-v2.patch against master branch at commit 8c740f43093671cfd4cc2b1052d8556e0d492c13. ATTACHMENT ID: 12723505 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1922 checkstyle errors (more than the master's current 1921 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13589//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13589//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13589//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13589//console This message is automatically generated. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412-v1.patch, HBASE-13412-v2.patch, HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13412: -- Attachment: HBASE-13412-v3.patch Patch to make checkstyle happy. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412-v1.patch, HBASE-13412-v2.patch, HBASE-13412-v3.patch, HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482268#comment-14482268 ] Andrew Purtell commented on HBASE-13275: Ok, that's better. What say ye [~srikanth235] ? Any comments on the VC changes [~ram_krish] or [~anoop.hbase] ? Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We then wanted to disable authorization, but simply setting hbase.security.authorization to false did not disable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13411) Misleading error message when request size quota limit exceeds
[ https://issues.apache.org/jira/browse/HBASE-13411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482269#comment-14482269 ] Hudson commented on HBASE-13411: SUCCESS: Integrated in HBase-TRUNK #6350 (See [https://builds.apache.org/job/HBase-TRUNK/6350/]) HBASE-13411 Misleading error message when request size quota limit exceeds (matteo.bertozzi: rev 8c740f43093671cfd4cc2b1052d8556e0d492c13) * hbase-client/src/main/java/org/apache/hadoop/hbase/quotas/ThrottlingException.java * hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/TimeBasedLimiter.java Misleading error message when request size quota limit exceeds -- Key: HBASE-13411 URL: https://issues.apache.org/jira/browse/HBASE-13411 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Labels: quota Fix For: 2.0.0 Attachments: HBASE-13411.patch User will get the same error message when either number of requests exceeds or request size exceeds. So its better we differentiate them. Thanks to [~mbertozzi] for confirming the same offline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482280#comment-14482280 ] Lars Hofhansl commented on HBASE-13412: --- [~stack] Not a fan of ThreadLocals. These tend to hang around. You concerned about the overhead of creating one? [~eclark] Is the logic correct?
{code}
+double sizeJitter = conf.getDouble("hbase.hregion.max.filesize.jitter", 0.02D);
+this.desiredMaxFileSize = (long)(desiredMaxFileSize * (RANDOM.nextFloat() - 0.5D) * sizeJitter);
{code}
You mean:
{code}
+double sizeJitter = conf.getDouble("hbase.hregion.max.filesize.jitter", 0.02D);
+this.desiredMaxFileSize += (long)(desiredMaxFileSize * (RANDOM.nextFloat() - 0.5D) * sizeJitter);
{code}
(note the +=) Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
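The corrected formula (with +=) can be exercised in isolation. The standalone sketch below mirrors the variable names from the patch snippet but is not the actual HBase code; with the 0.02 default jitter it shows the jittered size always staying within +/- 1% of the configured maximum:

```java
import java.util.Random;

// Standalone sketch of the jitter math discussed above (the += variant).
// Names mirror the patch snippet; this is not the actual HBase class.
public class SplitJitterSketch {

    // Shift the configured max file size by a uniform random amount in
    // [-sizeJitter/2, +sizeJitter/2) of its own magnitude.
    public static long jitteredMaxFileSize(long desiredMaxFileSize,
                                           double sizeJitter, Random random) {
        return desiredMaxFileSize
                + (long) (desiredMaxFileSize * (random.nextFloat() - 0.5D) * sizeJitter);
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        long desired = 10L * 1024 * 1024 * 1024; // 10 GB region max file size
        double jitter = 0.02D;                   // default from the patch discussion
        for (int i = 0; i < 1000; i++) {
            long v = jitteredMaxFileSize(desired, jitter, random);
            // nextFloat() - 0.5 is in [-0.5, 0.5), so the shift is within +/- 1%.
            if (v < (long) (desired * 0.99) || v > (long) (desired * 1.01)) {
                throw new AssertionError("jitter out of bounds: " + v);
            }
        }
    }
}
```

With the original (non +=) form, the result collapses to just the small random delta, which is why the typo mattered.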
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482320#comment-14482320 ] Hudson commented on HBASE-13409: SUCCESS: Integrated in HBase-0.98 #938 (See [https://builds.apache.org/job/HBase-0.98/938/]) HBASE-13409 Add categories to uncategorized tests (apurtell: rev 6f463c57408ff7ec05387dc7628a181e3126e1b8) * hbase-client/src/test/java/org/apache/hadoop/hbase/filter/TestLongComparator.java * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482334#comment-14482334 ] Enis Soztutar commented on HBASE-13389: --- bq. I would actually be against it, since it breaks the fact that all mutations in HBase are idempotent - when the client encounters any problem with a batch of updates, it can just do those again, and the outcome would be identical
I don't understand how this is related to idempotent updates. The sort order proposed will still keep ts before the type/seqId. 3 days should be good enough for replication, I say. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side-effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial.
{quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482335#comment-14482335 ] Elliott Clark commented on HBASE-13412: --- You are correct, the logic is missing the +. Just a typo. Patch coming with tests. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13373) Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
[ https://issues.apache.org/jira/browse/HBASE-13373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482333#comment-14482333 ] Lars Hofhansl commented on HBASE-13373: --- Abstract, V1, V2, and V3 made adding new HFileReaders easy. What if we wanted to add a HFile V4? ... Maybe we just need to squash that in like you squashed v2 and v3. Nice simplification. Belated +1 (although I cannot convince myself that everything works exactly the way it worked before... I trust you :) ) Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc. -- Key: HBASE-13373 URL: https://issues.apache.org/jira/browse/HBASE-13373 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 2.0.0 Attachments: 0001-HBASE-13373-Squash-HFileReaderV3-together-with-HFile.patch, 13373.txt, 13373.v3.txt, 13373.v3.txt, 13373.v5.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.v6.txt, 13373.wip.txt Profiling, I actually ran into a case with a complaint that a method could not be inlined because of the intx JVM flag MaxInlineLevel ("maximum number of nested calls that are inlined", default 9), i.e. the method was more than 9 levels deep. The HFileReaderV? with Abstracts is not needed anymore now we are into the clear with V3 enabled since hbase 1.0.0; we can have just an Interface and an implementation. If we need to support a new hfile type, can hopefully do it in a backward-compatible way now we have the Cell Interface, etc. Squashing all this stuff together actually makes it easier to figure out what is going on when reading code. I can also get rid of a bunch of duplication too. Attached is a WIP. Doesn't fully compile yet but you get the idea. I'll keep on unless objection. Will try it against data written with old classes as soon as I have something working. I don't believe we write classnames into our data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Connection pool
Hello guys, Considering the fact that HTablePool has been deprecated and HConnectionManager should be used instead, I wonder: how can I set the pool size of the mentioned class? Is there any alternative solution? Regards,
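One commonly cited answer for the question above, offered here only as a hedged sketch: the connection's internal thread pool can be tuned via configuration. The property names below are assumptions drawn from the 0.98/1.0-era HConnectionImplementation and should be verified against your HBase version; alternatively, the HConnectionManager.createConnection overload that accepts your own ExecutorService gives full control of the pool.

```xml
<!-- Assumed property names (verify against your HBase version):
     bound the batch-operation thread pool of an HConnection. -->
<property>
  <name>hbase.hconnection.threads.max</name>
  <value>256</value>
</property>
<property>
  <name>hbase.hconnection.threads.core</name>
  <value>32</value>
</property>
```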
[jira] [Updated] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-13291: --- Attachment: TimeRange.patch [~stack] simple patch attached. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, TimeRange.patch, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium-sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second, using 450% of a possible 1600% of CPU, when 4 clients, each with 10 threads, scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of the possible 1600% but we do 8k ops a second. Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it... Filing this issue in the meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482544#comment-14482544 ] Hudson commented on HBASE-13409: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #891 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/891/]) HBASE-13409 Add categories to uncategorized tests (apurtell: rev 6f463c57408ff7ec05387dc7628a181e3126e1b8) * hbase-client/src/test/java/org/apache/hadoop/hbase/filter/TestLongComparator.java * hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientExponentialBackoff.java Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13412) Region split decisions should have jitter
[ https://issues.apache.org/jira/browse/HBASE-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482290#comment-14482290 ] stack commented on HBASE-13412: --- bq. stack Not a fan of ThreadLocals. These tend to hang around. You concerned about the overhead of creating one? This suggestion is a mistake on my part. I misunderstood. Please ignore. Region split decisions should have jitter - Key: HBASE-13412 URL: https://issues.apache.org/jira/browse/HBASE-13412 Project: HBase Issue Type: New Feature Components: regionserver Affects Versions: 1.0.0, 2.0.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13412.patch Whenever a region splits it causes lots of IO (compactions are queued for a while). Because of this it's important to make sure that well distributed tables don't have all of their regions split at exactly the same time. This is basically the same as our compaction jitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)