[jira] [Updated] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12384: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed with the comment added as requested: one comment on the first sleep after flush in each unit test. Thanks for the review [~stack] TestTags can hang on fast test hosts Key: HBASE-12384 URL: https://issues.apache.org/jira/browse/HBASE-12384 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12384-0.98.patch, HBASE-12384-master.patch The test waits indefinitely, expecting flushed files to reach a certain count after triggering a flush, but a compaction has happened between the flush and the check for the number of store files.
{code}
admin.flush(tableName);
regions = TEST_UTIL.getHBaseCluster().getRegions(tableName);
for (HRegion region : regions) {
  Store store = region.getStore(fam);
  // <-- Flush and compaction have happened before here
  while (!(store.getStorefilesCount() > 2)) { // <-- Hung forever in here
    Thread.sleep(10);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
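The hang pattern above can be avoided by bounding the wait. A minimal sketch of that idea, with a hypothetical helper name, parameter types, and timeout (the committed patch may differ):

```java
// Sketch: poll for the store file count with a deadline instead of an
// unbounded sleep loop, so a compaction racing the flush cannot hang the test.
// The helper name, the IntSupplier parameter, and the timeout are illustrative.
import java.util.function.IntSupplier;

public class BoundedWait {
  static boolean waitForStorefiles(IntSupplier storefileCount, int moreThan,
      long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!(storefileCount.getAsInt() > moreThan)) { // same condition as the test loop
      if (System.currentTimeMillis() > deadline) {
        return false; // let the caller fail an assertion instead of hanging forever
      }
      Thread.sleep(10);
    }
    return true;
  }
}
```

The test would then assert on the returned boolean, turning an indefinite hang into a bounded failure.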
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190530#comment-14190530 ] Nick Dimiduk commented on HBASE-9003: - Makes sense. Let's get your fix in. What do you think about removing JarFinder altogether for 1.0? TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar - Key: HBASE-9003 URL: https://issues.apache.org/jira/browse/HBASE-9003 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 2.0.0, 0.99.2 Attachments: HBASE-9003.v0.patch, HBASE-9003.v1.patch, HBASE-9003.v2.patch, HBASE-9003.v2.patch This is the problem: {{TableMapReduceUtil#addDependencyJars}} relies on {{org.apache.hadoop.util.JarFinder}}, if available, to call {{getJar()}}. However, {{getJar()}} uses File.createTempFile() to create a temporary file under {{hadoop.tmp.dir}}{{/target/test-dir}}. Due to HADOOP-9737, the created jar and its contents are not purged after the JVM is destroyed. Since most configurations point {{hadoop.tmp.dir}} under {{/tmp}}, the generated jar files get purged by {{tmpwatch}} or a similar tool, but boxes that have {{hadoop.tmp.dir}} pointing to a different location not monitored by {{tmpwatch}} will pile up a collection of jars, causing all kinds of issues. Since {{JarFinder#getJar}} is not a public API in Hadoop (see [~tucu00]'s comment on HADOOP-9737), we shouldn't use it as part of {{TableMapReduceUtil}}, in order to avoid this kind of issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
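The missing cleanup described in the report can be sketched in a few lines. The class and method names below are hypothetical; this only illustrates the deleteOnExit step that, per the report, {{JarFinder#getJar}} omits:

```java
// Sketch of the cleanup the report says is missing: create the temporary jar
// and register it for deletion when the JVM exits, so jars do not pile up
// under hadoop.tmp.dir. TempJar/createTempJar are illustrative names.
import java.io.File;
import java.io.IOException;

public class TempJar {
  static File createTempJar(File dir) throws IOException {
    File jar = File.createTempFile("hbase-deps-", ".jar", dir);
    jar.deleteOnExit(); // the step whose absence HADOOP-9737 describes
    return jar;
  }
}
```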
[jira] [Commented] (HBASE-12375) LoadIncrementalHFiles fails to load data in table when CF name starts with '_'
[ https://issues.apache.org/jira/browse/HBASE-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190532#comment-14190532 ] Hudson commented on HBASE-12375: FAILURE: Integrated in HBase-1.0 #390 (See [https://builds.apache.org/job/HBase-1.0/390/]) HBASE-12375 LoadIncrementalHFiles fails to load data in table when CF name starts with '_' (stack: rev d8874fbc21525a5af2db3d8b9edd6e67fa1b5572) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java LoadIncrementalHFiles fails to load data in table when CF name starts with '_' -- Key: HBASE-12375 URL: https://issues.apache.org/jira/browse/HBASE-12375 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: HBASE-12375-0.98.patch, HBASE-12375-v2.patch, HBASE-12375.patch We do not restrict users from creating a table with a column family name starting with '_'. When a user creates such a table, LoadIncrementalHFiles will skip that family's data when loading into the table.
{code}
// Skip _logs, etc
if (familyDir.getName().startsWith("_")) continue;
{code}
I think we should remove that check, as I do not see any _logs directory being created by the bulkload tool in the output directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
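For illustration, a hedged sketch of the scan without the startsWith check: every family directory in the bulkload output is collected, including ones whose names start with '_'. Class and method names below are hypothetical, not from the patch:

```java
// Sketch: collect every family directory name; a CF legally named "_cf"
// must not be skipped. FamilyDirScan/familyDirs are illustrative names.
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FamilyDirScan {
  static List<String> familyDirs(File bulkOutputDir) {
    List<String> families = new ArrayList<>();
    for (File familyDir : bulkOutputDir.listFiles(File::isDirectory)) {
      families.add(familyDir.getName()); // no startsWith("_") skip
    }
    return families;
  }
}
```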
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190543#comment-14190543 ] Esteban Gutierrez commented on HBASE-9003: -- I don't think we can get rid of JarFinder as long as we have an option to use the distributed cache in initTable*. I remember we used to ship the jars in a similar way back in 0.90.x, but we cleaned up the temporary jar. Here the only problem we have is that we don't clean up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12072) We are doing 35 x 35 retries for master operations
[ https://issues.apache.org/jira/browse/HBASE-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190546#comment-14190546 ] Enis Soztutar commented on HBASE-12072: --- Thanks Stack for checking the patch. It turned out to be a bigger patch than I anticipated. The patch aims to unify how we call master RPCs. It adds retrying where we had none (for example HBaseAdmin.enableCatalogJanitor(), etc.), and removes the retrying in makeStub() in favor of the higher-level retry at the MasterCallable / retrying-caller level. Now most of the HBaseAdmin methods use MasterCallable properly. The exceptions are cleaned up a bit for the public Admin interface. bq. What's the thinking behind removing isMasterRunning I like not depending on the master for ops. I am not sure what the purpose of isMasterRunning() is and why a user would want it. It is removed from the Admin interface, which is new, but kept in deprecated mode in HBaseAdmin. I can undo that if you think we need to keep it. I just did not see a use case where the user would call Admin.isMasterRunning() other than internal stuff. bq. When you deprecate, want to point at what folks should use instead (or your thinking this is internal stuff and the heavies will just figure it out?) I thought the deprecated stuff in HConnection was internal. But it is not clear. I think having those live in the Admin layer makes better sense. Let me add javadoc. bq. Not so mad about the flattening of exceptions into IOE exclusively. I see your point. I think you mean these:
{code}
void move(final byte[] encodedRegionName, final byte[] destServerName)
-   throws HBaseIOException, MasterNotRunningException, ZooKeeperConnectionException;
+   throws IOException;
{code}
With the patch, we are now calling it via the retrying RPC caller, which will throw a RetriesExhaustedException etc., which will wrap the other exceptions. That is why they are not explicitly thrown now. bq. What are we supposed to use in place of all the deprecated stuff in ConnectionManager? Just implement instead up in HBaseAdmin? Yeah, let me add javadoc to use the Admin methods. We are doing 35 x 35 retries for master operations -- Key: HBASE-12072 URL: https://issues.apache.org/jira/browse/HBASE-12072 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 0.99.2 Attachments: 12072-v1.txt, 12072-v2.txt, hbase-12072_v1.patch For master requests, there are two retry mechanisms in effect. The first one is from HBaseAdmin.executeCallable():
{code}
private <V> V executeCallable(MasterCallable<V> callable) throws IOException {
  RpcRetryingCaller<V> caller = rpcCallerFactory.newCaller();
  try {
    return caller.callWithRetries(callable);
  } finally {
    callable.close();
  }
}
{code}
And inside, the other one is from StubMaker.makeStub():
{code}
/**
 * Create a stub against the master. Retry if necessary.
 * @return A stub to do <code>intf</code> against the master
 * @throws MasterNotRunningException
 */
@edu.umd.cs.findbugs.annotations.SuppressWarnings(value="SWL_SLEEP_WITH_LOCK_HELD")
Object makeStub() throws MasterNotRunningException {
{code}
The tests will just hang for 10 min * 35 ~= 6 hours.
{code}
2014-09-23 16:19:05,151 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,253 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,456 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 3 of 35 failed; retrying after sleep of 300, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,759 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 4 of 35 failed; retrying after sleep of 500, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:06,262 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:07,273 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 6 of 35 failed; retrying after sleep of 2011,
{code}
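The arithmetic behind the hang, as a sketch (the figures are the rough ones quoted in the report, not measured constants):

```java
// Back-of-the-envelope for the nested retries: one full inner makeStub()
// loop (35 getMaster attempts with backoff) takes on the order of 10 minutes,
// and the outer caller repeats it up to 35 times.
public class RetryMath {
  public static void main(String[] args) {
    int outerRetries = 35;      // HBaseAdmin.executeCallable() retries
    int innerLoopMinutes = 10;  // ~10 min for one makeStub() retry loop
    int totalMinutes = outerRetries * innerLoopMinutes;
    System.out.println(totalMinutes + " minutes"); // 350 minutes, roughly 6 hours
  }
}
```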
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190563#comment-14190563 ] Hadoop QA commented on HBASE-12381: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678228/HBASE-12381.1.patch.txt against trunk revision . ATTACHMENT ID: 12678228
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}.
There are 1 zombie test(s): at org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer.testImmediateAssignment(TestBaseLoadBalancer.java:136)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11519//console
This message is automatically generated. Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need Maven 3 to build. Add an enforcer rule so that people find out early that they have the wrong Maven version, rather than however things fall over if someone tries to build with Maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190564#comment-14190564 ] Nick Dimiduk commented on HBASE-9003: - IIRC, I introduced JarFinder for the purpose of launching jobs from the output committer of a running job. In this context, the dependency jars have been unpacked, so to launch the job, JarFinder is used to re-pack the class files into a jar. Which raises an interesting point: you're seeing this accumulation of files under hadoop.tmp.dir even for regular jobs? Nothing should be created unless the requested class is found to exist on the classpath outside of a jar. I don't remember the details; let me look into the code when I have a few minutes. Back to the point at hand: +1 for fixing the accumulation problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12375) LoadIncrementalHFiles fails to load data in table when CF name starts with '_'
[ https://issues.apache.org/jira/browse/HBASE-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190575#comment-14190575 ] Hudson commented on HBASE-12375: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #610 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/610/]) HBASE-12375 LoadIncrementalHFiles fails to load data in table when CF name starts with '_' (stack: rev 68eb74b23e6eff60cf4410ff4af1a60b501a7c9c) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190604#comment-14190604 ] Hadoop QA commented on HBASE-12384: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678238/HBASE-12384-master.patch against trunk revision . ATTACHMENT ID: 12678238
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11520//console
This message is automatically generated.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190613#comment-14190613 ] stack commented on HBASE-9003: -- bq. Back to the point at hand +1 for fixing the accumulation problem. That is +1 on the patch, [~ndimiduk]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11835) Wrong management of unexpected calls in the client
[ https://issues.apache.org/jira/browse/HBASE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190622#comment-14190622 ] Hadoop QA commented on HBASE-11835: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678243/11835.rebase.patch against trunk revision . ATTACHMENT ID: 12678243
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 3785 checkstyle errors (more than the trunk's current 3784 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 1 zombie test(s): at org.apache.ambari.server.upgrade.UpgradeCatalog150Test.testAddHistoryServer(UpgradeCatalog150Test.java:189)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11521//console
This message is automatically generated. Wrong management of unexpected calls in the client -- Key: HBASE-11835 URL: https://issues.apache.org/jira/browse/HBASE-11835 Project: HBase Issue Type: Bug Components: Client, Performance Affects Versions: 1.0.0, 2.0.0, 0.98.6 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 2.0.0, 0.99.2 Attachments: 11835.rebase.patch, 11835.rebase.patch, 11835.rebase.patch, rpcClient.patch If a call is purged or canceled, we try to skip the reply from the server, but we read the wrong number of bytes, so we corrupt the TCP channel. It's hidden as it triggers retries and so on, but it's obviously bad for performance. It happens with cell blocks. [~ram_krish_86], [~saint@gmail.com], you know this part better than me, do you agree with the analysis and the patch? The changes in rpcServer are not fully related: as the client closes the
[jira] [Commented] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190648#comment-14190648 ] Lars Hofhansl commented on HBASE-11764: --- Scanned patch again. Looks good. +1 [~apurtell] you're confident enough that this won't destabilize 0.98? Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190647#comment-14190647 ] Hudson commented on HBASE-12381: FAILURE: Integrated in HBase-TRUNK #5723 (See [https://builds.apache.org/job/HBase-TRUNK/5723/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev 075fd3032135c55a6874a6f0c091e558540609d0) * pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190651#comment-14190651 ] Hudson commented on HBASE-12381: FAILURE: Integrated in HBase-0.94 #1437 (See [https://builds.apache.org/job/HBase-0.94/1437/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev f0a8640f0ae3c5750da826e4ab5b847ad1b0ae34) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need maven 3 to build. add an enforcer rule so that people find out early that they have the wrong maven version, rather than however things fall over if someone tries to build with maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190654#comment-14190654 ] Hudson commented on HBASE-12381: FAILURE: Integrated in HBase-0.94-security #551 (See [https://builds.apache.org/job/HBase-0.94-security/551/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev f0a8640f0ae3c5750da826e4ab5b847ad1b0ae34) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need maven 3 to build. add an enforcer rule so that people find out early that they have the wrong maven version, rather than however things fall over if someone tries to build with maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190659#comment-14190659 ] Hudson commented on HBASE-12381: SUCCESS: Integrated in HBase-0.94-JDK7 #206 (See [https://builds.apache.org/job/HBase-0.94-JDK7/206/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev f0a8640f0ae3c5750da826e4ab5b847ad1b0ae34) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need maven 3 to build. add an enforcer rule so that people find out early that they have the wrong maven version, rather than however things fall over if someone tries to build with maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190677#comment-14190677 ] Hudson commented on HBASE-12384: SUCCESS: Integrated in HBase-1.0 #391 (See [https://builds.apache.org/job/HBase-1.0/391/]) HBASE-12384 TestTags can hang on fast test hosts (apurtell: rev f0091a90313f4c92e465df086407266a6ba18486) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTags.java TestTags can hang on fast test hosts Key: HBASE-12384 URL: https://issues.apache.org/jira/browse/HBASE-12384 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12384-0.98.patch, HBASE-12384-master.patch Waiting indefinitely, expecting flushed files to reach a certain count after triggering a flush, but compaction has happened between the flush and the check for the number of store files.
{code}
admin.flush(tableName);
regions = TEST_UTIL.getHBaseCluster().getRegions(tableName);
for (HRegion region : regions) {
  Store store = region.getStore(fam);
  // <-- Flush and compaction has happened before here -->
  while (!(store.getStorefilesCount() > 2)) {  // <-- Hung forever in here -->
    Thread.sleep(10);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
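The hang above comes from polling with no upper bound. A minimal sketch of the general remedy, a deadline-bounded wait, is below; the helper and its names are hypothetical illustrations of the technique, not the actual TestTags patch:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating a deadline-bounded poll, the kind of guard
// that prevents a test from spinning forever when its precondition can no
// longer become true (e.g. a compaction already reduced the store file count).
public class BoundedWait {
    /**
     * Polls until {@code condition} holds or {@code timeoutMillis} elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    public static boolean waitFor(BooleanSupplier condition,
                                  long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // give up instead of hanging the test run
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test using such a helper can then assert on the returned boolean and fail fast with a diagnostic rather than time out at the harness level.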
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190678#comment-14190678 ] Hudson commented on HBASE-12381: SUCCESS: Integrated in HBase-1.0 #391 (See [https://builds.apache.org/job/HBase-1.0/391/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev 158e009f4c554b792e0b868a8ec77ce19a401d7b) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need maven 3 to build. add an enforcer rule so that people find out early that they have the wrong maven version, rather than however things fall over if someone tries to build with maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11835) Wrong management of unexpected calls in the client
[ https://issues.apache.org/jira/browse/HBASE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11835: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to branch-1+. Fixed checkstyle bugs on commit. Thanks for the patch [~nkeywal] Wrong management of unexpected calls in the client -- Key: HBASE-11835 URL: https://issues.apache.org/jira/browse/HBASE-11835 Project: HBase Issue Type: Bug Components: Client, Performance Affects Versions: 1.0.0, 2.0.0, 0.98.6 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 2.0.0, 0.99.2 Attachments: 11835.rebase.patch, 11835.rebase.patch, 11835.rebase.patch, rpcClient.patch If a call is purged or canceled we try to skip the reply from the server, but we read the wrong number of bytes, so we corrupt the TCP channel. It's hidden because it triggers retries and so on, but it's obviously bad for performance. It happens with cell blocks. [~ram_krish_86], [~saint@gmail.com], you know this part better than me, do you agree with the analysis and the patch? The changes in rpcServer are not fully related: as the client closes the connections in such a situation, I observed both ClosedChannelException and CancelledKeyException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190700#comment-14190700 ] Andrew Purtell commented on HBASE-11764: I don't see a stability risk but do intend to (micro)profile this during the next RC qualification. Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190702#comment-14190702 ] Andrew Purtell commented on HBASE-11764: Time permitting I'd do that ahead of commit for release... Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764-0.98.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch, HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12072) We are doing 35 x 35 retries for master operations
[ https://issues.apache.org/jira/browse/HBASE-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190704#comment-14190704 ] stack commented on HBASE-12072: --- bq. The patch aims to unify how we call master rpc's. Nice. Good. bq. I can undo that if you think that we need to keep it. It is a bad idea, or at least, we should examine all places isMasterRunning is being called, as it is inserting the master into whatever the path. If an admin task, since the master is the administrator, isMasterRunning might make sense, but why this predicate call at all... Just go for it and rely on zk etc. getting you to the master. bq. That is why they are not explicitly thrown now. Ok. That makes sense. Go for it. Let's get it in, Enis. Any chance of a test or a tool that runs through retries? IIRC, there used to be a tool I or [~nkeywal] did that mocked the server and was good for showing retries. We are doing 35 x 35 retries for master operations -- Key: HBASE-12072 URL: https://issues.apache.org/jira/browse/HBASE-12072 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 0.99.2 Attachments: 12072-v1.txt, 12072-v2.txt, hbase-12072_v1.patch For master requests, there are two retry mechanisms in effect. The first one is from HBaseAdmin.executeCallable():
{code}
private <V> V executeCallable(MasterCallable<V> callable) throws IOException {
  RpcRetryingCaller<V> caller = rpcCallerFactory.<V> newCaller();
  try {
    return caller.callWithRetries(callable);
  } finally {
    callable.close();
  }
}
{code}
And inside, the other one is from StubMaker.makeStub():
{code}
/**
 * Create a stub against the master. Retry if necessary.
 * @return A stub to do <code>intf</code> against the master
 * @throws MasterNotRunningException
 */
@edu.umd.cs.findbugs.annotations.SuppressWarnings(value="SWL_SLEEP_WITH_LOCK_HELD")
Object makeStub() throws MasterNotRunningException {
{code}
The tests will just hang for 10 min * 35 ~= 6 hours.
{code}
2014-09-23 16:19:05,151 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,253 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,456 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 3 of 35 failed; retrying after sleep of 300, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:05,759 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 4 of 35 failed; retrying after sleep of 500, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:06,262 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:07,273 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 6 of 35 failed; retrying after sleep of 2011, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:09,286 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 7 of 35 failed; retrying after sleep of 4012, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:13,303 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 8 of 35 failed; retrying after sleep of 10033, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:23,343 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 9 of 35 failed; retrying after sleep of 10089, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:33,439 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 10 of 35 failed; retrying after sleep of 10027, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2014-09-23 16:19:43,473 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 11 of 35 failed; retrying after sleep of 10004,
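The multiplication of the two retry loops can be sketched in a few lines of back-of-the-envelope arithmetic. The backoff table below is illustrative, modeled loosely on the sleeps visible in the log (pause around 100 ms, capped near 10 s); it is not the exact HBase backoff constant:

```java
// Illustrative sketch of why nested 35 x 35 retries stall for hours.
// SLEEP_MS approximates the sleeps seen in the log above; not exact HBase values.
public class RetryMath {
    static final long[] SLEEP_MS = {100, 200, 300, 500, 1000, 2000, 4000, 10000};

    /** Worst-case total sleep, in ms, for one pass of `attempts` retries. */
    public static long onePassMillis(int attempts) {
        long total = 0;
        for (int i = 0; i < attempts; i++) {
            // Backoff grows per attempt, then stays at the cap.
            total += SLEEP_MS[Math.min(i, SLEEP_MS.length - 1)];
        }
        return total;
    }

    /** Nested retries multiply: each outer attempt runs a full inner pass. */
    public static long nestedMillis(int outer, int inner) {
        return outer * onePassMillis(inner);
    }
}
```

With 35 inner attempts mostly sleeping at the ~10 s cap, one pass alone runs for minutes, and wrapping it in another 35-attempt loop pushes the worst case into hours, which matches the hung tests described in the issue.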
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190714#comment-14190714 ] Andrew Purtell commented on HBASE-12346: {quote} 1. Something similar to EnforcingScanLabelGenerator as the first SLG. But we need to change it a little: a. Get the defined set for the user from the labels table ONLY IF the auths in the scan is null. b. Otherwise do nothing, pass on the auths in the scan to the next SLG. 2. The current DefaultScanLabelGenerator (without the patch) as the second SLG. This SLG will filter/drop labels from the passed in Auths as needed. {quote} Yes. And: - We could rename 'DefaultScanLabelGenerator' to something more suitable. - The default SLG configuration becomes this combined stack. Scan's default auths behavior under Visibility labels - Key: HBASE-12346 URL: https://issues.apache.org/jira/browse/HBASE-12346 Project: HBase Issue Type: Bug Components: API, security Affects Versions: 0.98.7, 0.99.1 Reporter: Jerry He Fix For: 0.98.8, 0.99.2 Attachments: HBASE-12346-master-v2.patch, HBASE-12346-master.patch In Visibility Labels security, a set of labels (auths) are administered and associated with a user. A user can normally only see cell data during a scan that are part of the user's label set (auths). Scan uses setAuthorizations to indicate it wants to use the auths to access the cells. Similarly in the shell:
{code}
scan 'table1', AUTHORIZATIONS => ['private']
{code}
But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to setAuthorizations before the scan can get any cells, even when the cells are under labels the requesting user is part of. The following steps will illustrate the issue: Run as superuser.
{code}
1. create a visibility label called 'private'
2. create 'table1'
3. put into 'table1' data and label the data as 'private'
4. set_auths 'user1', 'private'
5. grant 'user1', 'RW', 'table1'
{code}
Run as 'user1':
{code}
1. scan 'table1'
   This shows no cells.
2. scan 'table1', AUTHORIZATIONS => ['private']
   This will show all the data.
{code}
I am not sure if this is expected by design or a bug. But a more reasonable, more backward compatible for client applications, and less surprising default behavior should probably look like this: A scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user is administered and allowed on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: use the input auths minus whatever is not in the user's label set on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
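The two-stage label-generator chain proposed in the comment above can be modeled in a few lines. The interface and names below are simplified, hypothetical stand-ins for the ScanLabelGenerator API, not the actual HBase types:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposed two-stage SLG chain; simplified
// stand-ins, not the real org.apache.hadoop.hbase ScanLabelGenerator API.
public class LabelChain {
    interface Stage {
        // `requested` may be null when the scan set no Authorizations.
        List<String> apply(List<String> requested, List<String> userAuths);
    }

    // Stage 1 (EnforcingScanLabelGenerator-like): default to the user's
    // administered auths ONLY IF the scan supplied none; otherwise pass through.
    static final Stage DEFAULT_IF_UNSET =
        (req, auths) -> (req == null) ? auths : req;

    // Stage 2 (DefaultScanLabelGenerator-like): drop any requested label
    // the user does not actually hold.
    static final Stage FILTER_TO_HELD = (req, auths) -> {
        List<String> out = new ArrayList<>(req);
        out.retainAll(auths);
        return out;
    };

    public static List<String> resolve(List<String> requested, List<String> userAuths) {
        return FILTER_TO_HELD.apply(
            DEFAULT_IF_UNSET.apply(requested, userAuths), userAuths);
    }
}
```

Under this model, a scan with no explicit Authorizations falls back to everything the user holds, while an explicit request is trimmed to the user's label set, which is exactly the less surprising default behavior the reporter asks for.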
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190725#comment-14190725 ] Esteban Gutierrez commented on HBASE-12219: --- Tests should pass once HBASE-12380 is committed. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219.v0.txt, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or concurrent clients this can be too aggressive to HDFS and the master, causing contention to process other requests. A simple solution is to have a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that can allow the master to process those calls faster, without causing contention and without having to perform a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
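The TTL-based caching idea proposed above amounts to: serve the cached value until it is older than the TTL, and only then pay for the expensive lookup. A minimal, hypothetical sketch of the approach (not the FSTableDescriptors implementation):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of a TTL-based cache; illustrative only, not the
// actual FSTableDescriptors code. The clock is passed in for testability.
public class TtlCache<K, V> {
    private final long ttlMillis;
    private final Map<K, Long> loadedAt = new HashMap<>();
    private final Map<K, V> values = new HashMap<>();

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached value, recomputing it only if absent or expired. */
    public synchronized V get(K key, Function<K, V> loader, long nowMillis) {
        Long stamp = loadedAt.get(key);
        if (stamp == null || nowMillis - stamp >= ttlMillis) {
            values.put(key, loader.apply(key)); // e.g. the trip to HDFS
            loadedAt.put(key, nowMillis);
        }
        return values.get(key);
    }
}
```

Within the TTL window every caller is served from memory, so a burst of listtables()-style requests costs one backing-store read instead of one per request; the trade-off is that callers may observe a descriptor up to one TTL stale.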
[jira] [Updated] (HBASE-12154) TestLoadAndSwitchEncodeOnDisk is flaky on shared jenkins boxes
[ https://issues.apache.org/jira/browse/HBASE-12154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12154: -- Fix Version/s: (was: 1.0.0) 0.99.2 TestLoadAndSwitchEncodeOnDisk is flaky on shared jenkins boxes -- Key: HBASE-12154 URL: https://issues.apache.org/jira/browse/HBASE-12154 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.0.0 Reporter: Manukranth Kolloju Assignee: Manukranth Kolloju Priority: Minor Fix For: 0.99.2 Original Estimate: 168h Remaining Estimate: 168h It doesn't make sense to run a load test on a shared box where other tests are being run. We should probably move this to integration tests and make sure it's covered. In cases where the jenkins machines are shared across several projects which are doing disk IO, I have observed that this test runs into slow syncs and eventually times out. That being said, we can't increase the timeout on this test, since it's already 3 minutes. This test, being disk sensitive, might have better coverage as an IT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12203) Remove Master from table status query path.
[ https://issues.apache.org/jira/browse/HBASE-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12203: -- Fix Version/s: (was: 1.0.0) Remove Master from table status query path. --- Key: HBASE-12203 URL: https://issues.apache.org/jira/browse/HBASE-12203 Project: HBase Issue Type: Improvement Components: Client, master Affects Versions: 2.0.0 Reporter: Andrey Stepachev Priority: Minor Fix For: 2.0.0 With patch HBASE-7767 we moved table statuses from ZK to HDFS. That was a good cleanup, but we put additional (substantial) load on the master. Some client requests use checks for table state (for example HBASE-12035). That is why the patch was not backported to branch-1 (HBASE-11978). Let's replicate state back to zk, but as a mirror of table states. What can be done: 1. TableStateManager would push table state changes to zk 2. Return back ZKTableStateClientSideReader. Alternative way: 1. Move table statuses to a separate table like namespaces 2. Issue status requests against this table Alternative way 2: 1. Extend the RS API with a getTableState() call 2. Each RS will be able to cache table states 3. Clients will call RS instead of master or zk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190738#comment-14190738 ] Hudson commented on HBASE-12381: SUCCESS: Integrated in HBase-0.98 #642 (See [https://builds.apache.org/job/HBase-0.98/642/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev 5874ae0add074b92827577173c9354a9fee671a6) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt our ref guide says that you need maven 3 to build. add an enforcer rule so that people find out early that they have the wrong maven version, rather than however things fall over if someone tries to build with maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190737#comment-14190737 ] Hudson commented on HBASE-12384: SUCCESS: Integrated in HBase-0.98 #642 (See [https://builds.apache.org/job/HBase-0.98/642/]) HBASE-12384 TestTags can hang on fast test hosts (apurtell: rev 415b8ff438b9bdc5c2e1b15a9b6de81bb472fe24) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTags.java TestTags can hang on fast test hosts Key: HBASE-12384 URL: https://issues.apache.org/jira/browse/HBASE-12384 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12384-0.98.patch, HBASE-12384-master.patch Waiting indefinitely, expecting flushed files to reach a certain count after triggering a flush, but compaction has happened between the flush and the check for the number of store files.
{code}
admin.flush(tableName);
regions = TEST_UTIL.getHBaseCluster().getRegions(tableName);
for (HRegion region : regions) {
  Store store = region.getStore(fam);
  // <-- Flush and compaction has happened before here -->
  while (!(store.getStorefilesCount() > 2)) {  // <-- Hung forever in here -->
    Thread.sleep(10);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
Adrian Muraru created HBASE-12386: - Summary: Replication gets stuck following a transient zookeeper error to remote peer cluster Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190767#comment-14190767 ] Adrian Muraru commented on HBASE-12386: --- Looking at the code it seems that once the remote zk peers lookup fails, the refresh ts is updated and the returned list of RS peers is empty. Next time, org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager does not retry the lookup on the next polling, as the following condition is not met:
{code:java}
if (endpoint.getLastRegionServerUpdate() > this.lastUpdateToPeers) {
  LOG.info("Current list of sinks is out of date, updating");
  chooseSinks();
}
{code}
A fix would be to force a refresh when the list of peers is empty:
{code:java}
if (replicationPeers.getTimestampOfLastChangeToPeer(peerClusterId) > this.lastUpdateToPeers
    || sinks.isEmpty()) {
  LOG.info("Current list of sinks is out of date or empty, updating");
  chooseSinks();
}
{code}
Note that this does not reproduce in 0.94, where it seems the refresh does happen in this case. Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
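The refresh decision discussed in the comment above reduces to a small predicate, which can be isolated and unit-tested on its own. The class and parameter names below are hypothetical simplifications of the ReplicationSinkManager logic, not its actual API:

```java
// Hypothetical model of the sink-refresh decision: rebuild the sink list
// when the peer list changed after our last refresh OR when we currently
// know of no sinks at all (the case the proposed fix adds).
public class SinkRefresh {
    public static boolean shouldRefresh(long lastPeerChangeTs,
                                        long lastUpdateToPeersTs,
                                        int sinkCount) {
        return lastPeerChangeTs > lastUpdateToPeersTs || sinkCount == 0;
    }
}
```

Without the `sinkCount == 0` clause, a transient lookup failure that both bumps the refresh timestamp and leaves the list empty would never trigger another refresh, which is exactly the stuck state the bug describes.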
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190775#comment-14190775 ] stack commented on HBASE-10201: --- Let me review this patch once more. That all tests pass with it enabled is encouraging. We can work on these IT test failures separately. It is not your issue. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190778#comment-14190778 ] Hudson commented on HBASE-12384: SUCCESS: Integrated in HBase-TRUNK #5724 (See [https://builds.apache.org/job/HBase-TRUNK/5724/]) HBASE-12384 TestTags can hang on fast test hosts (apurtell: rev f20fac41dfdf5ef5b5e91d04796e0cc2adc49904) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTags.java TestTags can hang on fast test hosts Key: HBASE-12384 URL: https://issues.apache.org/jira/browse/HBASE-12384 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12384-0.98.patch, HBASE-12384-master.patch Waiting indefinitely, expecting flushed files to reach a certain count after triggering a flush, but compaction has happened between the flush and the check for the number of store files.
{code}
admin.flush(tableName);
regions = TEST_UTIL.getHBaseCluster().getRegions(tableName);
for (HRegion region : regions) {
  Store store = region.getStore(fam);
  // <-- Flush and compaction has happened before here -->
  while (!(store.getStorefilesCount() > 2)) {  // <-- Hung forever in here -->
    Thread.sleep(10);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated HBASE-11819: - Attachment: HBASE-11819v3.patch Hi [~stack], Sorry for the code formatting. My Eclipse settings were broken. Now I have fixed it. Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12380) TestRegionServerNoMaster#testMultipleOpen is flaky after HBASE-11760
[ https://issues.apache.org/jira/browse/HBASE-12380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-12380: Resolution: Fixed Fix Version/s: 2.0.0 Assignee: Esteban Gutierrez Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Integrated into branch master. Thanks Esteban for the patch. TestRegionServerNoMaster#testMultipleOpen is flaky after HBASE-11760 Key: HBASE-12380 URL: https://issues.apache.org/jira/browse/HBASE-12380 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 2.0.0 Attachments: HBASE-12380.v0.patch Noticed this while trying to fix faulty test while working on a fix for HBASE-12219: {code} Tests in error: TestRegionServerNoMaster.testMultipleOpen:237 » Service java.io.IOException: R... TestRegionServerNoMaster.testCloseByRegionServer:211-closeRegionNoZK:201 » Service {code} Initially I thought the problem was on my patch for HBASE-12219 but I noticed that the issue was occurring on the 7th attempt to open the region. However I was able to reproduce the same problem in the master branch after increasing the number of requests in testMultipleOpen(): {code} 2014-10-29 15:03:45,043 INFO [Thread-216] regionserver.RSRpcServices(1334): Receiving OPEN for the region:TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca., which we are already trying to OPEN - ignoring this new request for this region. Submitting openRegion attempt: 16 2014-10-29 15:03:45,044 INFO [Thread-216] regionserver.RSRpcServices(1311): Open TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca. 2014-10-29 15:03:45,044 INFO [PostOpenDeployTasks:025198143197ea68803e49819eae27ca] hbase.MetaTableAccessor(1307): Updated row TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca. 
with server=192.168.1.105,63082,1414620220789 Submitting openRegion attempt: 17 2014-10-29 15:03:45,046 ERROR [RS_OPEN_REGION-192.168.1.105:63082-2] handler.OpenRegionHandler(88): Region 025198143197ea68803e49819eae27ca was already online when we started processing the opening. Marking this new attempt as failed 2014-10-29 15:03:45,047 FATAL [Thread-216] regionserver.HRegionServer(1931): ABORTING region server 192.168.1.105,63082,1414620220789: Received OPEN for the region:TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca., which is already online 2014-10-29 15:03:45,047 FATAL [Thread-216] regionserver.HRegionServer(1937): RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2014-10-29 15:03:45,054 WARN [Thread-216] regionserver.HRegionServer(1955): Unable to report fatal error to master com.google.protobuf.ServiceException: java.io.IOException: Call to /192.168.1.105:63079 failed on local exception: java.io.IOException: Connection to /192.168.1.105:63079 is closing. 
Call id=4, waitTime=2
  at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1707)
  at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1757)
  at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.reportRSFatalError(RegionServerStatusProtos.java:8301)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:1952)
  at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.abortRegionServer(MiniHBaseCluster.java:174)
  at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$100(MiniHBaseCluster.java:108)
  at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$2.run(MiniHBaseCluster.java:167)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:356)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
  at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:277)
  at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.abort(MiniHBaseCluster.java:165)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:1964)
  at org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:1308)
  at org.apache.hadoop.hbase.regionserver.TestRegionServerNoMaster.testMultipleOpen(TestRegionServerNoMaster.java:237) at
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190804#comment-14190804 ] Gaurav Menghani commented on HBASE-10201: - [~Apache9] Great work porting this patch! Glad to see this getting ported from 0.89-fb to trunk :) Please let me know if you need any help. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
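The per-column-family decision being ported can be sketched in isolation. This is only an illustration of the idea, not the HBASE-10201 patch itself; {{PerFamilyFlushSketch}}, {{familiesToFlush}}, and both thresholds are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: pick families to flush by their own memstore size instead of the
// aggregate size across all families, so a large family no longer forces
// many small flushes of a small family.
public class PerFamilyFlushSketch {
    // Flush only families whose memstore exceeds the per-family threshold.
    // If none qualifies but the aggregate is over the region limit, fall
    // back to flushing everything (the old aggregate behavior).
    public static List<String> familiesToFlush(Map<String, Long> memstoreSizes,
                                               long perFamilyThreshold,
                                               long regionLimit) {
        List<String> selected = new ArrayList<>();
        long total = 0;
        for (Map.Entry<String, Long> e : memstoreSizes.entrySet()) {
            total += e.getValue();
            if (e.getValue() >= perFamilyThreshold) {
                selected.add(e.getKey());
            }
        }
        if (selected.isEmpty() && total >= regionLimit) {
            selected.addAll(memstoreSizes.keySet());
        }
        return selected;
    }
}
```

With a 100-byte large family and a 1-byte small family, only the large one is selected, which is exactly the behavior the issue asks for.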
[jira] [Commented] (HBASE-12384) TestTags can hang on fast test hosts
[ https://issues.apache.org/jira/browse/HBASE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190814#comment-14190814 ] Hudson commented on HBASE-12384: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #611 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/611/]) HBASE-12384 TestTags can hang on fast test hosts (apurtell: rev 415b8ff438b9bdc5c2e1b15a9b6de81bb472fe24) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestTags.java TestTags can hang on fast test hosts Key: HBASE-12384 URL: https://issues.apache.org/jira/browse/HBASE-12384 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12384-0.98.patch, HBASE-12384-master.patch Waiting indefinitely expecting flushed files to reach a certain count after triggering a flush, but compaction has happened between the flush and the check for the number of store files. {code}
admin.flush(tableName);
regions = TEST_UTIL.getHBaseCluster().getRegions(tableName);
for (HRegion region : regions) {
  Store store = region.getStore(fam);
  // <-- Flush and compaction have happened before here
  while (!(store.getStorefilesCount() > 2)) { // <-- Hung forever in here
    Thread.sleep(10);
  }
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
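The general fix pattern for this kind of hang is to bound the wait instead of looping forever on a condition that a concurrent compaction can make unreachable. A minimal sketch, assuming a hypothetical {{BoundedWait}} helper rather than the actual TestTags change:

```java
import java.util.function.BooleanSupplier;

// Sketch: poll until a condition holds or a deadline passes. A test using
// this fails loudly on timeout instead of hanging like the while(true) loop
// quoted in the issue.
public class BoundedWait {
    public static boolean waitFor(BooleanSupplier condition,
                                  long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out; caller should fail the test here
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the return value (or accept a compacted store-file count as success) instead of sleeping in an unbounded loop.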
[jira] [Commented] (HBASE-12381) Add maven enforcer rules for build assumptions
[ https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190815#comment-14190815 ] Hudson commented on HBASE-12381: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #611 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/611/]) HBASE-12381 use the Maven Enforcer Plugin to check maven and java versions. (stack: rev 5874ae0add074b92827577173c9354a9fee671a6) * pom.xml Add maven enforcer rules for build assumptions -- Key: HBASE-12381 URL: https://issues.apache.org/jira/browse/HBASE-12381 Project: HBase Issue Type: Task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2 Attachments: HBASE-12381.1.patch.txt Our ref guide says that you need Maven 3 to build. Add an enforcer rule so that people find out early that they have the wrong Maven version, rather than however things fall over if someone tries to build with Maven 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190830#comment-14190830 ] Ted Yu commented on HBASE-10201: +1 on turning on this in master branch. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11835) Wrong managenement of non expected calls in the client
[ https://issues.apache.org/jira/browse/HBASE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190840#comment-14190840 ] Hudson commented on HBASE-11835: SUCCESS: Integrated in HBase-1.0 #392 (See [https://builds.apache.org/job/HBase-1.0/392/]) HBASE-11835 Wrong managenement of non expected calls in the client (Nicolas Liochon) (stack: rev 29d486ff4e9a8ed87532c434b3bb63fb58cf5310) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Wrong managenement of non expected calls in the client -- Key: HBASE-11835 URL: https://issues.apache.org/jira/browse/HBASE-11835 Project: HBase Issue Type: Bug Components: Client, Performance Affects Versions: 1.0.0, 2.0.0, 0.98.6 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 2.0.0, 0.99.2 Attachments: 11835.rebase.patch, 11835.rebase.patch, 11835.rebase.patch, rpcClient.patch If a call is purged or canceled we try to skip the reply from the server, but we read the wrong number of bytes, so we corrupt the TCP channel. It's hidden as it triggers retries and so on, but it's obviously bad for performance. It happens with cell blocks. [~ram_krish_86], [~saint@gmail.com], you know this part better than me, do you agree with the analysis and the patch? The changes in RpcServer are not fully related: as the client closes the connections in such situations, I observed both ClosedChannelException and CancelledKeyException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12387) committer guidelines should include patch signoff
Sean Busbey created HBASE-12387: --- Summary: committer guidelines should include patch signoff Key: HBASE-12387 URL: https://issues.apache.org/jira/browse/HBASE-12387 Project: HBase Issue Type: Task Components: documentation Reporter: Sean Busbey Right now our guide for committers applying patches has them use {{git am}} without a signoff flag. This works okay, but it misses adding the signed-off-by blurb in the commit message. Those messages make it easier to see at a glance with e.g. {{git log}} which committer applied the patch. This section: {quote} The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch. {code} $ git checkout -b HBASE- $ git am ~/Downloads/HBASE--v2.patch $ git checkout master $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary or ask the submitter to do it $ git pull --rebase # Better safe than sorry $ git push origin master $ git checkout branch-1 $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary $ git pull --rebase # Better safe than sorry $ git push origin branch-1 $ git branch -D HBASE- {code} {quote} Should be {quote} The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch. Note that the {{--signoff}} flag to {{git am}} will insert a line in the commit message that the patch was checked by your author string. This, in addition to your inclusion as the commit's committer, makes your participation more prominent to users browsing {{git log}}. 
{code} $ git checkout -b HBASE- $ git am --signoff ~/Downloads/HBASE--v2.patch $ git checkout master $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary or ask the submitter to do it $ git pull --rebase # Better safe than sorry $ git push origin master $ git checkout branch-1 $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary $ git pull --rebase # Better safe than sorry $ git push origin branch-1 $ git branch -D HBASE- {code} {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff
[ https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190878#comment-14190878 ] stack commented on HBASE-12387: --- +1 I thought we had text with signoff previously (I read something by [~busbey] to this effect and is why I do signoff if patch format allows over last month or more...) committer guidelines should include patch signoff - Key: HBASE-12387 URL: https://issues.apache.org/jira/browse/HBASE-12387 Project: HBase Issue Type: Task Components: documentation Reporter: Sean Busbey Right now our guide for committers apply patches has them use {{git am}} without a signoff flag. This works okay, but it misses adding the signed-off-by blurb in the commit message. Those messages make it easier to see at a glance with e.g. {{git log}} which committer applied the patch. this section: {quote} The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch. {code} $ git checkout -b HBASE- $ git am ~/Downloads/HBASE--v2.patch $ git checkout master $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary or ask the submitter to do it $ git pull --rebase # Better safe than sorry $ git push origin master $ git checkout branch-1 $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary $ git pull --rebase # Better safe than sorry $ git push origin branch-1 $ git branch -D HBASE- {code} {quote} Should be {quote} The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch. Note that the {{--signoff}} flag to {{git am}} will insert a line in the commit message that the patch was checked by your author string. 
This, in addition to your inclusion as the commit's committer, makes your participation more prominent to users browsing {{git log}}. {code} $ git checkout -b HBASE- $ git am --signoff ~/Downloads/HBASE--v2.patch $ git checkout master $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary or ask the submitter to do it $ git pull --rebase # Better safe than sorry $ git push origin master $ git checkout branch-1 $ git pull --rebase $ git cherry-pick sha-from-commit # Resolve conflicts if necessary $ git pull --rebase # Better safe than sorry $ git push origin branch-1 $ git branch -D HBASE- {code} {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12219: -- Attachment: HBASE-12219-v1.patch Reapply so get another hadoopqa run Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or concurrent clients this can be too aggressive to HDFS and the master, causing contention when processing other requests. A simple solution is to have a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that allows the master to process those calls faster, without causing contention and without having to perform a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
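The TTL idea described above can be sketched generically. This is a hedged illustration, not the FSTableDescriptors patch; the class and method names are hypothetical stand-ins, with the loader playing the role of the HDFS round trip:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch: serve a cached value until it is older than the TTL, so repeated
// lookups within the TTL window don't each pay the cost of reloading
// (e.g. a modtime check against HDFS).
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final ConcurrentHashMap<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // Reload through 'loader' only when the cached entry is missing or expired.
    public V get(K key, Supplier<V> loader) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(key);
        if (e == null || now - e.loadedAt > ttlMillis) {
            e = new Entry<>(loader.get(), now);
            cache.put(key, e);
        }
        return e.value;
    }
}
```

The trade-off is staleness bounded by the TTL, in exchange for far fewer trips to the backing store under concurrent load.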
[jira] [Resolved] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-11819. --- Resolution: Fixed Fix Version/s: (was: 0.98.8) Hadoop Flags: Reviewed Applied to branch-1+. Thanks for the patch [~talat] What I applied differed in the formatting. I removed white space too. See d719ea1db795686f893cba8ad85d16b3e136e89b and compare to what you posted so you know for next time. Good on you. Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11835) Wrong managenement of non expected calls in the client
[ https://issues.apache.org/jira/browse/HBASE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190891#comment-14190891 ] Hudson commented on HBASE-11835: FAILURE: Integrated in HBase-TRUNK #5725 (See [https://builds.apache.org/job/HBase-TRUNK/5725/]) HBASE-11835 Wrong managenement of non expected calls in the client (Nicolas Liochon) (stack: rev 9f4b6ac06c35fb18acd1951382da024780e01e2f) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Wrong managenement of non expected calls in the client -- Key: HBASE-11835 URL: https://issues.apache.org/jira/browse/HBASE-11835 Project: HBase Issue Type: Bug Components: Client, Performance Affects Versions: 1.0.0, 2.0.0, 0.98.6 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 2.0.0, 0.99.2 Attachments: 11835.rebase.patch, 11835.rebase.patch, 11835.rebase.patch, rpcClient.patch If a call is purged or canceled we try to skip the reply from the server, but we read the wrong number of bytes so we corrupt the tcp channel. It's hidden as it triggers retry and so on, but it's bad for performances obviously. It happens with cell blocks. [~ram_krish_86], [~saint@gmail.com], you know this part better than me, do you agree with the analysis and the patch? The changes in rpcServer are not fully related: as the client close the connections in such situation, I observed both ClosedChannelException and CancelledKeyException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190894#comment-14190894 ] Hudson commented on HBASE-11819: FAILURE: Integrated in HBase-1.0 #393 (See [https://builds.apache.org/job/HBase-1.0/393/]) HBASE-11819 Unit test for CoprocessorHConnection (Talat Uyarer) (stack: rev d719ea1db795686f893cba8ad85d16b3e136e89b) * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorHConnection.java Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-12219: -- Attachment: HBASE-12219.v2.patch New patch, fixed some nits. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or concurrent clients this can be too aggressive to HDFS and the master, causing contention when processing other requests. A simple solution is to have a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that allows the master to process those calls faster, without causing contention and without having to perform a trip to HDFS for every call to listtables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Attachment: HBASE-12386.patch Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12388) Document that WALObservers don't get empty edits.
Sean Busbey created HBASE-12388: --- Summary: Document that WALObservers don't get empty edits. Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Status: Patch Available (was: Open) Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190909#comment-14190909 ] Talat UYARER commented on HBASE-11819: -- Thanks for reviewing and committing, [~stack]. I compared those. I hope next time my patch will have good formatting. Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12377) HBaseAdmin#deleteTable fails when META region is moved around the same timeframe
[ https://issues.apache.org/jira/browse/HBASE-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-12377: --- Attachment: HBASE-12377.v2-2.0.patch The V2 patch is attached. This is based on Enis's feedback to refactor the code that checks whether the deleted table is still visible in the master. The description of the patch: HBaseAdmin.deleteTable implemented its own logic to obtain all regions of a table; this is not robust (a few issues were found in that logic: NotServingRegionException was not handled for retry, a stale meta cache was used, etc.). The change in this patch is to use the proven MetaScanner.listTableRegionLocations method to find all non-archived regions of a table. The patch also simplifies the code in the function to make it more maintainable (e.g. using HConnection.getHTableDescriptorsByTableName instead of duplicating the same logic in the function). HBaseAdmin#deleteTable fails when META region is moved around the same timeframe Key: HBASE-12377 URL: https://issues.apache.org/jira/browse/HBASE-12377 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.4 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12377.v1-2.0.patch, HBASE-12377.v2-2.0.patch This is the same issue that HBASE-10809 tried to address. The fix of HBASE-10809 refetches the latest meta location in the retry loop. However, there are 2 problems: (1) inside the retry loop, there is another try-catch block that would throw the exception before retry can kick in; (2) it looks like HBaseAdmin::getFirstMetaServerForTable() always tries to get meta data from the meta cache, which means that if the meta cache is stale and out of date, retries would not solve the problem because they keep fetching from the stale meta cache. 
Here is the call stack of the issue: {noformat}
2014-10-27 10:11:58,495|beaver.machine|INFO|18218|140065036261120|MainThread|org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-0-48.ec2.internal,60020,1414403435009
2014-10-27 10:11:58,496|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2774)
2014-10-27 10:11:58,496|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4257)
2014-10-27 10:11:58,497|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3156)
2014-10-27 10:11:58,497|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
2014-10-27 10:11:58,498|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
2014-10-27 10:11:58,498|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at java.lang.Thread.run(Thread.java:745)
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|at sun.reflect.GeneratedConstructorAccessor12.newInstance(Unknown Source)
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
2014-10-27 10:11:58,501|beaver.machine|INFO|18218|140065036261120|MainThread|at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
2014-10-27 10:11:58,501|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
2014-10-27 10:11:58,502|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
2014-10-27 10:11:58,502|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:306)
2014-10-27
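The retry shape the patch description calls for can be sketched on its own. This is a hedged illustration with hypothetical names ({{RetryLoop}}, {{callWithRetries}}), not the HBASE-12377 patch; the key point is that the retryable exception is caught inside the loop, unlike the earlier code where an inner try-catch rethrew before the retry could kick in:

```java
import java.util.concurrent.Callable;

// Sketch: retry a meta operation (e.g. listing a table's regions) that can
// transiently fail with something like NotServingRegionException while the
// meta region moves.
public class RetryLoop {
    public static <T> T callWithRetries(Callable<T> op, int maxAttempts, long pauseMillis) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call(); // each attempt should refetch the location, not a cached one
            } catch (Exception e) { // caught HERE, so the next attempt actually runs
                last = e;
                try {
                    Thread.sleep(pauseMillis); // back off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new RuntimeException("all " + maxAttempts + " attempts failed", last);
    }
}
```

This also shows why a stale cache defeats retries: if each attempt reads the same cached location, retrying only repeats the same failure.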
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Attachment: (was: HBASE-12386.patch) Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Status: Open (was: Patch Available) Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Attachment: HBASE-12386.patch Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Muraru updated HBASE-12386: -- Status: Patch Available (was: Open) Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error replication gets stuck and remote peers are never updated. Source region servers are reporting continuously the following error in logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190917#comment-14190917 ] Jerry He commented on HBASE-12346: -- Sounds good. I will have a patch and test it out. Thanks! Scan's default auths behavior under Visibility labels - Key: HBASE-12346 URL: https://issues.apache.org/jira/browse/HBASE-12346 Project: HBase Issue Type: Bug Components: API, security Affects Versions: 0.98.7, 0.99.1 Reporter: Jerry He Fix For: 0.98.8, 0.99.2 Attachments: HBASE-12346-master-v2.patch, HBASE-12346-master.patch In Visibility Labels security, a set of labels (auths) is administered and associated with a user. A user can normally only see cell data during a scan that is part of the user's label set (auths). A scan uses setAuthorizations to indicate it wants to use those auths to access the cells. Similarly in the shell: {code}
scan 'table1', AUTHORIZATIONS => ['private']
{code} But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to call setAuthorizations before the scan can get any cells, even when the cells are under labels the requesting user is part of. The following steps will illustrate the issue: Run as superuser. {code}
1. create a visibility label called 'private'
2. create 'table1'
3. put into 'table1' data and label the data as 'private'
4. set_auths 'user1', 'private'
5. grant 'user1', 'RW', 'table1'
{code} Run as 'user1': {code}
1. scan 'table1'
   This shows no cells.
2. scan 'table1', AUTHORIZATIONS => ['private']
   This will show all the data.
{code} I am not sure if this is expected by design or a bug. 
But a more reasonable, more backward compatible (for client applications), and less surprising default behavior should probably look like this: a scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user is administered and allowed on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: use the input auths minus whatever is not in the user's label set on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
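The proposed default reduces to a set intersection. A minimal sketch of that rule, with hypothetical names (this is not the VisibilityController API):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the proposal: with no explicit authorizations, a scan uses
// everything the user was granted; with explicit authorizations, the server
// keeps only the labels the user actually holds.
public class EffectiveAuths {
    public static Set<String> effective(Set<String> requested, Set<String> userAuths) {
        if (requested == null || requested.isEmpty()) {
            return new HashSet<>(userAuths); // default: all of the user's labels
        }
        Set<String> result = new HashSet<>(requested);
        result.retainAll(userAuths);         // drop labels the user doesn't hold
        return result;
    }
}
```

Under this rule, the {{scan 'table1'}} step in the repro above would return the 'private' cells for user1 without an explicit AUTHORIZATIONS argument.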
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190919#comment-14190919 ] stack commented on HBASE-12285: --- What do you think [~dimaspivak]? I'm thinking we resolve this as fixed by the linked too-much-logging fix you did and the update to Surefire 2.18-SNAPSHOT. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190926#comment-14190926 ] Ted Yu commented on HBASE-12386: {code} +if (endpoint.getLastRegionServerUpdate() > this.lastUpdateToPeers || sinks.isEmpty()) { + LOG.info("Current list of sinks is out of date or empty, updating"); {code} It would be helpful if the condition (list out of date or empty) is stated clearly in the log message. Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error, replication gets stuck and remote peers are never updated. Source region servers are continuously reporting the following error in their logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
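The check quoted above can be sketched with the log message split per condition, in the spirit of the review comment. This is an illustrative, self-contained sketch; the method and parameter names only mirror the quoted patch and are not HBase's actual code.

```java
public class SinkRefresh {
    // Hypothetical sketch: decide whether the sink list needs refreshing, and
    // say in the message which condition triggered it (empty vs out of date).
    static String refreshReason(long lastEndpointUpdate, long lastUpdateToPeers, int sinkCount) {
        if (sinkCount == 0) {
            return "Current list of sinks is empty, updating";
        }
        if (lastEndpointUpdate > lastUpdateToPeers) {
            // a region server on the peer reported in after we last chose sinks
            return "Current list of sinks is out of date, updating";
        }
        return null; // no refresh needed
    }
}
```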
[jira] [Updated] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak updated HBASE-12285: Resolution: Fixed Fix Version/s: 1.0.0 Status: Resolved (was: Patch Available) Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 1.0.0 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190928#comment-14190928 ] Dima Spivak commented on HBASE-12285: - Sounds good, [~stack]. It looks good except for a {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18-SNAPSHOT:test (secondPartTestsExecution) on project hbase-server: There was a timeout or other error in the fork - [Help 1] {code} that happened a little while back, but I'll go ahead and close this since I think the SUREFIRE-1091 bug is no longer the problem. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 1.0.0 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-12388: Status: Patch Available (was: Open) Document that WALObservers don't get empty edits. - Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-12388: Attachment: HBASE-12388.1.patch.txt added comment to the javadoc about the behavior and a test that verifies it. Document that WALObservers don't get empty edits. - Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12389) Reduce the number of versions configured for the ACL table
Andrew Purtell created HBASE-12389: -- Summary: Reduce the number of versions configured for the ACL table Key: HBASE-12389 URL: https://issues.apache.org/jira/browse/HBASE-12389 Project: HBase Issue Type: Improvement Components: security Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 We recently reduced the number of versions kept for entries in META from 10 to 3. This same arbitrary constant was used for the ACL table definition in AccessControlLists. We should change this to 3 also, or even 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9527) Review all old api that takes a table name as a byte array and ensure none can pass ns + tablename
[ https://issues.apache.org/jira/browse/HBASE-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190940#comment-14190940 ] stack commented on HBASE-9527: -- We used to specify table names with a byte array or String. In 0.96 we changed it to a TableName class because we added namespaces. I think the worry here is that in a few places we let through 'ns:tablename' -- a String or byte array with the delimiter in it -- which was a workaround for getting namespaces in, but we'd like to purge that now. Review all old api that takes a table name as a byte array and ensure none can pass ns + tablename -- Key: HBASE-9527 URL: https://issues.apache.org/jira/browse/HBASE-9527 Project: HBase Issue Type: Bug Reporter: stack Assignee: Talat UYARER Priority: Critical Fix For: 0.99.2 Go over all old APIs that take a table name and ensure that it is not possible to pass in a byte array that is a namespace + tablename; instead throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
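The requested guard, rejecting any legacy byte[] table name that smuggles in a namespace delimiter, could look like this. A hypothetical sketch only: the class and method names are illustrative, not the patch's actual code.

```java
public class LegacyTableName {
    // ':' separates namespace from table name in 'ns:tablename' strings.
    static final byte DELIMITER = ':';

    // Hypothetical sketch of the check the issue asks for: old byte[]-based
    // APIs should throw rather than silently accept a namespaced name.
    static byte[] requirePlainTableName(byte[] name) {
        for (byte b : name) {
            if (b == DELIMITER) {
                throw new IllegalArgumentException(
                    "Namespaced names are not supported by this legacy API; use TableName instead");
            }
        }
        return name;
    }
}
```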
[jira] [Updated] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12285: -- Fix Version/s: (was: 1.0.0) 0.99.2 2.0.0 Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12285_branch-1_v1.patch, HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9527) Review all old api that takes a table name as a byte array and ensure none can pass ns + tablename
[ https://issues.apache.org/jira/browse/HBASE-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190950#comment-14190950 ] Talat UYARER commented on HBASE-9527: - Thanks [~stack] for the introduction to the issue. You know what I need. :) Review all old api that takes a table name as a byte array and ensure none can pass ns + tablename -- Key: HBASE-9527 URL: https://issues.apache.org/jira/browse/HBASE-9527 Project: HBase Issue Type: Bug Reporter: stack Assignee: Talat UYARER Priority: Critical Fix For: 0.99.2 Go over all old APIs that take a table name and ensure that it is not possible to pass in a byte array that is a namespace + tablename; instead throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190960#comment-14190960 ] Lars Hofhansl commented on HBASE-12386: --- {{Current list of sinks is out of date or empty, updating}} seems clear enough to me. +1 on patch. One thing we have to think through is what happens when the slave cluster is down for a bit. We'd choose sinks again on each call. I think that's OK, especially since we dialed down the retry interval to 5mins recently. Also, we can still end up in a bad situation where RegionServers die and restart at the slave cluster: we could go down to a single RS at the peer before we try to choose sinks again. That's for another issue. Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error, replication gets stuck and remote peers are never updated. Source region servers are continuously reporting the following error in their logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11760) Tighten up region state transition
[ https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190975#comment-14190975 ] Hudson commented on HBASE-11760: FAILURE: Integrated in HBase-TRUNK #5726 (See [https://builds.apache.org/job/HBase-TRUNK/5726/]) TestRegionServerNoMaster#testMultipleOpen is flaky after HBASE-11760 (jxiang: rev 7886c0b82f0be9ef6536b6972e4487c232aaa1e3) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerNoMaster.java Tighten up region state transition -- Key: HBASE-11760 URL: https://issues.apache.org/jira/browse/HBASE-11760 Project: HBase Issue Type: Improvement Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11760.patch, hbase-11760_2.1.patch, hbase-11760_2.2.patch, hbase-11760_2.patch, rsm-2.pdf, rsm.pdf, rsm.png When a regionserver reports to master a region transition, we should check the current region state to be exactly what we expect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190976#comment-14190976 ] Hudson commented on HBASE-11819: FAILURE: Integrated in HBase-TRUNK #5726 (See [https://builds.apache.org/job/HBase-TRUNK/5726/]) HBASE-11819 Unit test for CoprocessorHConnection (Talat Uyarer) (stack: rev a404db52ec8099d08bd96455f7ceb1412cd9bb22) * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorHConnection.java Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191006#comment-14191006 ] Hudson commented on HBASE-11819: SUCCESS: Integrated in HBase-1.0 #394 (See [https://builds.apache.org/job/HBase-1.0/394/]) HBASE-11819 Unit test for CoprocessorHConnection (Talat Uyarer) -- ADDENDUM testannoations fixup (stack: rev 49fb89f1512e0a7b89d1130a54420b91f3da0299) * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorHConnection.java Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191008#comment-14191008 ] stack commented on HBASE-10201: --- I wrote the list to get another opinion on the patch pre-commit. Has this patch been deployed somewhere in production (smile?). If so, would be good to know. In production, it helps? On rereview: This value should be less than half of the total memstore threshold (hbase.hregion.memstore.flush.size). Do we ensure this in code? If not, should we? bq. I think it is better to open another issue to handle the duplication. Can you do this for the accounting fixup so it is by-Store in HLog. We should log when we do this: +long columnfamilyFlushSize = this.htableDescriptor +.getMemStoreColumnFamilyFlushSize(); +if (columnfamilyFlushSize <= 0) { + columnfamilyFlushSize = conf.getLong( + HConstants.HREGION_MEMSTORE_COLUMNFAMILY_FLUSH_SIZE_LOWER_BOUND, + HTableDescriptor.DEFAULT_MEMSTORE_COLUMNFAMILY_FLUSH_SIZE_LOWER_BOUND); I can add on commit unless we are doing a new version. This does not have to be public since it is used from the same package: + public long getEarliestFlushTimeForAllStores() { ditto this getLatestFlushTimeForAllStores And this ... isPerColumnFamilyFlushEnabled nit: Guard debug logging with an if LOG.isDebugEnabled... + LOG.debug("Since none of the CFs were above the size, flushing all."); When we flush, we write the sequenceid flush to WAL. This patch should have no effect on it. Sequenceids are region scoped. If we flush by Store, will there be holes in our accounting? For example, given 3 column families, A, B, and C. I write sequenceid 1 to A, sequenceid 2 to B, and sequenceid 3 to C. I then write sequenceid 4 to A. The edit at sequenceid 4 is big and pushes us over and brings on a flush. We flush A and edits 1 and 4. 
Is the fact that edits 2 and 3 are still up in memory going to mess us up? Say the server crashes: at replay time we see we flushed up to edit 4; will we think that edits 2 and 3 persisted? If you don't have an answer, I can work on the answer. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191015#comment-14191015 ] Gaurav Menghani commented on HBASE-10201: - [~stack] From my design, in this case, 1 and 4 are flushed, but 2 and 3 are retained in memory. But we can only mark 1 as safe. 2, 3 and 4 will all be replayed if the server crashes. I am not sure if this has changed in the patch. The per-CF change is not running in prod right now. I didn't see any big difference deploying it out of the box with the biggest customer, where we have a lot of CFs (probably also highlighted by the small difference in WAF). But I can try running it internally on a shadow cluster again. Let me know if there are some interesting metrics you want me to look at. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
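The sequenceid accounting question above can be made concrete with a small sketch. This is a hypothetical illustration of the invariant being discussed, not HBase's actual WAL accounting: with per-store flushes, the WAL can only be marked safe up to one less than the lowest unflushed sequenceid, so in the A/B/C example only edit 1 is safe and edits 2, 3, and 4 must replay after a crash.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FlushSafePoint {
    // Hypothetical sketch: given the unflushed sequenceids per store, the
    // safe recovery point is (lowest unflushed seqid - 1); everything at or
    // above the lowest unflushed seqid must be replayed, even if some of
    // those edits (like 4 in store A) already reached disk.
    static long safeSeqId(Map<String, List<Long>> unflushedByStore) {
        long min = Long.MAX_VALUE;
        for (List<Long> seqIds : unflushedByStore.values()) {
            for (long id : seqIds) min = Math.min(min, id);
        }
        return min == Long.MAX_VALUE ? Long.MAX_VALUE : min - 1;
    }

    public static void main(String[] args) {
        // Edits: 1->A, 2->B, 3->C, 4->A; only store A was flushed.
        Map<String, List<Long>> unflushed = new HashMap<>();
        unflushed.put("A", List.of());   // A flushed: 1 and 4 are on disk
        unflushed.put("B", List.of(2L));
        unflushed.put("C", List.of(3L));
        System.out.println(safeSeqId(unflushed)); // only seqid 1 is safe
    }
}
```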
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191019#comment-14191019 ] Hadoop QA commented on HBASE-12219: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678315/HBASE-12219-v1.patch against trunk revision . ATTACHMENT ID: 12678315 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3776 checkstyle errors (more than the trunk's current 3774 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + tds.put(this.metaTableDescritor.getNameAsString(), new TableDescriptor(metaTableDescritor, TableState.State.ENABLED)); + public static TableDescriptor getTableDescriptorFromFs(FileSystem fs, Path tableDir, boolean rewritePb) +FSTableDescriptors htds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false); +FSTableDescriptors nonchtds = new FSTableDescriptorsTest(UTIL.getConfiguration(), fs, rootdir, false, false); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11525//console This message is automatically generated. 
Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or concurrent clients, this can be too aggressive to HDFS and
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191031#comment-14191031 ] Hadoop QA commented on HBASE-12219: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678319/HBASE-12219.v2.patch against trunk revision . ATTACHMENT ID: 12678319 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.util.TestFSTableDescriptors Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11526//console This message is automatically 
generated. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modified time in order to reload the table descriptors if modified. However, in clusters with a large number of tables or concurrent clients, this can be too aggressive to HDFS and the master, causing contention when processing other requests. A simple solution is to have a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that can allow the master to process those calls faster, without causing contention and without having to perform a trip to HDFS for every call to listTables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
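The TTL-based caching the description proposes can be sketched generically. This is an illustrative sketch, not the patch's code; the class and method names are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of TTL-based caching: a cached value is reused until
// its entry is older than ttlMs, so repeated getAll()/get() style calls skip
// the expensive lookup (the HDFS modified-time check, in this issue's case).
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMs;

    public TtlCache(long ttlMs) { this.ttlMs = ttlMs; }

    public V get(K key, Supplier<V> loader) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(key);
        if (e == null || now - e.loadedAt > ttlMs) {
            e = new Entry<>(loader.get(), now);  // expired or missing: reload
            cache.put(key, e);
        }
        return e.value;
    }
}
```

The trade-off is staleness: within the TTL window the master may serve a descriptor that was just modified on HDFS, which is the price the issue accepts for reduced contention.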
[jira] [Commented] (HBASE-12377) HBaseAdmin#deleteTable fails when META region is moved around the same timeframe
[ https://issues.apache.org/jira/browse/HBASE-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191035#comment-14191035 ] Enis Soztutar commented on HBASE-12377: --- Can we change this to be a private method, and remove the corresponding interface in Admin? Having getTableDescriptor(), which throws TableNotFoundException, and this new method, which returns null instead, will be confusing to users. {code} + public HTableDescriptor getTableDescriptorByTableName(TableName tableName) {code} Other than that, looks good. HBaseAdmin#deleteTable fails when META region is moved around the same timeframe Key: HBASE-12377 URL: https://issues.apache.org/jira/browse/HBASE-12377 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.4 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12377.v1-2.0.patch, HBASE-12377.v2-2.0.patch This is the same issue that HBASE-10809 tried to address. The fix of HBASE-10809 refetches the latest meta location in the retry loop. However, there are 2 problems: (1) inside the retry loop there is another try-catch block that throws the exception before retry can kick in; (2) it looks like HBaseAdmin::getFirstMetaServerForTable() always tries to get meta data from the meta cache, which means that if the meta cache is stale and out of date, retries would not solve the problem because they keep fetching from the stale meta cache. 
Here is the call stack of the issue: {noformat}
2014-10-27 10:11:58,495|beaver.machine|INFO|18218|140065036261120|MainThread|org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-0-48.ec2.internal,60020,1414403435009
2014-10-27 10:11:58,496|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2774)
2014-10-27 10:11:58,496|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4257)
2014-10-27 10:11:58,497|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3156)
2014-10-27 10:11:58,497|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
2014-10-27 10:11:58,498|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
2014-10-27 10:11:58,498|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
2014-10-27 10:11:58,499|beaver.machine|INFO|18218|140065036261120|MainThread|at java.lang.Thread.run(Thread.java:745)
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|at sun.reflect.GeneratedConstructorAccessor12.newInstance(Unknown Source)
2014-10-27 10:11:58,500|beaver.machine|INFO|18218|140065036261120|MainThread|at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
2014-10-27 10:11:58,501|beaver.machine|INFO|18218|140065036261120|MainThread|at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
2014-10-27 10:11:58,501|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
2014-10-27 10:11:58,502|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
2014-10-27 10:11:58,502|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:306)
2014-10-27 10:11:58,502|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:699)
2014-10-27 10:11:58,503|beaver.machine|INFO|18218|140065036261120|MainThread|at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:654)
2014-10-27
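The API concern Enis raises can be sketched in miniature: two lookups for the same descriptor with divergent not-found contracts, one throwing and one returning null. This is an illustrative standalone sketch; the class and method bodies are stand-ins, not the actual HBase client code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: contrasting "absence throws" with "absence is null".
public class DescriptorLookup {
    static class TableNotFoundException extends RuntimeException {
        TableNotFoundException(String name) { super(name); }
    }

    private final Map<String, String> descriptors = new HashMap<>();

    DescriptorLookup() { descriptors.put("t1", "descriptor-of-t1"); }

    // Public contract: absence is an error the caller must handle.
    public String getTableDescriptor(String name) {
        String d = descriptors.get(name);
        if (d == null) throw new TableNotFoundException(name);
        return d;
    }

    // The variant under review: absence is signalled by null. Keeping this
    // private (an implementation detail) avoids exposing both contracts.
    private String getTableDescriptorByTableName(String name) {
        return descriptors.get(name);
    }

    public static void main(String[] args) {
        DescriptorLookup admin = new DescriptorLookup();
        System.out.println(admin.getTableDescriptor("t1"));
        System.out.println(admin.getTableDescriptorByTableName("nope")); // null
        try {
            admin.getTableDescriptor("nope");
        } catch (TableNotFoundException e) {
            System.out.println("not found: " + e.getMessage());
        }
    }
}
```

Exposing both shapes publicly forces every caller to remember which method has which contract; making the null-returning one private sidesteps that.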
[jira] [Commented] (HBASE-12377) HBaseAdmin#deleteTable fails when META region is moved around the same timeframe
[ https://issues.apache.org/jira/browse/HBASE-12377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191047#comment-14191047 ] Hadoop QA commented on HBASE-12377: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678322/HBASE-12377.v2-2.0.patch against trunk revision . ATTACHMENT ID: 12678322 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat org.apache.hadoop.hbase.coprocessor.TestCoprocessorHConnection Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11527//artifact/patchprocess/checkstyle-aggregate.html Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11527//console This message is automatically generated. HBaseAdmin#deleteTable fails when META region is moved around the same timeframe Key: HBASE-12377 URL: https://issues.apache.org/jira/browse/HBASE-12377 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.4 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12377.v1-2.0.patch, HBASE-12377.v2-2.0.patch This is the same issue that HBASE-10809 tried to address. The fix in HBASE-10809 refetches the latest meta location in the retry loop. However, there are two problems: (1) inside the retry loop there is another try-catch block that throws the exception before the retry can kick in; (2) it looks like HBaseAdmin::getFirstMetaServerForTable() always reads the meta location from the meta cache, which means that if the cache is stale, retries will not help because they keep fetching from the same stale cache.
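Problem (1) from the description, an inner try-catch that defeats the surrounding retry loop, can be sketched as follows. The names and the simulated exception are illustrative, not the actual HBaseAdmin code:

```java
// Hedged sketch: why an inner catch that rethrows prevents retries.
public class RetryLoopSketch {
    static int attempts = 0;

    // Stand-in for a meta scan that fails transiently, then recovers.
    static void scanMeta() throws Exception {
        attempts++;
        if (attempts < 3) throw new Exception("NotServingRegionException");
    }

    // Broken shape: the inner catch rethrows, so the loop never retries.
    static void brokenDelete() throws Exception {
        for (int tries = 0; tries < 5; tries++) {
            try {
                scanMeta();
                return;
            } catch (Exception e) {
                throw e; // escapes the loop on the first failure
            }
        }
    }

    // Fixed shape: swallow the retriable error and try again.
    static void retryingDelete() throws Exception {
        Exception last = null;
        for (int tries = 0; tries < 5; tries++) {
            try {
                scanMeta(); // in real code: refetch the meta location first
                return;
            } catch (Exception e) {
                last = e; // retriable: fall through to the next attempt
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        attempts = 0;
        boolean escaped = false;
        try { brokenDelete(); } catch (Exception e) { escaped = true; }
        System.out.println("broken escaped after attempts=" + attempts + ": " + escaped);
        attempts = 0;
        retryingDelete();
        System.out.println("retrying succeeded after attempts=" + attempts);
    }
}
```

Fixing (2) additionally requires the retry path to bypass the stale meta cache, which the sketch only notes in a comment.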
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191048#comment-14191048 ] Hadoop QA commented on HBASE-12386: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678323/HBASE-12386.patch against trunk revision . ATTACHMENT ID: 12678323 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestCoprocessorHConnection Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11528//console This message is 
automatically generated. Replication gets stuck following a transient zookeeper error to remote peer cluster --- Key: HBASE-12386 URL: https://issues.apache.org/jira/browse/HBASE-12386 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.7 Reporter: Adrian Muraru Attachments: HBASE-12386.patch Following a transient ZK error, replication gets stuck and remote peers are never updated. Source region servers continuously report the following error in their logs: No replication sinks are available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
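The failure mode described above can be sketched as follows: a sink list computed once, where a transient ZK error that leaves it empty is never recovered from. All names here are illustrative stand-ins, not the actual replication sink manager code or the attached patch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hedged sketch: compute-once sink selection vs. refresh-on-empty.
public class SinkRefreshSketch {
    private List<String> sinks = new ArrayList<>();
    private boolean zkHealthy = false; // simulates the transient ZK error

    // Stand-in for listing peer-cluster region servers via ZK.
    private List<String> fetchSlavesFromZk() {
        return zkHealthy ? Arrays.asList("peer-rs1", "peer-rs2") : new ArrayList<>();
    }

    // Broken shape: compute once; an empty result sticks forever.
    void chooseSinksOnce() { sinks = fetchSlavesFromZk(); }

    // Fixed shape: if no sinks are available, re-read peer state before failing.
    String getSink() {
        if (sinks.isEmpty()) {
            sinks = fetchSlavesFromZk(); // refresh after the transient error
        }
        if (sinks.isEmpty()) throw new IllegalStateException("No replication sinks are available");
        return sinks.get(0);
    }

    public static void main(String[] args) {
        SinkRefreshSketch m = new SinkRefreshSketch();
        m.chooseSinksOnce();   // ZK error: sink list ends up empty
        m.zkHealthy = true;    // ZK recovers
        System.out.println(m.getSink()); // refresh path finds a sink again
    }
}
```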
[jira] [Commented] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191063#comment-14191063 ] Hadoop QA commented on HBASE-12388: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678326/HBASE-12388.1.patch.txt against trunk revision . ATTACHMENT ID: 12678326 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11529//console This message is automatically generated. Document that WALObservers don't get empty edits. 
- Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
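The behavior being documented can be illustrated with a simplified stand-in for the WAL append path; the types below are not the real coprocessor interfaces, just a sketch of the branch-1+ edge case:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: in branch-1+, observers are not notified for empty WALEdits.
public class EmptyEditSketch {
    static class WALEdit {
        final List<String> cells = new ArrayList<>();
        boolean isEmpty() { return cells.isEmpty(); }
    }

    interface WALObserver { void visitLogEntryBeforeWrite(WALEdit edit); }

    static void append(WALEdit edit, List<WALObserver> observers) {
        if (edit.isEmpty()) {
            return; // observers never see empty edits
        }
        for (WALObserver o : observers) o.visitLogEntryBeforeWrite(edit);
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        List<WALObserver> observers = new ArrayList<>();
        observers.add(e -> seen.add("notified:" + e.cells.size()));

        append(new WALEdit(), observers); // empty edit: skipped
        WALEdit nonEmpty = new WALEdit();
        nonEmpty.cells.add("cell");
        append(nonEmpty, observers);

        System.out.println(seen); // only the non-empty edit was observed
    }
}
```

A test that counts observer callbacks (as in the 0.98 behavior) would see one fewer invocation here, which matches the surprise described in the issue.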
[jira] [Commented] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191069#comment-14191069 ] Ted Yu commented on HBASE-12388: +1 Document that WALObservers don't get empty edits. - Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191080#comment-14191080 ] Esteban Gutierrez commented on HBASE-12219: --- v2 failed after turning off the cache in the FSTableDescriptors constructor. I think it should be fine to use v1 instead; I will upload a new patch. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modification time, in order to reload the table descriptors if they changed. However, in clusters with a large number of tables or concurrent clients, this can be too aggressive toward HDFS and the master, causing contention when processing other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that allows the master to serve those calls faster, without contention and without a trip to HDFS for every call to listTables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
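The proposed TTL-based cache might look roughly like this. This is a sketch under assumed names, not the attached patch: entries younger than the TTL are served from memory, and only expired entries trigger the (stubbed) HDFS lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch of a TTL cache in front of the HDFS descriptor lookup.
public class TtlDescriptorCache {
    static final long TTL_MS = 60_000;

    static class Entry {
        final String descriptor;
        final long fetchedAt;
        Entry(String d, long t) { descriptor = d; fetchedAt = t; }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    int hdfsTrips = 0; // visible for the demo below

    // Stand-in for the HDFS read plus modtime check.
    private String fetchFromFs(String table) {
        hdfsTrips++;
        return "descriptor-of-" + table;
    }

    public String get(String table, long nowMs) {
        Entry e = cache.get(table);
        if (e != null && nowMs - e.fetchedAt < TTL_MS) {
            return e.descriptor; // fresh enough: no HDFS trip
        }
        String d = fetchFromFs(table);
        cache.put(table, new Entry(d, nowMs));
        return d;
    }

    public static void main(String[] args) {
        TtlDescriptorCache c = new TtlDescriptorCache();
        c.get("t1", 0);      // miss: goes to HDFS
        c.get("t1", 10_000); // within TTL: served from cache
        c.get("t1", 70_000); // TTL expired: goes to HDFS again
        System.out.println("hdfsTrips=" + c.hdfsTrips);
    }
}
```

The trade-off is that a descriptor change can go unnoticed for up to one TTL, which is why the TTL would need to be tunable.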
[jira] [Updated] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-12219: -- Attachment: HBASE-12219.v3.patch Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modification time, in order to reload the table descriptors if they changed. However, in clusters with a large number of tables or concurrent clients, this can be too aggressive toward HDFS and the master, causing contention when processing other requests. A simple solution is a TTL-based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() that allows the master to serve those calls faster, without contention and without a trip to HDFS for every call to listTables() or getTableDescriptor(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191091#comment-14191091 ] zhangduo commented on HBASE-10201: -- {quote} Sequenceids are region scoped. If we flush by Store, will there be holes in our accounting? I write sequenceid 1 to A, sequenceid 2 to B, and sequenceid 3 to C. I then write sequence 4 to A. The edit at sequenceid 4 is big and pushes us over and brings on a flush. We flush A and edits 1 and 4. Is the fact that edits 2 and 3 are still up in memory going to mess us up? Say the server crashes: at replay time we see we flushed up to edit 4, will we think that edits 2 and 3 persisted? If you don't have an answer, I can work on the answer. {quote} Yes, we write flush seqId 1 in this case (oh, I made a mistake: I write seqId 2 in this case; flushSeqId = oldestSeqIdInStoresNotToFlush should be flushSeqId = oldestSeqIdInStoresNotToFlush - 1, I will fix it), so there will be holes and some WAL replay is unnecessary when doing recovery. We need to store a map of seqId per store instead of a single seqId to solve this, and we also need some work on log truncation and log replay. {quote} Has this patch been deployed somewhere in production (smile?). If so, would be good to know. In production, it helps? {quote} For me, no. I am using 0.98.6.1 with HBASE-12078 patched right now (so I first tried to port it to 0.98 in this issue...). Some test results are posted above. 
And in our production, I always see logs like this: {quote}
2014-09-29 13:16:25,061 INFO [MemStoreFlusher.0] regionserver.HRegion: Started memstore flush for sync:Snapshot,\x00\x00\x00\x00\x02$\x0CC,1411782012686.50aba6be7ff3150be983cb6fd77fc686., current region memstore size 128.3 M
2014-09-29 13:16:25,121 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=265.7 K, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/129e5ef69d7449fea9c2357aa6c4340a
2014-09-29 13:16:25,192 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=2.2 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/316fee39423142e09cdb767de9f9bc5d
2014-09-29 13:16:25,528 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=27.9 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/a886c1e39565468fbf93be6c434f5fc5
2014-09-29 13:16:26,190 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10932563, memsize=98.0 M, hasBloomFilter=true, into tmp file hdfs://online-hbase/hbase/data/sync/Snapshot/50aba6be7ff3150be983cb6fd77fc686/.tmp/ec722497c6e14d0fa732c2a9d29e3391
{quote} The smallest store is always flushed with only KBs. That's why I found this issue and started working on it... {quote} Can you do this for the accounting fixup so by-Store in HLog. {quote} Yes, I can open another issue to work on this. Thanks. 
Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
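The accounting hole discussed in the comments above can be made concrete with stack's example (edits 1 and 4 in store A, 2 in B, 3 in C, then flush only A). The structure below is an illustrative sketch, not the HBase implementation:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of per-store flush sequence-id accounting.
public class FlushSeqIdSketch {
    public static void main(String[] args) {
        Map<String, List<Long>> unflushed = new HashMap<>();
        unflushed.put("A", Arrays.asList(1L, 4L));
        unflushed.put("B", Arrays.asList(2L));
        unflushed.put("C", Arrays.asList(3L));

        // Flush only store A: its edits (1 and 4) become durable.
        unflushed.remove("A");

        // Safe single region-level number: everything strictly below the
        // oldest edit still in memory is durable. This is the
        // "oldestSeqIdInStoresNotToFlush - 1" correction from the comment.
        long oldestInMemory = unflushed.values().stream()
            .flatMap(List::stream).min(Long::compare).orElse(Long.MAX_VALUE);
        long flushSeqId = oldestInMemory - 1;
        System.out.println("flushSeqId=" + flushSeqId);

        // Edit 4 is on disk but above flushSeqId, so recovery would replay it
        // unnecessarily; a per-store seqId map would close that hole, at the
        // cost of extra work in log truncation and replay.
    }
}
```

Here the safe region-level answer is 1 (replay must start at edit 2), even though edit 4 is already on disk, which is exactly the unnecessary-replay hole the thread describes.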
[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangduo updated HBASE-10201: - Attachment: HBASE-10201_4.patch Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191117#comment-14191117 ] Hadoop QA commented on HBASE-10201: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678367/HBASE-10201_4.patch against trunk revision . ATTACHMENT ID: 12678367 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 25 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11531//console This message is automatically generated. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9003: Attachment: HBASE-9003.v3.patch The patch doesn't apply cleanly for me, and the renames are masking whatever content changes were made. I took a shot at cleaning it up and generated it with {{git diff -M90%}} so that the renames wouldn't clutter the patch. I also cleaned up some whitespace changes I found in the original patch and removed a debug log line. This is what I ended up with. Is it in line with what you intended? I'm +1 for this one. TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar - Key: HBASE-9003 URL: https://issues.apache.org/jira/browse/HBASE-9003 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 2.0.0, 0.99.2 Attachments: HBASE-9003.v0.patch, HBASE-9003.v1.patch, HBASE-9003.v2.patch, HBASE-9003.v2.patch, HBASE-9003.v3.patch This is the problem: {{TableMapReduceUtil#addDependencyJars}} relies on {{org.apache.hadoop.util.JarFinder}}, if available, to call {{getJar()}}. However, {{getJar()}} uses File.createTempFile() to create a temporary file under {{hadoop.tmp.dir}}{{/target/test-dir}}. Due to HADOOP-9737, the created jar and its content are not purged after the JVM is destroyed. Since most configurations point {{hadoop.tmp.dir}} under {{/tmp}}, the generated jar files get purged by {{tmpwatch}} or a similar tool, but boxes that have {{hadoop.tmp.dir}} pointing to a different location not monitored by {{tmpwatch}} will pile up a collection of jars, causing all kinds of issues. Since {{JarFinder#getJar}} is not a public API in Hadoop (see [~tucu00]'s comment on HADOOP-9737), we shouldn't use it in {{TableMapReduceUtil}}, in order to avoid this kind of issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12219) Cache more efficiently getAll() and get() in FSTableDescriptors
[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191195#comment-14191195 ] Hadoop QA commented on HBASE-12219: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678360/HBASE-12219.v3.patch against trunk revision . ATTACHMENT ID: 12678360 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestCoprocessorHConnection {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hdfs.TestDecommission.testDecommission(TestDecommission.java:574) at org.apache.hadoop.hdfs.TestDecommission.testDecommissionFederation(TestDecommission.java:422) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11530//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//console This message is automatically generated. Cache more efficiently getAll() and get() in FSTableDescriptors --- Key: HBASE-12219 URL: https://issues.apache.org/jira/browse/HBASE-12219 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.24, 0.99.1, 0.98.6.1 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Labels: scalability Attachments: HBASE-12219-v1.patch, HBASE-12219-v1.patch, HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png Currently table descriptors and tables are cached once they are accessed for the first time. Subsequent calls to the master only require a trip to HDFS to look up the modification time, in order to reload the table descriptors if they changed. However, in clusters with a large number of tables or concurrent clients, this can be too aggressive toward HDFS and the master, causing contention when processing other requests. A simple solution is a TTL-based cache for
[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangduo updated HBASE-10201: - Attachment: HBASE-10201_5.patch rebase since master's HEAD has been moved. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
[ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191203#comment-14191203 ] stack commented on HBASE-10201: --- [~gaurav.menghani] Thank you for helping land this upstream and thanks for the update on its state at your shop. What about the recording of last-flushed-sequenceid at the master, so it knows which edits it can safely skip replaying for a region after a crash; would that only report '1' in our scenario above? Thanks. [~Apache9] Thanks for the new patch. I think I need to go through and check sequenceid accounting. Port 'Make flush decisions per column family' to trunk -- Key: HBASE-10201 URL: https://issues.apache.org/jira/browse/HBASE-10201 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: zhangduo Priority: Critical Fix For: 2.0.0, 0.99.2 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch Currently the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-11819: --- Reopened. Reverted from branch-1+. Failed here: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//testReport/org.apache.hadoop.hbase.coprocessor/TestCoprocessorHConnection/testHConnection/ Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191209#comment-14191209 ] stack commented on HBASE-9003: -- Good by you [~esteban]? TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar - Key: HBASE-9003 URL: https://issues.apache.org/jira/browse/HBASE-9003 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 2.0.0, 0.99.2 Attachments: HBASE-9003.v0.patch, HBASE-9003.v1.patch, HBASE-9003.v2.patch, HBASE-9003.v2.patch, HBASE-9003.v3.patch This is the problem: {{TableMapReduceUtil#addDependencyJars}} relies on {{org.apache.hadoop.util.JarFinder}} if available to call {{getJar()}}. However {{getJar()}} uses File.createTempFile() to create a temporary file under {{hadoop.tmp.dir}}{{/target/test-dir}}. Due to HADOOP-9737, the created jar and its contents are not purged after the JVM is destroyed. Since most configurations point {{hadoop.tmp.dir}} under {{/tmp}}, the generated jar files get purged by {{tmpwatch}} or a similar tool, but boxes that have {{hadoop.tmp.dir}} pointing to a different location not monitored by {{tmpwatch}} will pile up a collection of jars, causing all kinds of issues. Since {{JarFinder#getJar}} is not a public API in Hadoop (see [~tucu00]'s comment on HADOOP-9737), we shouldn't use it in {{TableMapReduceUtil}}, in order to avoid this kind of issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12388: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to branch-1+. Thanks [~busbey] Document that WALObservers don't get empty edits. - Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12389) Reduce the number of versions configured for the ACL table
[ https://issues.apache.org/jira/browse/HBASE-12389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191221#comment-14191221 ] Anoop Sam John commented on HBASE-12389: 1 should be fine ? Reduce the number of versions configured for the ACL table -- Key: HBASE-12389 URL: https://issues.apache.org/jira/browse/HBASE-12389 Project: HBase Issue Type: Improvement Components: security Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 We recently reduced the number of versions kept for entries in META from 10 to 3. This same arbitrary constant was used for the ACL table definition in AccessControlLists. We should change this to 3 also, or even 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-12389) Reduce the number of versions configured for the ACL table
[ https://issues.apache.org/jira/browse/HBASE-12389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191221#comment-14191221 ] Anoop Sam John edited comment on HBASE-12389 at 10/31/14 2:41 AM: -- +1 for versions=1 was (Author: anoop.hbase): 1 should be fine ? Reduce the number of versions configured for the ACL table -- Key: HBASE-12389 URL: https://issues.apache.org/jira/browse/HBASE-12389 Project: HBase Issue Type: Improvement Components: security Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.8, 0.99.2 We recently reduced the number of versions kept for entries in META from 10 to 3. This same arbitrary constant was used for the ACL table definition in AccessControlLists. We should change this to 3 also, or even 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191223#comment-14191223 ] Hudson commented on HBASE-11819: FAILURE: Integrated in HBase-1.0 #396 (See [https://builds.apache.org/job/HBase-1.0/396/]) HBASE-11819 Unit test for CoprocessorHConnection (Talat Uyarer) -- REVERT. Failed in a test run here: https://builds.apache.org/job/PreCommit-HBASE-Build/11530//testReport/ (stack: rev c1ec92adc996738b9a88dd414418ae15bf0f74d9) * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorHConnection.java Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12388) Document that WALObservers don't get empty edits.
[ https://issues.apache.org/jira/browse/HBASE-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191222#comment-14191222 ] Hudson commented on HBASE-12388: FAILURE: Integrated in HBase-1.0 #396 (See [https://builds.apache.org/job/HBase-1.0/396/]) HBASE-12388 Document behavior wrt coprocessors when wal gets empty waledits. (stack: rev f5e3b3005819ef5439682e1182e9145150884eb4) * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/WALObserver.java * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/SampleRegionWALObserver.java Document that WALObservers don't get empty edits. - Key: HBASE-12388 URL: https://issues.apache.org/jira/browse/HBASE-12388 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12388.1.patch.txt in branch-1+, WALObservers don't get any notice of WALEdits that return true for isEmpty(). Make sure this is noted in the docs. It was surprising while I was writing a test, and it's a different edge case than in 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191250#comment-14191250 ] stack commented on HBASE-11819: --- It failed here too: https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/395/ Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, HBASE-11819v3.patch Add a unit test to hbase-server that exercises CoprocessorHConnection . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12390) Change revision style from svn to git
Enis Soztutar created HBASE-12390: - Summary: Change revision style from svn to git Key: HBASE-12390 URL: https://issues.apache.org/jira/browse/HBASE-12390 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Minor Fix For: 2.0.0, 0.99.2 This was bothering me. We should change the {{-r revision_id}} style that is an svn thing. We can do: {code} 2.0.0-SNAPSHOT, revision=64b6109ce917a47e4fa4b88cdb800bcc7a228484 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12390) Change revision style from svn to git
[ https://issues.apache.org/jira/browse/HBASE-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12390: -- Attachment: hbase-12390_v1.patch Here is a patch which changes -r to revision in VersionInfo and web UIs. Change revision style from svn to git - Key: HBASE-12390 URL: https://issues.apache.org/jira/browse/HBASE-12390 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Minor Fix For: 2.0.0, 0.99.2 Attachments: hbase-12390_v1.patch This was bothering me. We should change the {{-r revision_id}} style that is an svn thing. We can do: {code} 2.0.0-SNAPSHOT, revision=64b6109ce917a47e4fa4b88cdb800bcc7a228484 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
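The format change proposed in the issue amounts to a one-line change in how the version line is assembled. A minimal illustration (class and method names here are invented; the real change is in VersionInfo and the web UIs):

```java
// Illustrative only: the git-style line proposed in HBASE-12390 replaces the
// svn-era "-r <revision_id>" flag with an explicit "revision=<hash>" key.
public final class VersionLineSketch {
  public static String format(String version, String gitHash) {
    return version + ", revision=" + gitHash;
  }
}
```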
[jira] [Updated] (HBASE-12390) Change revision style from svn to git
[ https://issues.apache.org/jira/browse/HBASE-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-12390: -- Status: Patch Available (was: Open) Change revision style from svn to git - Key: HBASE-12390 URL: https://issues.apache.org/jira/browse/HBASE-12390 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Minor Fix For: 2.0.0, 0.99.2 Attachments: hbase-12390_v1.patch This was bothering me. We should change the {{-r revision_id}} style that is an svn thing. We can do: {code} 2.0.0-SNAPSHOT, revision=64b6109ce917a47e4fa4b88cdb800bcc7a228484 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9003) TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar
[ https://issues.apache.org/jira/browse/HBASE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191254#comment-14191254 ] Esteban Gutierrez commented on HBASE-9003: -- That works [~stack]. And yes, the tempJar.deleteOnExit() is mainly what we need to clean up the temp files. Thanks [~ndimiduk]! TableMapReduceUtil should not rely on org.apache.hadoop.util.JarFinder#getJar - Key: HBASE-9003 URL: https://issues.apache.org/jira/browse/HBASE-9003 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 2.0.0, 0.99.2 Attachments: HBASE-9003.v0.patch, HBASE-9003.v1.patch, HBASE-9003.v2.patch, HBASE-9003.v2.patch, HBASE-9003.v3.patch This is the problem: {{TableMapReduceUtil#addDependencyJars}} relies on {{org.apache.hadoop.util.JarFinder}} if available to call {{getJar()}}. However {{getJar()}} uses File.createTempFile() to create a temporary file under {{hadoop.tmp.dir}}{{/target/test-dir}}. Due to HADOOP-9737, the created jar and its contents are not purged after the JVM is destroyed. Since most configurations point {{hadoop.tmp.dir}} under {{/tmp}}, the generated jar files get purged by {{tmpwatch}} or a similar tool, but boxes that have {{hadoop.tmp.dir}} pointing to a different location not monitored by {{tmpwatch}} will pile up a collection of jars, causing all kinds of issues. Since {{JarFinder#getJar}} is not a public API in Hadoop (see [~tucu00]'s comment on HADOOP-9737), we shouldn't use it in {{TableMapReduceUtil}}, in order to avoid this kind of issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
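The deleteOnExit() cleanup discussed in the comment above can be sketched as follows. This is a hypothetical helper written for illustration only; the actual HBASE-9003 change lives in TableMapReduceUtil and the attached patches.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Sketch of the cleanup idea (hypothetical helper; not the committed patch):
// build the dependency jar via File.createTempFile() and register it with
// deleteOnExit() so a normally terminating JVM removes it, instead of
// leaving orphaned jars to accumulate under hadoop.tmp.dir on hosts where
// no tmpwatch-style tool runs.
public final class TempJarSketch {

  public static File createSelfCleaningJar(String prefix) {
    try {
      File tempJar = File.createTempFile(prefix, ".jar");
      tempJar.deleteOnExit();  // purged on normal JVM shutdown
      try (JarOutputStream jos = new JarOutputStream(new FileOutputStream(tempJar))) {
        jos.putNextEntry(new JarEntry("marker.txt"));  // placeholder entry
        jos.write("placeholder".getBytes(StandardCharsets.UTF_8));
        jos.closeEntry();
      }
      return tempJar;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Note that deleteOnExit() only fires on orderly shutdown; a killed JVM still leaves the file behind, which is why the temp-dir location still matters.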