[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862853#comment-13862853 ] Aditya Kishore commented on HBASE-10272: [~lhofhansl], this should go into the next 0.94 release. Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes it with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them has the same effect. Once this happens, the cluster remains nonoperational until the node with the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry in the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
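The failure mode described in HBASE-10272 can be illustrated with a minimal sketch. This is not the attached patch and not HBase's real client code: `FailedServerSketch`, its `connect` and `verifyLocation` methods, and the expiry parameter are hypothetical names chosen for illustration. It models a cache that remembers a failed address for a while, and a caller that treats the cached-failure exception the same way as the original network error instead of letting it propagate as an unhandled exception.

```java
import java.net.NoRouteToHostException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a client that caches connection failures.
class FailedServerSketch {
    // Illustrative exception; the real one is HBaseClient.FailedServerException.
    static class FailedServerException extends java.io.IOException {
        FailedServerException(String message) { super(message); }
    }

    // Address -> time (in millis) until which the address counts as failed.
    private final Map<String, Long> failedUntil = new HashMap<>();
    private final long expiryMillis;

    FailedServerSketch(long expiryMillis) { this.expiryMillis = expiryMillis; }

    // First failure: a network error, and the address is cached as failed.
    // Attempts inside the expiry window fail fast with FailedServerException.
    void connect(String address, boolean reachable, long nowMillis)
            throws FailedServerException, NoRouteToHostException {
        Long until = failedUntil.get(address);
        if (until != null && nowMillis < until) {
            throw new FailedServerException(
                "This server is in the failed servers list: " + address);
        }
        failedUntil.remove(address);
        if (!reachable) {
            failedUntil.put(address, nowMillis + expiryMillis);
            throw new NoRouteToHostException(address);
        }
    }

    // A caller that treats both exceptions as "location not verified"
    // rather than as an unhandled error.
    boolean verifyLocation(String address, boolean reachable, long nowMillis) {
        try {
            connect(address, reachable, nowMillis);
            return true;
        } catch (FailedServerException | NoRouteToHostException e) {
            return false;
        }
    }
}
```

In this sketch the second call inside the expiry window returns false instead of aborting the caller, which is the behavior the report says the new Master lacks.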
[jira] [Updated] (HBASE-9846) Integration test and LoadTestTool support for cell ACLs
[ https://issues.apache.org/jira/browse/HBASE-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-9846: -- Status: Patch Available (was: Open) Integration test and LoadTestTool support for cell ACLs --- Key: HBASE-9846 URL: https://issues.apache.org/jira/browse/HBASE-9846 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9846.patch, HBASE-9846_1.patch, HBASE-9846_2.patch Cell level ACLs should have an integration test and LoadTestTool support.
[jira] [Updated] (HBASE-9846) Integration test and LoadTestTool support for cell ACLs
[ https://issues.apache.org/jira/browse/HBASE-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-9846: -- Status: Open (was: Patch Available) Integration test and LoadTestTool support for cell ACLs --- Key: HBASE-9846 URL: https://issues.apache.org/jira/browse/HBASE-9846 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9846.patch, HBASE-9846_1.patch, HBASE-9846_2.patch Cell level ACLs should have an integration test and LoadTestTool support.
[jira] [Updated] (HBASE-9846) Integration test and LoadTestTool support for cell ACLs
[ https://issues.apache.org/jira/browse/HBASE-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-9846: -- Attachment: HBASE-9846_2.patch Latest patch. It tries to create subclasses for all the reader/writer/updater threads; these create table objects and perform the mutations and the gets as that specific user. Integration test and LoadTestTool support for cell ACLs --- Key: HBASE-9846 URL: https://issues.apache.org/jira/browse/HBASE-9846 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9846.patch, HBASE-9846_1.patch, HBASE-9846_2.patch Cell level ACLs should have an integration test and LoadTestTool support.
[jira] [Commented] (HBASE-9846) Integration test and LoadTestTool support for cell ACLs
[ https://issues.apache.org/jira/browse/HBASE-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862943#comment-13862943 ] ramkrishna.s.vasudevan commented on HBASE-9846: --- https://reviews.apache.org/r/16646/ - RB link. Integration test and LoadTestTool support for cell ACLs --- Key: HBASE-9846 URL: https://issues.apache.org/jira/browse/HBASE-9846 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9846.patch, HBASE-9846_1.patch, HBASE-9846_2.patch Cell level ACLs should have an integration test and LoadTestTool support.
[jira] [Commented] (HBASE-9846) Integration test and LoadTestTool support for cell ACLs
[ https://issues.apache.org/jira/browse/HBASE-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862987#comment-13862987 ] Hadoop QA commented on HBASE-9846: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621590/HBASE-9846_2.patch against trunk revision . ATTACHMENT ID: 12621590 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 37 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 4 release audit warnings (more than the trunk's current 0 warnings). {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: + Batch.CallAccessControlService, GrantResponse callable = new Batch.CallAccessControlService, GrantResponse() { + AccessControlProtos.TablePermission.Builder permissionBuilder = AccessControlProtos.TablePermission +AccessControlProtos.Permission.Action.READ, AccessControlProtos.Permission.Action.WRITE }; + AccessControlClient.grant(conf, table, userOwner.getShortName(), COLUMN_FAMILY, null, actions); +updaterThreads = new MultiThreadedUpdaterWithACL(dataGen, conf, tableName, updatePercent, userOwner); +LOG.error(Failed to mutate: + keyBase + after + (System.currentTimeMillis() - start) + +private void recordFailure(final Mutation m, final long keyBase, final long start, IOException e) { +LOG.error(Failed to insert: + keyBase + after + (System.currentTimeMillis() - start) + {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8346//console This message is automatically generated. Integration test and LoadTestTool support for cell ACLs --- Key: HBASE-9846 URL: https://issues.apache.org/jira/browse/HBASE-9846 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9846.patch, HBASE-9846_1.patch, HBASE-9846_2.patch Cell level ACLs should have an integration test and LoadTestTool support. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HBASE-10278) Provide better write predictability
[ https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha reassigned HBASE-10278: --- Assignee: Himanshu Vashishtha Provide better write predictability --- Key: HBASE-10278 URL: https://issues.apache.org/jira/browse/HBASE-10278 Project: HBase Issue Type: New Feature Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Attachments: Multiwaldesigndoc.pdf Currently, HBase has one WAL per region server. Whenever there is any latency in the write pipeline (for example from a network blip or a node in the pipeline having a bad disk), the overall write latency suffers. Jonathan Hsieh and I analyzed various approaches to tackle this issue. We also looked at HBASE-5699, which talks about adding concurrent multi WALs. Along with performance numbers, we also focused on design simplicity, minimum impact on MTTR and Replication, and compatibility with 0.96 and 0.98. Considering all these parameters, we propose a new HLog implementation with WAL Switching functionality. Please find attached the design doc for the same. It introduces the WAL Switching feature and presents experiments and results from a prototype implementation, showing the benefits of this feature. The second goal of this work is to serve as a building block for the concurrent multiple WALs feature. Please review the doc.
[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error
[ https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863109#comment-13863109 ] Sergey Shelukhin commented on HBASE-10210: -- So far, in trunk and 98. [~stack] ping? during master startup, RS can be you-are-dead-ed by master in error --- Key: HBASE-10210 URL: https://issues.apache.org/jira/browse/HBASE-10210 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.98.0, 0.99.0 Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, HBASE-10210.patch Not sure of the root cause yet; I am at the "how did this ever work" stage. We see this problem in 0.96.1, but didn't in 0.96.0 + some patches. It looks like RS information arriving from two sources, ZK and the server itself, can conflict. Master doesn't handle such cases (timestamp match), and anyway technically timestamps can collide for two separate servers. So, master YouAreDead-s the already-recorded reporting RS, and adds it too. Then it discovers that the new server has died with a fatal error! Note the threads. Addition is called from master initialization and from RPC. {noformat} 2013-12-19 11:16:45,290 INFO [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: Finished waiting for region servers count to settle; checked in 2, slept for 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
2013-12-19 11:16:45,290 INFO [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: Registering server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 2013-12-19 11:16:45,290 INFO [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered server found up in zk but who has not yet reported in: h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 2013-12-19 11:16:45,380 INFO [RpcServer.handler=4,port=6] master.ServerManager: Triggering server recovery; existingServer h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 looks stale, new server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 2013-12-19 11:16:45,380 INFO [RpcServer.handler=4,port=6] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 ... 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] master.HMaster: Region server h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 reported a fatal error: ABORTING region server h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as dead server {noformat} Presumably some of the recent ZK listener related changes b -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863166#comment-13863166 ] Jeffrey Zhong commented on HBASE-10130: --- {code} int version1 = ZKUtil.checkExists(zkw, tasknode); assertTrue("version1=" + version1 + ", version=" + version, version1 > version); {code} The patch looks good to me, though I'd prefer to remove the above two lines because 1) there is a race condition on getting the original version, and 2) when the following verification is true: {code} assertEquals(tot_mgr_resubmit.get(), 1); {code} the znode version is also verified to have been bumped up. TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10130-output.txt, 10130-v1.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log similar to the following (corresponding to waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} Meaning, the version (2) retrieved corresponded to the resubmitted task. version1 retrieved the same value, leading to the assertion failure.
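The race discussed in HBASE-10130 can be reproduced in miniature. The sketch below is a single-threaded simulation under stated assumptions, not the HBase test: `VersionRaceSketch` and `Task` are hypothetical names, and `Task.resubmit()` stands in for the split log manager bumping the task znode version and the `tot_mgr_resubmit` counter together. It compares the version-based check with the counter-based check for both orderings of "resubmit" and "baseline read".

```java
// Hypothetical simulation of the ordering problem; not HBase code.
class VersionRaceSketch {
    // Stand-in for the task znode version plus the resubmit counter.
    static class Task {
        int version = 0;
        int resubmits = 0;
        void resubmit() { version++; resubmits++; }
    }

    // Mirrors "assertTrue(version1 > version)": passes only if the baseline
    // version was read before the resubmit happened.
    static boolean versionCheckPasses(boolean resubmitBeforeBaseline) {
        Task task = new Task();
        if (resubmitBeforeBaseline) task.resubmit();
        int version = task.version;                 // baseline read
        if (task.resubmits == 0) task.resubmit();   // stands in for waitForCounter
        int version1 = task.version;
        return version1 > version;
    }

    // Mirrors "assertEquals(tot_mgr_resubmit.get(), 1)": independent of
    // when the baseline version was read.
    static boolean counterCheckPasses(boolean resubmitBeforeBaseline) {
        Task task = new Task();
        if (resubmitBeforeBaseline) task.resubmit();
        if (task.resubmits == 0) task.resubmit();
        return task.resubmits == 1;
    }
}
```

In this model the counter check holds for either ordering, while the version comparison fails when the resubmit lands before the baseline read, which matches the reasoning for dropping the version assertion.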
[jira] [Updated] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10130: --- Attachment: 10130-v2.txt Patch v2 addresses Jeff's comment. TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log similar to the following (corresponding to waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} Meaning, the version (2) retrieved corresponded to resubmitted task. version1 retrieved same value, leading to assertion failure.
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863184#comment-13863184 ] Hadoop QA commented on HBASE-10078: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621634/0.94-10078_v2.patch against trunk revision . ATTACHMENT ID: 12621634 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8347//console This message is automatically generated. Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 
{noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. I
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-10078: Attachment: 0.94-10078_v2.patch Minor change to the 0.94 patch Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf); filters.add(filter); } } } {code} HbaseObjectWritable#readObject uses a conf (created by calling
[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block
[ https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863181#comment-13863181 ] Nick Dimiduk commented on HBASE-10263: -- Evictions happen on a background thread. Filling the cache and then immediately checking the eviction count results in a race between the current thread and the eviction thread; thus this is very likely a flaky test on our over-extended build machines. {noformat} +// 5th single block +cache.cacheBlock(singleBlocks[4].cacheKey, singleBlocks[4]); +expectedCacheSize += singleBlocks[4].cacheBlockHeapSize(); +// Do not expect any evictions yet +assertEquals(0, cache.getEvictionCount()); +// Verify cache size +assertEquals(expectedCacheSize, cache.heapSize()); {noformat} In the above block, the call to cacheBlock() will only notify the eviction thread, not force eviction. A yield or short sleep should be inserted before the call to getEvictionCount() to reduce the chance of hitting the race condition. Repeat for all the following stanzas. make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block -- Key: HBASE-10263 URL: https://issues.apache.org/jira/browse/HBASE-10263 Project: HBase Issue Type: Improvement Components: io Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch Currently the single/multi/in-memory ratio in LruBlockCache is hardcoded to 1:2:1, which can lead to counter-intuitive behavior in some scenarios: an in-memory table's read performance can be much worse than an ordinary table's when the two tables' data sizes are almost equal and larger than the regionserver's cache size (we ran such an experiment and verified that the in-memory table's random read performance was two times worse than the ordinary table's). This patch fixes the above issue and provides: 1. A user-configurable single/multi/in-memory ratio. 2. 
A configurable switch that makes in-memory blocks preemptive: when the switch is on, an in-memory block can evict any ordinary block to make room until no ordinary blocks remain; when the switch is off (the default), the behavior is the same as before, using the single/multi/in-memory ratio to determine eviction. By default, both changes are off and the behavior stays the same as before applying this patch. It is the client's/user's choice whether to use either behavior by enabling one of these two enhancements.
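For test stanzas that expect a specific eviction count, one alternative to a single yield or sleep is a bounded poll. The sketch below swaps in that technique; it is not what the review comment or the patch prescribes. `EvictionWaitSketch` and `waitForCount` are hypothetical names, and the 10 ms poll interval is an arbitrary assumption.

```java
import java.util.function.LongSupplier;

// Hypothetical test helper; not part of HBase or of the attached patch.
class EvictionWaitSketch {
    // Polls until the supplied counter equals the expected value or the
    // timeout elapses. Returns true if the expected value was observed.
    static boolean waitForCount(LongSupplier counter, long expected, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (counter.getAsLong() != expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the expected count
            }
            try {
                Thread.sleep(10); // give the background thread a chance to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test could then assert on `waitForCount(cache::getEvictionCount, 1, 1000)`, assuming `getEvictionCount()` returns a long. A poll cannot prove that no eviction will happen, so the "do not expect any evictions yet" checks would still rely on the short sleep suggested in the comment.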
[jira] [Commented] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863180#comment-13863180 ] Jeffrey Zhong commented on HBASE-10130: --- +1. Thanks! TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log similar to the following (corresponding to waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} Meaning, the version (2) retrieved corresponded to resubmitted task. version1 retrieved same value, leading to assertion failure.
[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863189#comment-13863189 ] Lars Hofhansl commented on HBASE-10272: --- +1 for 0.94. [~stack], assume you want this in 0.96. Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND the ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought online (or the Zookeeper node containing the root region server and/or the META entry from the ROOT table is deleted).
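The failure sequence in the description above can be modeled in a few lines. The sketch below is a simplified illustration, not the HBase client or the attached patch: FailureCachingClient, CachedFailureException (a local stand-in for FailedServerException) and MetaLocationCheck are hypothetical names, and the simulated host is always unreachable. It shows why a caller that handles only the first exception type breaks on the retry, and one way a caller might treat both as "server down".

```java
import java.io.IOException;
import java.net.NoRouteToHostException;
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the cached-failure exception; not the HBase class.
class CachedFailureException extends IOException {
    CachedFailureException(String message) {
        super(message);
    }
}

// Hypothetical client that remembers recent connection failures per address.
class FailureCachingClient {
    private final Map<String, Long> failedUntil = new HashMap<String, Long>();
    private final long ttlMs;

    FailureCachingClient(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Simulated connect: the host is always unreachable in this sketch.
    void connect(String address, long nowMs) throws IOException {
        Long until = failedUntil.get(address);
        if (until != null && nowMs < until) {
            // Second and later attempts inside the TTL fail fast with a different type.
            throw new CachedFailureException("in the failed servers list: " + address);
        }
        failedUntil.put(address, nowMs + ttlMs);
        throw new NoRouteToHostException(address);
    }
}

class MetaLocationCheck {
    // Returns false for either failure type, so a retry cannot escape as an
    // unhandled exception; anything else is still surfaced.
    static boolean verify(FailureCachingClient client, String address, long nowMs) {
        try {
            client.connect(address, nowMs);
            return true;
        } catch (NoRouteToHostException e) {
            return false;
        } catch (CachedFailureException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("unexpected connection failure", e);
        }
    }
}
```

If the CachedFailureException branch is removed, the second call to verify() within the TTL ends in the generic handler, which mirrors the abort shown in the log.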
[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block
[ https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863192#comment-13863192 ] Nick Dimiduk commented on HBASE-10263: -- One other stray comment. Your config names should all be modified to reflect that these options are specific to the lru cache. So hbase.rs.inmemoryforcemode, hbase.blockcache.single.percentage, hbase.blockcache.multi.percentage, hbase.blockcache.memory.percentage should all share the common prefix of hbase.lru.blockcache. This follows the precedent established by the existing factor configs. make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block -- Key: HBASE-10263 URL: https://issues.apache.org/jira/browse/HBASE-10263 Project: HBase Issue Type: Improvement Components: io Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 1:2:1, which can lead to somewhat counter-intuition behavior for some user scenario where in-memory table's read performance is much worse than ordinary table when two tables' data size is almost equal and larger than regionserver's cache size (we ever did some such experiment and verified that in-memory table random read performance is two times worse than ordinary table). this patch fixes above issue and provides: 1. make single/multi/in-memory ratio user-configurable 2. provide a configurable switch which can make in-memory block preemptive, by preemptive means when this switch is on in-memory block can kick out any ordinary block to make room until no ordinary block, when this switch is off (by default) the behavior is the same as previous, using single/multi/in-memory ratio to determine evicting. by default, above two changes are both off and the behavior keeps the same as before applying this patch. 
It is the user's choice whether to enable either of these two enhancements.
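As an illustration of the naming suggestion in the comment above, the sketch below groups the ratio options under the common hbase.lru.blockcache. prefix. The full key names are assumptions derived from that suggestion, not confirmed names from the patch; LruCacheRatios is a hypothetical class; the defaults 0.25/0.50/0.25 simply restate the hardcoded 1:2:1 ratio from the issue description; and the sum-to-one check is an assumption added for the example.

```java
import java.util.Properties;

// Hypothetical holder for the three LRU cache ratios.
class LruCacheRatios {
    // Assumed key names: the existing option names moved under a shared prefix.
    static final String PREFIX = "hbase.lru.blockcache.";
    static final String SINGLE_KEY = PREFIX + "single.percentage";
    static final String MULTI_KEY = PREFIX + "multi.percentage";
    static final String MEMORY_KEY = PREFIX + "memory.percentage";

    final float single;
    final float multi;
    final float memory;

    private LruCacheRatios(float single, float multi, float memory) {
        this.single = single;
        this.multi = multi;
        this.memory = memory;
    }

    // Reads the ratios, falling back to 1:2:1 when a key is absent.
    static LruCacheRatios fromProperties(Properties props) {
        float single = Float.parseFloat(props.getProperty(SINGLE_KEY, "0.25"));
        float multi = Float.parseFloat(props.getProperty(MULTI_KEY, "0.50"));
        float memory = Float.parseFloat(props.getProperty(MEMORY_KEY, "0.25"));
        // Assumption for this sketch: the three shares must add up to 1.
        if (Math.abs(single + multi + memory - 1.0f) > 1e-4f) {
            throw new IllegalArgumentException("single/multi/memory ratios must sum to 1");
        }
        return new LruCacheRatios(single, multi, memory);
    }
}
```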
[jira] [Updated] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10272: -- Fix Version/s: 0.94.16 Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND the ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought online (or the Zookeeper node containing the root region server and/or the META entry from the ROOT table is deleted).
[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863194#comment-13863194 ] Lars Hofhansl commented on HBASE-10272: --- Committed to 0.94. Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND the ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought online (or the Zookeeper node containing the root region server and/or the META entry from the ROOT table is deleted).
[jira] [Resolved] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-10284. --- Resolution: Fixed Committed to all branches. Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
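The issue above does not say how saveVersion.sh parses the {{svn info}} output, so the following is only a guess at the kind of breakage an extra "Relative URL:" line can cause. SvnInfoUrl and both of its methods are hypothetical, and the sample text in the usage note is made up. The sketch contrasts an unanchored match on "URL: ", which also picks up the new line, with a match anchored at the start of the line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for pulling the repository URL out of `svn info` text.
class SvnInfoUrl {
    // Unanchored match: any line containing "URL: " contributes a value,
    // so a "Relative URL: ..." line yields a second, unwanted result.
    static List<String> naiveUrls(String svnInfo) {
        List<String> urls = new ArrayList<String>();
        for (String line : svnInfo.split("\n")) {
            int idx = line.indexOf("URL: ");
            if (idx >= 0) {
                urls.add(line.substring(idx + "URL: ".length()));
            }
        }
        return urls;
    }

    // Anchored match: only a line that starts with "URL: " is accepted.
    static String anchoredUrl(String svnInfo) {
        for (String line : svnInfo.split("\n")) {
            if (line.startsWith("URL: ")) {
                return line.substring("URL: ".length());
            }
        }
        return null;
    }
}
```

With input such as "URL: https://svn.example.org/repo/trunk" followed by "Relative URL: ^/trunk", naiveUrls returns two entries while anchoredUrl returns only the first.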
[jira] [Updated] (HBASE-10174) Back port HBASE-9667 'NullOutputStream removed from Guava 15' to 0.94
[ https://issues.apache.org/jira/browse/HBASE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10174: -- Fix Version/s: (was: 0.94.16) 0.94.17 Back port HBASE-9667 'NullOutputStream removed from Guava 15' to 0.94 - Key: HBASE-10174 URL: https://issues.apache.org/jira/browse/HBASE-10174 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.94.17 Attachments: 10174-v2.txt, 10174-v3.txt, 9667-0.94.patch On user mailing list under the thread 'Guava 15', Kristoffer Sjögren reported NoClassDefFoundError when he used Guava 15. The issue has been fixed in 0.96 + by HBASE-9667 This JIRA ports the fix to 0.94 branch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-10273. --- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.94. Thanks [~fenghh]. AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
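The invariant stated in the issue above (the two maps change together, under a lock on the regions map) can be sketched as follows. This is an illustration under simplifying assumptions, not the AssignmentManager code: AssignmentIndex is a hypothetical class, plain strings stand in for the region and server types, and isConsistent() exists only to make the invariant checkable.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical pair of indexes: region -> server and server -> regions.
class AssignmentIndex {
    private final Map<String, String> regions = new HashMap<String, String>();
    private final Map<String, Set<String>> servers = new HashMap<String, Set<String>>();

    // Every mutation touches both maps inside one critical section on `regions`.
    void assign(String region, String server) {
        synchronized (regions) {
            String previous = regions.put(region, server);
            if (previous != null) {
                removeFromServer(previous, region);
            }
            Set<String> hosted = servers.get(server);
            if (hosted == null) {
                hosted = new HashSet<String>();
                servers.put(server, hosted);
            }
            hosted.add(region);
        }
    }

    void unassign(String region) {
        synchronized (regions) {
            String previous = regions.remove(region);
            if (previous != null) {
                removeFromServer(previous, region);
            }
        }
    }

    String serverOf(String region) {
        synchronized (regions) {
            return regions.get(region);
        }
    }

    // Returns a copy so callers never see the live set outside the lock.
    Set<String> regionsOn(String server) {
        synchronized (regions) {
            Set<String> hosted = servers.get(server);
            return hosted == null ? new HashSet<String>() : new HashSet<String>(hosted);
        }
    }

    // True when every entry in `servers` is mirrored in `regions` and vice versa.
    boolean isConsistent() {
        synchronized (regions) {
            int indexed = 0;
            for (Map.Entry<String, Set<String>> entry : servers.entrySet()) {
                for (String region : entry.getValue()) {
                    if (!entry.getKey().equals(regions.get(region))) {
                        return false;
                    }
                    indexed++;
                }
            }
            return indexed == regions.size();
        }
    }

    // Caller must hold the lock on `regions`.
    private void removeFromServer(String server, String region) {
        Set<String> hosted = servers.get(server);
        if (hosted != null) {
            hosted.remove(region);
            if (hosted.isEmpty()) {
                servers.remove(server);
            }
        }
    }
}
```

Updating only one of the two maps in any code path, or updating them under different locks, is exactly what would let isConsistent() return false.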
[jira] [Updated] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem with each other
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10273: -- Summary: AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem with each other (was: AssignmentManager.regions(region to regionserver assignment map) and AssignmentManager.servers(regionserver to regions assignment map) are not always updated in tandem with each other) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem with each other Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10273: -- Summary: AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem (was: AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem with each other) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10271) [regression] Cannot use the wildcard address since HBASE-9593
[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863207#comment-13863207 ] Lars Hofhansl commented on HBASE-10271: --- Lemme revert HBASE-9593 from 0.94. (I would like to spin a new RC in a week or so because of HBASE-8912.) [regression] Cannot use the wildcard address since HBASE-9593 - Key: HBASE-10271 URL: https://issues.apache.org/jira/browse/HBASE-10271 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.94.13, 0.96.1 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10271.patch HBASE-9593 moved the creation of the ephemeral znode earlier in the region server startup process such that we don't have access to the ServerName from the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName() which at that point will return this.isa.getHostName(). If you set hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that address. What happens next is that the RS will report for duty correctly but the master will do this: {noformat} 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014 {noformat} The cluster is then unusable. I think a better solution is to track the heartbeats for the region servers and expire those that haven't checked-in for some time. The 0.89-fb branch has this concept, and they also use it to detect rack failures: https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224. In this jira's scope I would just add the heartbeat tracking and add a unit test for the wildcard address. What do you think [~rajesh23]? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
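The heartbeat-tracking idea proposed in the description above (expire region servers that have not checked in for some time) might look roughly like the sketch below. It is not taken from the 0.89-fb ServerManager: HeartbeatTracker and its methods are hypothetical, servers are identified by plain strings, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tracker that records the last check-in time per server.
class HeartbeatTracker {
    private final Map<String, Long> lastSeen = new HashMap<String, Long>();
    private final long timeoutMs;

    HeartbeatTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called whenever a server reports in.
    synchronized void heartbeat(String server, long nowMs) {
        lastSeen.put(server, nowMs);
    }

    synchronized boolean isTracked(String server) {
        return lastSeen.containsKey(server);
    }

    // Removes and returns every server whose last check-in is older than the timeout.
    synchronized List<String> expire(long nowMs) {
        List<String> expired = new ArrayList<String>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > timeoutMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

A server registered from a znode but never heard from (such as the wildcard-address entry in the log above) would simply age out on the next expire() pass, provided it was recorded with a timestamp at registration.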
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863208#comment-13863208 ] Jimmy Xiang commented on HBASE-10078: - [~lhofhansl], could you take a look the patch? Thanks. Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields
{code:title=FilterList.java|borderStyle=solid}
public void readFields(final DataInput in) throws IOException {
  byte opByte = in.readByte();
  operator = Operator.values()[opByte];
  int size = in.readInt();
  if (size > 0) {
    filters = new ArrayList<Filter>(size);
    for (int i = 0; i < size; i++) {
      Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf);
      filters.add(filter);
    }
  }
}
{code}
HbaseObjectWritable#readObject uses a
[jira] [Commented] (HBASE-9117) Remove HTablePool and all HConnection pooling related APIs
[ https://issues.apache.org/jira/browse/HBASE-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863225#comment-13863225 ] Ted Yu commented on HBASE-9117: --- @Nick: This was the real error compiling against hadoop 1.0 : {code} [ERROR] /Users/tyu/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ReflectionUtils.java:[23,20] package java.nio.file does not exist [INFO] 1 error {code} Remove HTablePool and all HConnection pooling related APIs -- Key: HBASE-9117 URL: https://issues.apache.org/jira/browse/HBASE-9117 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-9117.00.patch, HBASE-9117.01.patch, HBASE-9117.02.patch, HBASE-9117.03.patch The recommended way is now: # Create an HConnection: HConnectionManager.createConnection(...) # Create a light HTable: HConnection.getTable(...) # table.close() # connection.close() All other API and pooling will be removed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863243#comment-13863243 ] Lars Hofhansl commented on HBASE-10078: --- Looking... Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields
{code:title=FilterList.java|borderStyle=solid}
public void readFields(final DataInput in) throws IOException {
  byte opByte = in.readByte();
  operator = Operator.values()[opByte];
  int size = in.readInt();
  if (size > 0) {
    filters = new ArrayList<Filter>(size);
    for (int i = 0; i < size; i++) {
      Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf);
      filters.add(filter);
    }
  }
}
{code}
HbaseObjectWritable#readObject uses a conf (created by calling
[jira] [Commented] (HBASE-9117) Remove HTablePool and all HConnection pooling related APIs
[ https://issues.apache.org/jira/browse/HBASE-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863272#comment-13863272 ] Nick Dimiduk commented on HBASE-9117: - Looks like a Java7 API. Let me correct the patch. Thanks for the pointer, [~yuzhih...@gmail.com]. Remove HTablePool and all HConnection pooling related APIs -- Key: HBASE-9117 URL: https://issues.apache.org/jira/browse/HBASE-9117 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-9117.00.patch, HBASE-9117.01.patch, HBASE-9117.02.patch, HBASE-9117.03.patch The recommended way is now: # Create an HConnection: HConnectionManager.createConnection(...) # Create a light HTable: HConnection.getTable(...) # table.close() # connection.close() All other API and pooling will be removed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863299#comment-13863299 ] Hudson commented on HBASE-10273: FAILURE: Integrated in HBase-0.94-security #379 (See [https://builds.apache.org/job/HBase-0.94-security/379/]) HBASE-10273 AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem (Feng Honghua) (larsh: rev 1555966) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
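To illustrate the invariant described above, here is a minimal sketch of two maps that are only ever mutated together while holding a lock on the regions map. AssignmentMapsSketch and its method names are hypothetical and are not taken from AssignmentManager or from the attached patches.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the two assignment maps.
class AssignmentMapsSketch {
    // region name -> server currently hosting it
    private final Map<String, String> regions = new HashMap<String, String>();
    // server name -> regions it hosts
    private final Map<String, Set<String>> servers = new HashMap<String, Set<String>>();

    void assign(String region, String server) {
        synchronized (regions) { // one lock guards both maps
            String old = regions.put(region, server);
            if (old != null) {
                removeFromServer(old, region);
            }
            Set<String> hosted = servers.get(server);
            if (hosted == null) {
                hosted = new HashSet<String>();
                servers.put(server, hosted);
            }
            hosted.add(region);
        }
    }

    void unassign(String region) {
        synchronized (regions) {
            String old = regions.remove(region);
            if (old != null) {
                removeFromServer(old, region);
            }
        }
    }

    // Caller must already hold the lock on regions.
    private void removeFromServer(String server, String region) {
        Set<String> hosted = servers.get(server);
        if (hosted != null) {
            hosted.remove(region);
            if (hosted.isEmpty()) {
                servers.remove(server);
            }
        }
    }

    String serverOf(String region) {
        synchronized (regions) {
            return regions.get(region);
        }
    }

    // True when every (server, region) pair appears in both maps.
    boolean isConsistent() {
        synchronized (regions) {
            int pairs = 0;
            for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
                for (String region : e.getValue()) {
                    if (!e.getKey().equals(regions.get(region))) {
                        return false;
                    }
                    pairs++;
                }
            }
            return pairs == regions.size();
        }
    }
}
```

Any code path that touches only one of the two maps, or touches them outside the synchronized block, is the kind of break in the protocol the issue describes.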
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863298#comment-13863298 ] Hudson commented on HBASE-10284: FAILURE: Integrated in HBase-0.94-security #379 (See [https://builds.apache.org/job/HBase-0.94-security/379/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555961) * /hbase/branches/0.94/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
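The message does not show how saveVersion.sh parses the output, so the following is only a sketch of the general pitfall, assuming the URL is picked out by matching the substring "URL:". With svn 1.8 that substring matches two lines. Anchoring the match at the start of the line avoids the ambiguity. The class name and the sample output are made up for illustration.

```java
// Hypothetical helper; not part of HBase or its build scripts.
class SvnInfoUrlSketch {
    // Naive approach: counts every line that merely contains "URL:".
    static int countSubstringMatches(String svnInfoOutput) {
        int n = 0;
        for (String line : svnInfoOutput.split("\n")) {
            if (line.contains("URL:")) {
                n++;
            }
        }
        return n;
    }

    // Anchored approach: only a line that starts with "URL: " is accepted.
    static String extractUrl(String svnInfoOutput) {
        for (String line : svnInfoOutput.split("\n")) {
            if (line.startsWith("URL: ")) {
                return line.substring("URL: ".length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String sample =
            "Path: .\n"
            + "URL: https://svn.example.org/repos/project/branches/0.94\n"
            + "Relative URL: ^/project/branches/0.94\n"
            + "Revision: 1555961\n";
        System.out.println(countSubstringMatches(sample)); // two lines contain "URL:"
        System.out.println(extractUrl(sample));
    }
}
```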
[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863300#comment-13863300 ] Hudson commented on HBASE-10272: FAILURE: Integrated in HBase-0.94-security #379 (See [https://builds.apache.org/job/HBase-0.94-security/379/]) HBASE-10272 Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline (Aditya Kishore) (larsh: rev 1555960) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry from the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
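The failure sequence described above can be reproduced in miniature. The sketch below is not the committed patch and not the real HBaseClient: FailedServerCacheSketch, its expiry parameter, and isServerDown are hypothetical names used to show how a cached failure turns the second connection attempt into a different exception type, and how a caller might treat both exceptions as "server is down" instead of aborting.

```java
import java.io.IOException;
import java.net.NoRouteToHostException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of a client-side failed-server cache.
class FailedServerCacheSketch {
    // Stand-in for the exception named in the issue.
    static class FailedServerException extends IOException {
        FailedServerException(String msg) { super(msg); }
    }

    // address -> time (ms) until which the server is considered failed
    private final Map<String, Long> failedUntil = new HashMap<String, Long>();
    private final long expiryMillis;

    FailedServerCacheSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    synchronized void markFailed(String address, long nowMillis) {
        failedUntil.put(address, nowMillis + expiryMillis);
    }

    synchronized boolean isFailed(String address, long nowMillis) {
        Long until = failedUntil.get(address);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            failedUntil.remove(address); // entry expired, allow a real attempt again
            return false;
        }
        return true;
    }

    // Simulates connecting to a host that is always unreachable.
    void connect(String address, long nowMillis) throws IOException {
        if (isFailed(address, nowMillis)) {
            throw new FailedServerException(
                "This server is in the failed servers list: " + address);
        }
        markFailed(address, nowMillis);
        throw new NoRouteToHostException(address);
    }

    // A caller that handles only NoRouteToHostException would abort on the
    // second attempt; this predicate treats both as an unreachable server.
    static boolean isServerDown(IOException e) {
        return e instanceof NoRouteToHostException
            || e instanceof FailedServerException;
    }
}
```

With an expiry of, say, 2000 ms, a first attempt at time 0 throws NoRouteToHostException, a retry at time 100 throws FailedServerException, and a retry after the entry expires reaches the network again.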
[jira] [Commented] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863311#comment-13863311 ] Hadoop QA commented on HBASE-10130: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621633/10130-v2.txt against trunk revision . ATTACHMENT ID: 12621633 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 4 release audit warnings (more than the trunk's current 0 warnings). {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8348//console This message is automatically generated. 
TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log similar to the following (corresponding to waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} Meaning, the version (2) retrieved corresponded to resubmitted task. version1 retrieved same value, leading to assertion failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-9117) Remove HTablePool and all HConnection pooling related APIs
[ https://issues.apache.org/jira/browse/HBASE-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-9117: Attachment: HBASE-9117.04.patch Here's an updated patch that compiles on Java6. Remove HTablePool and all HConnection pooling related APIs -- Key: HBASE-9117 URL: https://issues.apache.org/jira/browse/HBASE-9117 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-9117.00.patch, HBASE-9117.01.patch, HBASE-9117.02.patch, HBASE-9117.03.patch, HBASE-9117.04.patch The recommended way is now: # Create an HConnection: HConnectionManager.createConnection(...) # Create a light HTable: HConnection.getTable(...) # table.close() # connection.close() All other API and pooling will be removed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863340#comment-13863340 ] Lars Hofhansl commented on HBASE-10078: --- Took me a bit to grok the change. Looks good. +1 Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid}
public void readFields(final DataInput in) throws IOException {
  byte opByte = in.readByte();
  operator = Operator.values()[opByte];
  int size = in.readInt();
  if (size > 0) {
    filters = new ArrayList<Filter>(size);
    for (int i = 0; i < size; i++) {
      Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf);
      filters.add(filter);
    }
  }
}
{code} HbaseObjectWritable#readObject uses a
[jira] [Commented] (HBASE-9117) Remove HTablePool and all HConnection pooling related APIs
[ https://issues.apache.org/jira/browse/HBASE-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863354#comment-13863354 ] Andrew Purtell commented on HBASE-9117: --- Why the hell are some precommit builds apparently using Java 7 while others are using Java 6? Remove HTablePool and all HConnection pooling related APIs -- Key: HBASE-9117 URL: https://issues.apache.org/jira/browse/HBASE-9117 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-9117.00.patch, HBASE-9117.01.patch, HBASE-9117.02.patch, HBASE-9117.03.patch, HBASE-9117.04.patch The recommended way is now: # Create an HConnection: HConnectionManager.createConnection(...) # Create a light HTable: HConnection.getTable(...) # table.close() # connection.close() All other API and pooling will be removed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10272: -- Fix Version/s: 0.96.2 Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry from the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863356#comment-13863356 ] Andrew Purtell commented on HBASE-10284: +1 belated. Anyway, not a code change Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-9459) Backport 8534 Fix coverage for org.apache.hadoop.hbase.mapreduce to 0.94
[ https://issues.apache.org/jira/browse/HBASE-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-9459: - Fix Version/s: (was: 0.94.16) 0.94.17 Backport 8534 Fix coverage for org.apache.hadoop.hbase.mapreduce to 0.94 -- Key: HBASE-9459 URL: https://issues.apache.org/jira/browse/HBASE-9459 Project: HBase Issue Type: Test Components: mapreduce, test Reporter: Nick Dimiduk Assignee: Ivan A. Veselovsky Fix For: 0.94.17 Attachments: HBASE-9459-0.94--n3.patch Do you want this test update backported? See HBASE-8534 for a 0.94 patch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10272: -- Resolution: Fixed Status: Resolved (was: Patch Available) Assuming you want this [~stack], I took the liberty and committed to 0.96 as well. :) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node with the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry from the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10271) [regression] Cannot use the wildcard address since HBASE-9593
[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863373#comment-13863373 ] Andrew Purtell commented on HBASE-10271: Any update? Or I can revert from 0.98 also... [regression] Cannot use the wildcard address since HBASE-9593 - Key: HBASE-10271 URL: https://issues.apache.org/jira/browse/HBASE-10271 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.94.13, 0.96.1 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10271.patch HBASE-9593 moved the creation of the ephemeral znode earlier in the region server startup process such that we don't have access to the ServerName from the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName() which at that point will return this.isa.getHostName(). If you set hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that address. What happens next is that the RS will report for duty correctly but the master will do this: {noformat} 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014 {noformat} The cluster is then unusable. I think a better solution is to track the heartbeats for the region servers and expire those that haven't checked-in for some time. The 0.89-fb branch has this concept, and they also use it to detect rack failures: https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224. In this jira's scope I would just add the heartbeat tracking and add a unit test for the wildcard address. What do you think [~rajesh23]? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
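The heartbeat tracking proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the 0.89-fb ServerManager code: HeartbeatTrackerSketch, its timeout parameter, and the explicit clock argument are all hypothetical, chosen so the behaviour can be checked without real timers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tracker: remembers the last check-in time per server
// and expires servers that have been silent for longer than a timeout.
class HeartbeatTrackerSketch {
    private final Map<String, Long> lastSeen = new HashMap<String, Long>();
    private final long timeoutMillis;

    HeartbeatTrackerSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a region server reports in.
    synchronized void heartbeat(String serverName, long nowMillis) {
        lastSeen.put(serverName, nowMillis);
    }

    // Removes and returns every server that has not checked in within the timeout.
    synchronized List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<String>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    synchronized boolean isTracked(String serverName) {
        return lastSeen.containsKey(serverName);
    }
}
```

A server registered under a bogus name such as the wildcard address would never send a heartbeat under that name, so under this scheme it would simply age out after the timeout.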
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863385#comment-13863385 ] Hudson commented on HBASE-10284: SUCCESS: Integrated in HBase-0.94 #1252 (See [https://builds.apache.org/job/HBase-0.94/1252/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555961) * /hbase/branches/0.94/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863386#comment-13863386 ] Hudson commented on HBASE-10273: SUCCESS: Integrated in HBase-0.94 #1252 (See [https://builds.apache.org/job/HBase-0.94/1252/]) HBASE-10273 AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem (Feng Honghua) (larsh: rev 1555966) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863387#comment-13863387 ] Hudson commented on HBASE-10272: SUCCESS: Integrated in HBase-0.94 #1252 (See [https://builds.apache.org/job/HBase-0.94/1252/]) HBASE-10272 Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline (Aditya Kishore) (larsh: rev 1555960) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node hosting the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry from the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
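The failure mode in HBASE-10272 can be modelled with a small sketch: a cache that remembers recently failed servers for a while, and a caller that treats a cached failure as "server not reachable" instead of letting the exception escape as fatal. This is only an illustration under stated assumptions. `FailedServerCache`, `verifyLocation`, the expiry parameter, and the nested exception class are hypothetical stand-ins, and the committed patch to CatalogTracker may handle the exception differently.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a "failed servers" cache with expiry. It is not the
// HBaseClient implementation; all names here are invented for illustration.
final class FailedServerCache {
    // Stand-in for the exception seen in the stack trace above.
    static final class FailedServerException extends IOException {
        private static final long serialVersionUID = 1L;
        FailedServerException(String message) {
            super(message);
        }
    }

    private final Map<String, Long> expiryByAddress = new HashMap<>();
    private final long ttlMillis;

    FailedServerCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Records a connection failure; the entry expires ttlMillis later.
    synchronized void markFailed(String address, long nowMillis) {
        expiryByAddress.put(address, nowMillis + ttlMillis);
    }

    // True while a failure for this address is cached and not yet expired.
    synchronized boolean isFailed(String address, long nowMillis) {
        Long expiry = expiryByAddress.get(address);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            expiryByAddress.remove(address);
            return false;
        }
        return true;
    }

    // Fails fast for servers that recently failed, as described in the issue.
    void checkNotFailed(String address, long nowMillis) throws FailedServerException {
        if (isFailed(address, nowMillis)) {
            throw new FailedServerException(
                "This server is in the failed servers list: " + address);
        }
    }

    // One plausible caller-side handling (an assumption): a cached failure
    // means "location not verified", so the caller can reassign the region
    // rather than abort.
    static boolean verifyLocation(FailedServerCache cache, String address, long nowMillis) {
        try {
            cache.checkNotFailed(address, nowMillis);
            return true;
        } catch (FailedServerException e) {
            return false;
        }
    }
}
```

The point of the sketch is the catch block: once the fast-fail exception is handled on the same footing as the original connect failure, a second attempt against a dead server no longer terminates the caller.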
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-10078: Resolution: Fixed Fix Version/s: 0.96.2 0.94.16 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks a lot. Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.98.0, 0.94.16, 0.96.2 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found, so it tries with DynamicClassLoader; but when it tries to load BFilter, it uses URLClassLoader and fails without checking for dynamic jars. 
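The behaviour the reporter expects (try the normal class loader first, then fall back to a loader that knows about dynamically added jars) can be sketched like this. It is a minimal illustration, assuming a hypothetical helper `FallbackClassLoading.loadClass`; it does not reproduce DynamicClassLoader or the HbaseObjectWritable code path, and only uses the standard `Class.forName(String, boolean, ClassLoader)` call.

```java
// Hypothetical helper showing a two-step class lookup. The 'fallback' loader
// stands in for a loader that can see dynamically added jars.
final class FallbackClassLoading {
    private FallbackClassLoading() {
    }

    // Tries 'primary' first; on a miss, retries with 'fallback'. If both miss,
    // the ClassNotFoundException from the fallback lookup propagates.
    static Class<?> loadClass(String name, ClassLoader primary, ClassLoader fallback)
            throws ClassNotFoundException {
        try {
            return Class.forName(name, false, primary);
        } catch (ClassNotFoundException primaryMiss) {
            return Class.forName(name, false, fallback);
        }
    }
}
```

The bug report amounts to one deserialization path skipping the second step: every place that resolves a filter class by name would need to go through the same fallback, including the loop that reads the members of a FilterList.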
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter =
[jira] [Commented] (HBASE-10271) [regression] Cannot use the wildcard address since HBASE-9593
[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863395#comment-13863395 ] Jean-Daniel Cryans commented on HBASE-10271: bq. Any update? I'm waiting to see if there's interest in further building up the patch. Sergey seems OK with the process, although he seems to think we should further prove that it's necessary to have two methods to expire region servers. Lars H. had questions, which I answered, but I didn't see interest. IMO we can safely revert HBASE-9593 from all the branches and spend more time on a proper fix. [regression] Cannot use the wildcard address since HBASE-9593 - Key: HBASE-10271 URL: https://issues.apache.org/jira/browse/HBASE-10271 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.94.13, 0.96.1 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10271.patch HBASE-9593 moved the creation of the ephemeral znode earlier in the region server startup process, to a point where we don't yet have access to the ServerName from the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName(), which at that point will return this.isa.getHostName(). If you set hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that address. What happens next is that the RS will report for duty correctly but the master will do this: {noformat} 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014 {noformat} The cluster is then unusable. I think a better solution is to track the heartbeats for the region servers and expire those that haven't checked in for some time. 
The 0.89-fb branch has this concept, and they also use it to detect rack failures: https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224. In this jira's scope I would just add the heartbeat tracking and add a unit test for the wildcard address. What do you think [~rajesh23]? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
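The heartbeat-based expiry proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the 0.89-fb ServerManager code: `HeartbeatTracker`, its method names, and the timeout handling are all hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tracker: remembers when each region server last reported in
// and expires the ones that have been silent for longer than the timeout.
final class HeartbeatTracker {
    private final Map<String, Long> lastSeenMillis = new HashMap<>();
    private final long timeoutMillis;

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a server reports in.
    synchronized void heartbeat(String serverName, long nowMillis) {
        lastSeenMillis.put(serverName, nowMillis);
    }

    // Removes and returns (sorted by name) every server whose last heartbeat
    // is older than the timeout.
    synchronized List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeenMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        Collections.sort(expired);
        return expired;
    }
}
```

Because expiry here depends only on reports the master actually received, it does not rely on the name under which a server registered its znode, which is the part that breaks with the wildcard address.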
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863403#comment-13863403 ] Hudson commented on HBASE-10284: SUCCESS: Integrated in HBase-0.94-JDK7 #19 (See [https://builds.apache.org/job/HBase-0.94-JDK7/19/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555961) * /hbase/branches/0.94/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
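The kind of parsing problem described in HBASE-10284 can be illustrated with a short sketch: if a script picks out the repository URL by matching any line that contains "URL", the additional "Relative URL:" line printed by svn 1.8 matches as well. The example below is written in Java only for illustration (saveVersion.sh itself is a shell script), the class `SvnInfoUrl` is hypothetical, and the exact way the script failed or was fixed is not stated in the message, so this shows one plausible remedy: anchoring the match to the start of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the text printed by 'svn info'.
final class SvnInfoUrl {
    // Anchored at the start of a line, so "Relative URL: ^/..." is skipped.
    private static final Pattern URL_LINE =
        Pattern.compile("^URL: (.+)$", Pattern.MULTILINE);

    private SvnInfoUrl() {
    }

    // Returns the value of the "URL:" line, or null if there is none.
    static String extractUrl(String svnInfoOutput) {
        Matcher m = URL_LINE.matcher(svnInfoOutput);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

A usage example with made-up output: for the text `"URL: https://example.org/repo/trunk\nRelative URL: ^/trunk\n"`, `SvnInfoUrl.extractUrl` returns `https://example.org/repo/trunk` and ignores the second line.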
[jira] [Commented] (HBASE-10273) AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem
[ https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863404#comment-13863404 ] Hudson commented on HBASE-10273: SUCCESS: Integrated in HBase-0.94-JDK7 #19 (See [https://builds.apache.org/job/HBase-0.94-JDK7/19/]) HBASE-10273 AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem (Feng Honghua) (larsh: rev 1555966) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java AssignmentManager.regions and AssignmentManager.servers are not always updated in tandem Key: HBASE-10273 URL: https://issues.apache.org/jira/browse/HBASE-10273 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.16 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.94.16 Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch By definition, AssignmentManager.servers and AssignmentManager.regions are tied and should be updated in tandem with each other under a lock on AssignmentManager.regions, but there are two places where this protocol is broken. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline
[ https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863405#comment-13863405 ] Hudson commented on HBASE-10272: SUCCESS: Integrated in HBase-0.94-JDK7 #19 (See [https://builds.apache.org/job/HBase-0.94-JDK7/19/]) HBASE-10272 Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline (Aditya Kishore) (larsh: rev 1555960) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline - Key: HBASE-10272 URL: https://issues.apache.org/jira/browse/HBASE-10272 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.96.1, 0.94.15 Reporter: Aditya Kishore Assignee: Aditya Kishore Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch Since HBASE-6364, the HBase client caches a connection failure to a server, and any subsequent attempt to connect to that server throws a {{FailedServerException}}. Now if a node which hosted the active Master AND the ROOT/META table goes offline, the newly anointed Master's initial attempt to connect to the dead region server fails with {{NoRouteToHostException}}, which it handles, but the second attempt crashes with {{FailedServerException}}. Here is the log from one such occurrence: {noformat} 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: [] 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: xxx02/192.168.1.102:60020 at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy9.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376) at java.lang.Thread.run(Thread.java:662) 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 6 {noformat} Each of the backup 
masters will crash with the same error, and restarting them will have the same effect. Once this happens, the cluster will remain inoperable until the node hosting the region server is brought back online (or the ZooKeeper node containing the root region server and/or the META entry from the ROOT table is deleted). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863410#comment-13863410 ] Hudson commented on HBASE-10284: FAILURE: Integrated in HBase-TRUNK #4793 (See [https://builds.apache.org/job/HBase-TRUNK/4793/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555962) * /hbase/trunk/hbase-common/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863413#comment-13863413 ] Hudson commented on HBASE-10284: SUCCESS: Integrated in HBase-0.98 #59 (See [https://builds.apache.org/job/HBase-0.98/59/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555963) * /hbase/branches/0.98/hbase-common/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863421#comment-13863421 ] Hudson commented on HBASE-10284: FAILURE: Integrated in hbase-0.96 #250 (See [https://builds.apache.org/job/hbase-0.96/250/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555964) * /hbase/branches/0.96/hbase-common/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10130: --- Fix Version/s: 0.99.0 Hadoop Flags: Reviewed Integrated to trunk. Thanks for the review, Jeff. TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.99.0 Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log line similar to the following (which would correspond to the waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} This means the version (2) retrieved already corresponded to the resubmitted task. version1 then retrieved the same value, leading to the assertion failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
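The guard in the snippet above only waits when the counter is still 0, which is why no "Waiting up to" line appears when the resubmit has already happened before the check. A counter wait of that shape can be sketched as follows. This is an illustrative stand-in, not the test utility used by TestSplitLogManager: the class `CounterWait`, the polling interval, and the boolean return value are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical polling helper with the same argument order as the call in the
// quoted test: (counter, oldValue, newValue, timeoutMillis).
final class CounterWait {
    private CounterWait() {
    }

    // Polls until the counter leaves oldValue or the timeout elapses, then
    // reports whether it reached newValue.
    static boolean waitForCounter(AtomicLong counter, long oldValue, long newValue,
                                  long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (counter.get() == oldValue && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return counter.get() == newValue;
    }
}
```

If the counter has already moved to the new value when the helper is called, it returns immediately without sleeping, which matches the situation described in the report: the resubmit had already taken place by the time the version was read.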
[jira] [Commented] (HBASE-9117) Remove HTablePool and all HConnection pooling related APIs
[ https://issues.apache.org/jira/browse/HBASE-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863456#comment-13863456 ] Hadoop QA commented on HBASE-9117: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621655/HBASE-9117.04.patch against trunk revision . ATTACHMENT ID: 12621655 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 120 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 4 release audit warnings (more than the trunk's current 0 warnings). {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +// HBaseAdmin only waits for regions to appear in hbase:meta we should wait until they are assigned +LOG.warn(close() called on HConnection instance returned from HBaseTestingUtility.getConnection()); + public static THBaseService.Iface newInstance(Configuration conf, ThriftMetrics metrics) throws IOException { {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL org.apache.hadoop.hbase.security.access.TestTablePermissions org.apache.hadoop.hbase.TestZooKeeper org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.security.access.TestAccessControlFilter org.apache.hadoop.hbase.client.TestHCM org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.master.TestAssignmentManager {color:red}-1 core zombie tests{color}. There are 7 zombie test(s): at org.apache.hadoop.hbase.master.TestMasterNoCluster.testNotPullingDeadRegionServerFromZK(TestMasterNoCluster.java:408) at org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles.testNonexistentColumnFamilyLoad(TestLoadIncrementalHFiles.java:200) at org.apache.hadoop.hbase.client.TestMetaScanner.testConcurrentMetaScannerAndCatalogJanitor(TestMetaScanner.java:239) at org.apache.hadoop.hbase.util.TestHBaseFsck.testHBaseFsck(TestHBaseFsck.java:151) at org.apache.hadoop.hbase.io.encoding.TestChangingEncoding.testCrazyRandomChanges(TestChangingEncoding.java:252) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8349//console This
[jira] [Updated] (HBASE-5923) Cleanup checkAndXXX logic
[ https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5923: -- Resolution: Fixed Status: Resolved (was: Patch Available) bq. Unless somebody desperately wants this in 0.94/0.96/0.98, let's just change this in trunk. This has already been committed to trunk, so closing as Fixed. Committers, please *resolve* JIRAs after commit. Cleanup checkAndXXX logic - Key: HBASE-5923 URL: https://issues.apache.org/jira/browse/HBASE-5923 Project: HBase Issue Type: Improvement Components: Client, regionserver Reporter: Lars Hofhansl Assignee: Feng Honghua Labels: noob Fix For: 0.99.0 Attachments: 5923-0.94.txt, 5923-trunk.txt, HBASE-10262-trunk_v0.patch 1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via HTable[Interface]. 2. there is unnecessary duplicate code in the check{Put|Delete} code in HRegionServer. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863475#comment-13863475 ] Andrew Purtell edited comment on HBASE-6104 at 1/6/14 9:55 PM: --- bq. Before I commit this to trunk (and maybe to 0.98... I am tempted ... other security features have been bundled into it) I am going to bring this into 0.98, but one more rev of the patch first: Let's make enforcement of EXEC privilege optional and disabled by default for 0.98, then on unconditionally in a later release. Thanks to [~avik_...@yahoo.com] for the suggestion. was (Author: apurtell): bq. Before I commit this to trunk (and maybe to 0.98... I am tempted ... other security features have been bundled into it) I am going to bring this into 0.98, but one more rev of the patch first: Let's make enforcement of EXEC privilege optional and disabled by default for 0.98, then on unconditionally in a later release. Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: New Feature Components: Coprocessors, security Reporter: Gary Helmling Assignee: Andrew Purtell Fix For: 0.98.0, 0.99.0 Attachments: 6104-addendum-1.patch, 6104-revert.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? 
Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6104: -- Fix Version/s: 0.98.0 bq. Before I commit this to trunk (and maybe to 0.98... I am tempted ... other security features have been bundled into it) I am going to bring this into 0.98, but one more rev of the patch first: Let's make enforcement of EXEC privilege optional and disabled by default for 0.98, then on unconditionally in a later release. Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: New Feature Components: Coprocessors, security Reporter: Gary Helmling Assignee: Andrew Purtell Fix For: 0.98.0, 0.99.0 Attachments: 6104-addendum-1.patch, 6104-revert.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10130: --- Resolution: Fixed Status: Resolved (was: Patch Available) TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.99.0 Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log line similar to the following (which would correspond to the waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} This means the version (2) retrieved already corresponded to the resubmitted task. version1 then retrieved the same value, leading to the assertion failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863492#comment-13863492 ] Hudson commented on HBASE-10078: SUCCESS: Integrated in HBase-0.94-security #381 (See [https://builds.apache.org/job/HBase-0.94-security/381/]) HBASE-10078 Dynamic Filter - Not using DynamicClassLoader when using FilterList (jxiang: rev 1556027) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/Classes.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestGet.java Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.98.0, 0.94.16, 0.96.2 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 
{noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. I think the issue is
[jira] [Updated] (HBASE-10130) TestSplitLogManager#testTaskResigned fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-10130: --- Fix Version/s: 0.98.0 Committed test fix to 0.98 branch. TestSplitLogManager#testTaskResigned fails sometimes Key: HBASE-10130 URL: https://issues.apache.org/jira/browse/HBASE-10130 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.0, 0.99.0 Attachments: 10130-output.txt, 10130-v1.txt, 10130-v2.txt The test failed in https://builds.apache.org/job/PreCommit-HBASE-Build/8131//testReport For testTaskResigned() : {code} int version = ZKUtil.checkExists(zkw, tasknode); // Could be small race here. if (tot_mgr_resubmit.get() == 0) waitForCounter(tot_mgr_resubmit, 0, 1, to/2); {code} There was no log similar to the following (corresponding to waitForCounter() call above): {code} 2013-12-10 21:23:54,905 INFO [main] hbase.Waiter(174): Waiting up to [3,200] milli-secs(wait.for.ratio=[1]) {code} Meaning, the version (2) retrieved corresponded to resubmitted task. version1 retrieved same value, leading to assertion failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6104: -- Attachment: 6104.patch Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: New Feature Components: Coprocessors, security Reporter: Gary Helmling Assignee: Andrew Purtell Fix For: 0.98.0, 0.99.0 Attachments: 6104-addendum-1.patch, 6104-revert.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6104: -- Status: Patch Available (was: Reopened) Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: New Feature Components: Coprocessors, security Reporter: Gary Helmling Assignee: Andrew Purtell Fix For: 0.98.0, 0.99.0 Attachments: 6104-addendum-1.patch, 6104-revert.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-6104) Require EXEC permission to call coprocessor endpoints
[ https://issues.apache.org/jira/browse/HBASE-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6104: -- Release Note: If access control is active (the AccessController coprocessor is installed either as a system coprocessor or on a table as a table coprocessor) and the hbase.security.exec.permission.checks configuration setting is true, then you must now grant all relevant users EXEC privilege if they require the ability to execute coprocessor endpoint calls. EXEC privilege, like any other permission, can be granted globally to a user, or to a user on a per table or per namespace basis. For more information on coprocessor endpoints, see the coprocessor section of the HBase online manual. For more information on granting or revoking permissions using the AccessController, see the security section of the HBase online manual. (was: If access control is active (the AccessController coprocessor is installed either as a system coprocessor or on a table as a table coprocessor) then you must now grant all relevant users EXEC privilege if they require the ability to execute coprocessor endpoint calls. EXEC privilege, like any other permission, can be granted globally to a user, or to a user on a per table or per namespace basis. For more information on coprocessor endpoints, see the coprocessor section of the HBase online manual. For more information on granting or revoking permissions using the AccessController, see the security section of the HBase online manual.) 
Require EXEC permission to call coprocessor endpoints - Key: HBASE-6104 URL: https://issues.apache.org/jira/browse/HBASE-6104 Project: HBase Issue Type: New Feature Components: Coprocessors, security Reporter: Gary Helmling Assignee: Andrew Purtell Fix For: 0.98.0, 0.99.0 Attachments: 6104-addendum-1.patch, 6104-revert.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch, 6104.patch The EXEC action currently exists as only a placeholder in access control. It should really be used to enforce access to coprocessor endpoint RPC calls, which are currently unrestricted. How the ACLs to support this would be modeled deserves some discussion: * Should access be scoped to a specific table and CoprocessorProtocol extension? * Should it be possible to grant access to a CoprocessorProtocol implementation globally (regardless of table)? * Are per-method restrictions necessary? * Should we expose hooks available to endpoint implementors so that they could additionally apply their own permission checks? Some CP endpoints may want to require READ permissions, others may want to enforce WRITE, or READ + WRITE. To apply these kinds of checks we would also have to extend the RegionObserver interface to provide hooks wrapping HRegion.exec(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
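The decision described in the release note above can be sketched in a few lines of plain Java. This is only an illustration, not the AccessController implementation: the class ExecCheckSketch, the method mayCallEndpoint, and the representation of granted actions as a set of strings are all hypothetical. Only the EXEC action and the hbase.security.exec.permission.checks setting come from the release note.

```java
import java.util.Set;

/** Hypothetical sketch of the rule in the release note; not HBase code. */
class ExecCheckSketch {

    /**
     * @param accessControllerInstalled whether access control is active (assumed input)
     * @param execChecksEnabled value of hbase.security.exec.permission.checks
     * @param grantedActions actions granted to the calling user (assumed representation)
     * @return true if the user may invoke a coprocessor endpoint
     */
    static boolean mayCallEndpoint(boolean accessControllerInstalled,
                                   boolean execChecksEnabled,
                                   Set<String> grantedActions) {
        // Without access control, or with the check switched off,
        // endpoint calls stay unrestricted.
        if (!accessControllerInstalled || !execChecksEnabled) {
            return true;
        }
        // Otherwise the caller needs the EXEC privilege.
        return grantedActions.contains("EXEC");
    }
}
```

In this sketch a user holding only READ is rejected once both conditions are true, which matches the note's statement that EXEC must be granted explicitly.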
[jira] [Created] (HBASE-10285) All for configurable policies in ChaosMonkey
Cody Marcel created HBASE-10285: --- Summary: All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10244) store mvcc in store files per KV for longer time, discard during compactions based on scanner timeout
[ https://issues.apache.org/jira/browse/HBASE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863526#comment-13863526 ] Enis Soztutar commented on HBASE-10244: --- Yes, we are storing mvcc numbers in hfiles, but since the lifecycle of the mvcc numbers and of a region scanner is bound by the RS lifetime, we just use the lowest mvcc read number of all the open scanners to manage the low watermark. I agree that we can implement timeout-based low watermark tracking. The client-side open scanners will receive the mvcc number. They would have to connect to the next region within an absolute time bound (like 30 min); otherwise the scan fails completely. The RS can then manage a time window for mvcc numbers and expire mvcc read numbers older than that window, since any scanner that has not connected in the last 30 min will be rejected. store mvcc in store files per KV for longer time, discard during compactions based on scanner timeout - Key: HBASE-10244 URL: https://issues.apache.org/jira/browse/HBASE-10244 Project: HBase Issue Type: Sub-task Components: Compaction, regionserver Reporter: Sergey Shelukhin Priority: Minor -Initially, we can store via KV tag. For perf reasons, it might make sense to make it a first-class member of KV, but that would require HFileV4- -- This message was sent by Atlassian JIRA (v6.1.5#6160)
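A time window over mvcc read numbers, as suggested in the comment above, might look roughly like the following sketch. It is an illustration under assumptions, not code from HBase: the class MvccTimeWindowSketch and its methods are hypothetical, read points are assumed to grow monotonically over time, and the time bound is measured from the moment a read point was current (a simplification of "connect to the next region within the bound").

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a time-window low watermark; not HBase code. */
class MvccTimeWindowSketch {
    // Sample time (ms) -> read point that was current at that time.
    private final TreeMap<Long, Long> readPointByTimeMs = new TreeMap<>();
    private final long timeoutMs;

    MvccTimeWindowSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Remember which read point was current at the given time. */
    void record(long nowMs, long readPoint) {
        readPointByTimeMs.put(nowMs, readPoint);
    }

    /**
     * Returns the read point that was current one timeout ago.
     * Anything below it belongs to scanners that would be rejected anyway.
     */
    long lowWatermark(long nowMs) {
        Map.Entry<Long, Long> e = readPointByTimeMs.floorEntry(nowMs - timeoutMs);
        if (e == null) {
            return 0L; // no sample is old enough yet, so keep everything
        }
        // Samples older than the chosen one are no longer needed.
        readPointByTimeMs.headMap(e.getKey(), false).clear();
        return e.getValue();
    }

    /** A scanner is accepted only if its read point is still inside the window. */
    boolean accepts(long scannerReadPoint, long nowMs) {
        return scannerReadPoint >= lowWatermark(nowMs);
    }
}
```

The point of this design is that the server does not need to track individual scanners: a compaction could discard mvcc values below lowWatermark(now), because any scanner still carrying an older read point has exceeded the time bound.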
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863531#comment-13863531 ] Enis Soztutar commented on HBASE-10241: --- We need to fix this for a couple of different reasons:
- Fixing scanner consistency with multi-row transactions (see HBASE-9797)
- Adding cell-based scanners, and streaming scans
- Adding single-row scanners
- Consistent scanners with region replicas in case replicas are mostly up to date (HBASE-10070)
What is the plan here? I think we should do subtasks 1 and 3 regardless of HBASE-8763. But it seems that if we do HBASE-8763 first, it will be much cleaner and we won't need subtask 2 at all. implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10244) store mvcc in store files per KV for longer time, discard during compactions based on scanner timeout
[ https://issues.apache.org/jira/browse/HBASE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863534#comment-13863534 ] Sergey Shelukhin commented on HBASE-10244: -- that is what the design doc in the parent JIRA says :P store mvcc in store files per KV for longer time, discard during compactions based on scanner timeout - Key: HBASE-10244 URL: https://issues.apache.org/jira/browse/HBASE-10244 Project: HBase Issue Type: Sub-task Components: Compaction, regionserver Reporter: Sergey Shelukhin Priority: Minor -Initially, we can store via KV tag. For perf reasons, it might make sense to make it a first-class member of KV, but that would require HFileV4- -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863529#comment-13863529 ] Sergey Shelukhin commented on HBASE-10285: -- This already works on 96/98/trunk, but the changes are pretty involved. Are you sure you need this in 94? All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Marcel updated HBASE-10285: Status: Patch Available (was: In Progress) All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863546#comment-13863546 ] Sergey Shelukhin commented on HBASE-10241: -- Subtask 2 is trivial; the reason it is not done is that it is being done elsewhere (see how it's resolved as dup), so since it's not blocking us here and now it doesn't make sense to do double work. I am not working on this jira right now (will get back to it hopefully and there's a patch out in client subtask), but the plan was that I will do 1 and 3, and then take 2 if the other JIRA that does 2 is not done by then. HBASE-8763 does not need to block this, it's probably bigger than this entire JIRA. If it's done before this due to delays, good, if not, also good :) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-10285 started by Cody Marcel. All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863554#comment-13863554 ] Cody Marcel commented on HBASE-10285: - It's a pretty small change to make this work on 0.94. Patch is coming. All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863552#comment-13863552 ] Sergey Shelukhin commented on HBASE-10241: -- Thanks for HBASE-9797 reference, yes, it is good for that! implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10271) [regression] Cannot use the wildcard address since HBASE-9593
[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863560#comment-13863560 ] Andrew Purtell commented on HBASE-10271: Reverted HBASE-5953 on branch 0.98. [regression] Cannot use the wildcard address since HBASE-9593 - Key: HBASE-10271 URL: https://issues.apache.org/jira/browse/HBASE-10271 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.94.13, 0.96.1 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10271.patch HBASE-9593 moved the creation of the ephemeral znode earlier in the region server startup process such that we don't have access to the ServerName from the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName() which at that point will return this.isa.getHostName(). If you set hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that address. What happens next is that the RS will report for duty correctly but the master will do this: {noformat} 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014 {noformat} The cluster is then unusable. I think a better solution is to track the heartbeats for the region servers and expire those that haven't checked-in for some time. The 0.89-fb branch has this concept, and they also use it to detect rack failures: https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224. In this jira's scope I would just add the heartbeat tracking and add a unit test for the wildcard address. What do you think [~rajesh23]? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
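The heartbeat tracking proposed above (expire region servers that have not checked in for some time) could be sketched as follows. This is a minimal illustration under assumptions, not the ServerManager code from trunk or from the 0.89-fb branch: the class HeartbeatTrackerSketch, its method names, and the use of plain strings for server names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of master-side heartbeat tracking; not HBase code. */
class HeartbeatTrackerSketch {
    // Server name -> time (ms) of the last report received from it.
    private final Map<String, Long> lastHeartbeatMs = new ConcurrentHashMap<>();
    private final long timeoutMs;

    HeartbeatTrackerSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Called whenever a region server reports in. */
    void recordHeartbeat(String serverName, long nowMs) {
        lastHeartbeatMs.put(serverName, nowMs);
    }

    /** Removes and returns every server that has been silent longer than the timeout. */
    List<String> expireSilentServers(long nowMs) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeatMs.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                expired.add(e.getKey());
            }
        }
        for (String serverName : expired) {
            lastHeartbeatMs.remove(serverName);
        }
        return expired;
    }
}
```

With a tracker of this kind the master would not depend solely on the ephemeral znode to notice a dead server, so a server that registered but never created its znode would still be expired eventually.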
[jira] [Updated] (HBASE-9593) Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node
[ https://issues.apache.org/jira/browse/HBASE-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-9593: -- Fix Version/s: (was: 0.98.0) Reverted from 0.98 branch. Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node Key: HBASE-9593 URL: https://issues.apache.org/jira/browse/HBASE-9593 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.11 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.96.1 Attachments: 9593-0.94.txt, HBASE-9593.patch, HBASE-9593_v2.patch, HBASE-9593_v3.patch In some of our tests we found that a region server always shows as online in the master UI even though it is actually dead. If the region server goes down between the following two steps, it stays in the master's online servers list: 1) register to master 2) create ephemeral znode. Since there is no notification from ZooKeeper, the master does not remove the expired server from the online servers list. Assignments will fail if the RS is selected as the destination server. In some cases ROOT or META won't be assigned either if the RS keeps being randomly selected, and every attempt has to wait for the timeout. Here are the logs: 1) HOST-10-18-40-153 is registered to master {code} 2013-09-19 19:47:41,123 DEBUG org.apache.hadoop.hbase.master.ServerManager: STARTUP: Server HOST-10-18-40-153,61020,1379600260255 came back up, removed it from the dead servers list 2013-09-19 19:47:41,123 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=HOST-10-18-40-153,61020,1379600260255 {code} {code} 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at HOST-10-18-40-153/10.18.40.153:61000 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at HOST-10-18-40-153,61000,1379600055284 that we are up with port=61020, startcode=1379600260255 {code} 2) Terminated before creating ephemeral node. 
{code} Thu Sep 19 19:47:41 IST 2013 Terminating regionserver {code} 3) The RS can be selected for assignment and they will fail. {code} 2013-09-19 19:47:54,049 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to HOST-10-18-40-153,61020,1379600260255, trying to assign elsewhere instead; retry=0 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:390) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:436) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1127) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy15.openRegion(Unknown Source) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:533) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1734) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1431) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1406) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1401) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:2374) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRoot(MetaServerShutdownHandler.java:136) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRootWithRetries(MetaServerShutdownHandler.java:160) at 
org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:82) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2013-09-19 19:47:54,050 DEBUG
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863567#comment-13863567 ] Sergey Shelukhin commented on HBASE-10285: -- As long as we keep the same cmdline syntax as trunk, works for me... would need +1 from [~lhofhansl] also. Thanks! All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10285: -- Status: Open (was: Patch Available) All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Marcel updated HBASE-10285: Attachment: HBASE-10285 All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863572#comment-13863572 ] Cody Marcel commented on HBASE-10285: - the syntax would be -policy EVERY_MINUTE_RANDOM_ACTION_POLICY All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
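The behaviour described above (a -policy option that falls back to EVERY_MINUTE_RANDOM_ACTION_POLICY when it is not supplied) can be illustrated with a small sketch. This is not the attached patch or the trunk implementation: the class PolicyOptionSketch and the method selectPolicy are hypothetical, and the real tool may parse its options differently.

```java
/** Hypothetical sketch of the command-line handling; not the ChaosMonkey code. */
class PolicyOptionSketch {
    // Policy name taken from the issue description; used when no option is given.
    static final String DEFAULT_POLICY = "EVERY_MINUTE_RANDOM_ACTION_POLICY";

    /** Returns the value following "-policy", or the default when it is absent. */
    static String selectPolicy(String[] args) {
        // Stop one short of the end so that args[i + 1] always exists.
        for (int i = 0; i < args.length - 1; i++) {
            if ("-policy".equals(args[i])) {
                return args[i + 1];
            }
        }
        return DEFAULT_POLICY;
    }
}
```

A trailing -policy with no value is treated here as if the option were absent. A real implementation would probably also check the supplied name against the set of known policies.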
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863573#comment-13863573 ] Cody Marcel commented on HBASE-10285: - Including [~enis] [~jesse_yates] All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10078: -- Fix Version/s: (was: 0.96.2) (was: 0.98.0) NP :) This change only went into 0.94, right? (removed the other fix tags) Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.94.16 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf);
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863580#comment-13863580 ] Hudson commented on HBASE-10078: FAILURE: Integrated in HBase-0.94 #1253 (See [https://builds.apache.org/job/HBase-0.94/1253/]) HBASE-10078 Dynamic Filter - Not using DynamicClassLoader when using FilterList (jxiang: rev 1556027) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/Classes.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestGet.java Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.94.16 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 
{noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. I think the issue is related to
[jira] [Commented] (HBASE-10271) [regression] Cannot use the wildcard address since HBASE-9593
[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863583#comment-13863583 ] Lars Hofhansl commented on HBASE-10271: --- I'll do the same from 0.94. I'd prefer not to introduce another heartbeat mechanism over what we have from ZK. (we used to have heartbeats and then removed them in favor of ZK before my time, right?) [regression] Cannot use the wildcard address since HBASE-9593 - Key: HBASE-10271 URL: https://issues.apache.org/jira/browse/HBASE-10271 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.94.13, 0.96.1 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: HBASE-10271.patch HBASE-9593 moved the creation of the ephemeral znode earlier in the region server startup process such that we don't have access to the ServerName from the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName() which at that point will return this.isa.getHostName(). If you set hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that address. What happens next is that the RS will report for duty correctly but the master will do this: {noformat} 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:6] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014 {noformat} The cluster is then unusable. I think a better solution is to track the heartbeats for the region servers and expire those that haven't checked-in for some time. The 0.89-fb branch has this concept, and they also use it to detect rack failures: https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224. 
In this jira's scope I would just add the heartbeat tracking and add a unit test for the wildcard address. What do you think [~rajesh23]? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
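The idea raised in the description above, to "track the heartbeats for the region servers and expire those that haven't checked-in for some time", could look roughly like the sketch below. It is an illustration only: `HeartbeatTracker` and its methods are hypothetical names, it is not the 0.89-fb `ServerManager` code linked above, and the comment in this thread prefers relying on ZK rather than adding such a mechanism.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of heartbeat-based expiry; not HBase code. */
final class HeartbeatTracker {
    /** A server is expired once it has been silent for longer than this. */
    private final long timeoutMillis;

    /** Server name mapped to the time of its last heartbeat, in milliseconds. */
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that serverName checked in at nowMillis. */
    void recordHeartbeat(String serverName, long nowMillis) {
        lastSeen.put(serverName, nowMillis);
    }

    /** Removes and returns every server whose last heartbeat is too old. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    /** True while the server is still tracked as online. */
    boolean isOnline(String serverName) {
        return lastSeen.containsKey(serverName);
    }
}
```

The clock is passed in as a parameter so the expiry logic can be unit tested without sleeping. A real master would also have to decide how such expiry interacts with the ZooKeeper-based liveness tracking it already has.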
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863590#comment-13863590 ] Enis Soztutar commented on HBASE-10241: --- bq. but the plan was that I will do 1 and 3, and then take 2 if the other JIRA that does 2 is not done by then. Sounds good. I thought HBASE-8721 is won't fix. bq. HBASE-8763 does not need to block this, it's probably bigger than this entire JIRA Indeed. But it will be a shame if we add mvcc's to WAL only to remove them again after HBASE-8763. BTW, I think we also have to handle mvcc / seqId as a part of the serialization in the KV byte array. Do we have any open issues for that? implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-9593) Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node
[ https://issues.apache.org/jira/browse/HBASE-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863589#comment-13863589 ] Lars Hofhansl commented on HBASE-9593: -- Since there are released versions with this, I think we need to have a separate jira for the revert. Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node Key: HBASE-9593 URL: https://issues.apache.org/jira/browse/HBASE-9593 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.11 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.96.1 Attachments: 9593-0.94.txt, HBASE-9593.patch, HBASE-9593_v2.patch, HBASE-9593_v3.patch In some of our tests we found that a region server was always shown as online in the master UI even though it was actually dead. If a region server goes down between the following two steps, it stays in the master's online servers list forever. 1) register to master 2) create ephemeral znode Since there is no notification from ZooKeeper, the master does not remove the expired server from the online servers list. Assignments will fail if the RS is selected as the destination server. In some cases ROOT or META also won't be assigned if the RS is randomly selected every time, and each attempt needs to wait for the timeout.
Here are the logs: 1) HOST-10-18-40-153 is registered to master {code} 2013-09-19 19:47:41,123 DEBUG org.apache.hadoop.hbase.master.ServerManager: STARTUP: Server HOST-10-18-40-153,61020,1379600260255 came back up, removed it from the dead servers list 2013-09-19 19:47:41,123 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=HOST-10-18-40-153,61020,1379600260255 {code} {code} 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at HOST-10-18-40-153/10.18.40.153:61000 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at HOST-10-18-40-153,61000,1379600055284 that we are up with port=61020, startcode=1379600260255 {code} 2) Terminated before creating ephemeral node. {code} Thu Sep 19 19:47:41 IST 2013 Terminating regionserver {code} 3) The RS can be selected for assignment and they will fail. {code} 2013-09-19 19:47:54,049 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to HOST-10-18-40-153,61020,1379600260255, trying to assign elsewhere instead; retry=0 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:390) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:436) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1127) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy15.openRegion(Unknown Source) at 
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:533) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1734) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1431) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1406) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1401) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:2374) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRoot(MetaServerShutdownHandler.java:136) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRootWithRetries(MetaServerShutdownHandler.java:160) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:82) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at
[jira] [Commented] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863592#comment-13863592 ] Enis Soztutar commented on HBASE-10285: --- Cody can you re-attach this as a patch. Currently it is .html. All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285 For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-9593) Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node
[ https://issues.apache.org/jira/browse/HBASE-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863593#comment-13863593 ] Andrew Purtell commented on HBASE-9593: --- A revert is just an application of this patch with -R. I commented here and on HBASE-10271 Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node Key: HBASE-9593 URL: https://issues.apache.org/jira/browse/HBASE-9593 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.11 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.96.1 Attachments: 9593-0.94.txt, HBASE-9593.patch, HBASE-9593_v2.patch, HBASE-9593_v3.patch In some of our tests we found that a region server was always shown as online in the master UI even though it was actually dead. If a region server goes down between the following two steps, it stays in the master's online servers list forever. 1) register to master 2) create ephemeral znode Since there is no notification from ZooKeeper, the master does not remove the expired server from the online servers list. Assignments will fail if the RS is selected as the destination server. In some cases ROOT or META also won't be assigned if the RS is randomly selected every time, and each attempt needs to wait for the timeout.
Here are the logs: 1) HOST-10-18-40-153 is registered to master {code} 2013-09-19 19:47:41,123 DEBUG org.apache.hadoop.hbase.master.ServerManager: STARTUP: Server HOST-10-18-40-153,61020,1379600260255 came back up, removed it from the dead servers list 2013-09-19 19:47:41,123 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=HOST-10-18-40-153,61020,1379600260255 {code} {code} 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at HOST-10-18-40-153/10.18.40.153:61000 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at HOST-10-18-40-153,61000,1379600055284 that we are up with port=61020, startcode=1379600260255 {code} 2) Terminated before creating ephemeral node. {code} Thu Sep 19 19:47:41 IST 2013 Terminating regionserver {code} 3) The RS can be selected for assignment and they will fail. {code} 2013-09-19 19:47:54,049 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to HOST-10-18-40-153,61020,1379600260255, trying to assign elsewhere instead; retry=0 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:390) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:436) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1127) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy15.openRegion(Unknown Source) at 
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:533) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1734) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1431) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1406) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1401) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:2374) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRoot(MetaServerShutdownHandler.java:136) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRootWithRetries(MetaServerShutdownHandler.java:160) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:82) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at
[jira] [Created] (HBASE-10286) Revert HBASE-9593, breaks wildcard addresses
Lars Hofhansl created HBASE-10286: - Summary: Revert HBASE-9593, breaks wildcard addresses Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.16 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10286) Revert HBASE-9593, breaks wildcard addresses
[ https://issues.apache.org/jira/browse/HBASE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10286: -- Description: See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses. Revert HBASE-9593, breaks wildcard addresses Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.16 See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863595#comment-13863595 ] Jimmy Xiang commented on HBASE-10078: - The fix went into 0.94 only. Also added some tests for other branches to cover FilterList, no other code change. Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.94.16 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at 
java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails 
without checking out for dynamic jars. I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf);
[jira] [Commented] (HBASE-10284) Build broken with svn 1.8
[ https://issues.apache.org/jira/browse/HBASE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863597#comment-13863597 ] Hudson commented on HBASE-10284: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #54 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/54/]) HBASE-10284 Build broken with svn 1.8 (larsh: rev 1555963) * /hbase/branches/0.98/hbase-common/src/saveVersion.sh Build broken with svn 1.8 - Key: HBASE-10284 URL: https://issues.apache.org/jira/browse/HBASE-10284 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0 Attachments: 10284.txt Just upgraded my machine and found that {{svn info}} displays a Relative URL: line in svn 1.8. saveVersion.sh does not deal with that correctly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
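To illustrate the kind of problem described above: svn 1.8 prints a `Relative URL:` line in addition to the `URL:` line, so any extraction that looks for the text `URL:` anywhere in a line can pick up both. The sketch below shows one way to take only the exact `URL:` line. It is an assumption-laden illustration in Java, not the actual fix to saveVersion.sh (which is a shell script); `SvnInfo` and `extractUrl` are hypothetical names.

```java
/** Hypothetical sketch; the real fix lives in saveVersion.sh, not in Java. */
final class SvnInfo {
    private static final String URL_PREFIX = "URL: ";

    private SvnInfo() {
    }

    /**
     * Returns the value of the line that starts with "URL: " in the given
     * `svn info` output, or null if there is none. A line such as
     * "Relative URL: ..." does not start with the prefix and is ignored.
     */
    static String extractUrl(String svnInfoOutput) {
        for (String line : svnInfoOutput.split("\\r?\\n")) {
            if (line.startsWith(URL_PREFIX)) {
                return line.substring(URL_PREFIX.length()).trim();
            }
        }
        return null;
    }
}
```

Anchoring the match at the start of the line is the whole point of the sketch: it keeps working whether or not the `Relative URL:` line is present.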
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863601#comment-13863601 ] Hudson commented on HBASE-10078: SUCCESS: Integrated in HBase-TRUNK #4794 (See [https://builds.apache.org/job/HBase-TRUNK/4794/]) HBASE-10078 Dynamic Filter - Not using DynamicClassLoader when using FilterList (jxiang: rev 1556024) * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestGet.java Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.94.16 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 
{noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load BFilter, it uses URLClassLoader and fails without checking for dynamic jars. I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) {
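The behaviour the reporter expected, retrying with a second class loader once the default one throws ClassNotFoundException, can be sketched roughly as follows. This is only an illustration of the general pattern, not the HBase code or the committed patch: FallbackClassResolver and resolve are hypothetical names, and the fallback loader here merely stands in for a loader that knows about dynamically added jars.

```java
/**
 * Hypothetical helper, NOT an HBase class: resolve a class by name through a
 * primary loader and retry with a fallback loader if the primary fails.
 */
final class FallbackClassResolver {
    private final ClassLoader primary;
    private final ClassLoader fallback;

    FallbackClassResolver(ClassLoader primary, ClassLoader fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    /** Every lookup goes through the same two-step path, nested objects included. */
    Class<?> resolve(String className) throws ClassNotFoundException {
        try {
            return primary.loadClass(className);
        } catch (ClassNotFoundException notInPrimary) {
            // Assumption: the fallback plays the role of the dynamic jar loader.
            return fallback.loadClass(className);
        }
    }
}
```

The point of the sketch is that every deserialization path has to call the same resolver; if one nested reader looks classes up through the default loader only, it fails the way the stack trace above does.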
[jira] [Updated] (HBASE-10286) Revert HBASE-9593, breaks RS wildcard addresses
[ https://issues.apache.org/jira/browse/HBASE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10286: -- Summary: Revert HBASE-9593, breaks RS wildcard addresses (was: Revert HBASE-9593, breaks wildcard addresses) Revert HBASE-9593, breaks RS wildcard addresses --- Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.16 See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HBASE-9593) Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node
[ https://issues.apache.org/jira/browse/HBASE-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863593#comment-13863593 ] Andrew Purtell edited comment on HBASE-9593 at 1/6/14 11:11 PM: A revert is just an application of this patch with -R. I commented here and on HBASE-10271 Edit: ... for 0.98. For released versions, RMs could create a new JIRA. I don't think that's necessary, a SVN commit starting with Revert HBASE-9593... would work (IMO). was (Author: apurtell): A revert is just an application of this patch with -R. I commented here and on HBASE-10271 Region server left in online servers list forever if it went down after registering to master and before creating ephemeral node Key: HBASE-9593 URL: https://issues.apache.org/jira/browse/HBASE-9593 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.11 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.96.1 Attachments: 9593-0.94.txt, HBASE-9593.patch, HBASE-9593_v2.patch, HBASE-9593_v3.patch In some of our tests we found that a regionserver is always shown as online in the master UI even though it is actually dead. If a region server goes down between the following two steps, it stays in the master's online servers list forever. 1) register to master 2) create ephemeral znode Since there is no notification from ZooKeeper, the master does not remove the expired server from the online servers list. Assignments will fail if the RS is selected as the destination server. In some cases ROOT or META also won't be assigned if the RS is randomly selected; each time we need to wait for the timeout. 
Here are the logs: 1) HOST-10-18-40-153 is registered to master {code} 2013-09-19 19:47:41,123 DEBUG org.apache.hadoop.hbase.master.ServerManager: STARTUP: Server HOST-10-18-40-153,61020,1379600260255 came back up, removed it from the dead servers list 2013-09-19 19:47:41,123 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=HOST-10-18-40-153,61020,1379600260255 {code} {code} 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at HOST-10-18-40-153/10.18.40.153:61000 2013-09-19 19:47:41,119 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at HOST-10-18-40-153,61000,1379600055284 that we are up with port=61020, startcode=1379600260255 {code} 2) Terminated before creating ephemeral node. {code} Thu Sep 19 19:47:41 IST 2013 Terminating regionserver {code} 3) The RS can be selected for assignment and they will fail. {code} 2013-09-19 19:47:54,049 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to HOST-10-18-40-153,61020,1379600260255, trying to assign elsewhere instead; retry=0 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:390) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:436) at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1127) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86) at $Proxy15.openRegion(Unknown Source) at 
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:533) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1734) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1431) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1406) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1401) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:2374) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRoot(MetaServerShutdownHandler.java:136) at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.verifyAndAssignRootWithRetries(MetaServerShutdownHandler.java:160) at
[jira] [Updated] (HBASE-10286) Revert HBASE-9593, breaks RS wildcard addresses
[ https://issues.apache.org/jira/browse/HBASE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10286: -- Committed to 0.94. [~apurtell], [~stack] Revert HBASE-9593, breaks RS wildcard addresses --- Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.16 Attachments: 10286-0.94.txt See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses.
[jira] [Resolved] (HBASE-10286) Revert HBASE-9593, breaks RS wildcard addresses
[ https://issues.apache.org/jira/browse/HBASE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-10286. --- Resolution: Fixed Assignee: Lars Hofhansl Revert HBASE-9593, breaks RS wildcard addresses --- Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.16 Attachments: 10286-0.94.txt See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses.
[jira] [Updated] (HBASE-10286) Revert HBASE-9593, breaks RS wildcard addresses
[ https://issues.apache.org/jira/browse/HBASE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10286: -- Attachment: 10286-0.94.txt Patch for 0.94, reverts part of HBASE-9842. Revert HBASE-9593, breaks RS wildcard addresses --- Key: HBASE-10286 URL: https://issues.apache.org/jira/browse/HBASE-10286 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.16 Attachments: 10286-0.94.txt See discussion on HBASE-10271. This breaks regionserver wildcard bind addresses.
[jira] [Commented] (HBASE-8889) TestIOFencing#testFencingAroundCompaction occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863612#comment-13863612 ] Ted Yu commented on HBASE-8889: --- From the log: {code} 2013-12-28 03:13:53,404 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(103): allowing compactions ... 2013-12-28 03:13:53,413 DEBUG [RS:0;asf002:54266-shortCompactions-1388200422935] hdfs.DFSInputStream(1095): Error making BlockReader. Closing stale NioInetPeer(Socket[addr=/127.0.0.1,port=57329,localport=55235]) java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492) at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:392) at org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:131) at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1088) at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533) at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793) at java.io.DataInputStream.read(DataInputStream.java:132) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192) at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1210) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1483) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1314) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:355) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:765) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:245) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:153) at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:319) at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:242) at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:202) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:257) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:109) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1074) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1378) at org.apache.hadoop.hbase.TestIOFencing$CompactionBlockerRegion.compact(TestIOFencing.java:118) {code} There was an EOFException while seeking the scanner. {code} public boolean compact(CompactionContext compaction, Store store) throws IOException { try { return super.compact(compaction, store); } finally { compactCount++; } } {code} However, compactCount is incremented in the finally block even when compact() throws, leading to premature exit from the following loop: {code} while (compactingRegion.compactCount == 0) { Thread.sleep(1000); } {code} TestIOFencing#testFencingAroundCompaction occasionally fails Key: HBASE-8889 URL: https://issues.apache.org/jira/browse/HBASE-8889 Project: HBase Issue Type: Test Reporter: Ted Yu Priority: Minor Attachments: TestIOFencing.tar.gz From https://builds.apache.org/job/PreCommit-HBASE-Build/6232//testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompaction/ : {code} java.lang.AssertionError: Timed out waiting for new server to open region at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hbase.TestIOFencing.doTest(TestIOFencing.java:269) at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompaction(TestIOFencing.java:205) {code} {code} 2013-07-06 23:13:53,120 INFO 
[pool-1-thread-1] hbase.TestIOFencing(266): Waiting for the new server to pick up the region tabletest,,1373152125442.6e62d3b24ea23160931362b60359ff03. 2013-07-06 23:13:54,120 INFO [pool-1-thread-1] hbase.TestIOFencing(266): Waiting for the new server to pick up the region tabletest,,1373152125442.6e62d3b24ea23160931362b60359ff03. 2013-07-06 23:13:55,121 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(102): allowing compactions 2013-07-06 23:13:55,121 INFO [pool-1-thread-1] hbase.HBaseTestingUtility(911): Shutting down minicluster 2013-07-06 23:13:55,121 DEBUG [pool-1-thread-1]
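The effect Ted Yu describes in this thread, a counter bumped in a finally block also counting failed compactions, can be reproduced in isolation. The sketch below is a stand-in and not the real TestIOFencing code: CompactionCounterSketch, attempts, successes and the simulated EOFException are all assumptions made for illustration, and counting successes separately is just one possible way to tell the two cases apart, not necessarily what the project did.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical stand-in for the test region; NOT the real TestIOFencing class. */
final class CompactionCounterSketch {
    int attempts;   // incremented in finally, so failed compactions count too
    int successes;  // incremented only after compaction returns normally

    boolean compact(boolean failWithEof) throws IOException {
        try {
            boolean done = doCompact(failWithEof);
            successes++;
            return done;
        } finally {
            attempts++; // runs on both the normal and the exceptional path
        }
    }

    /** Simulates the underlying compaction, optionally failing like the log above. */
    private boolean doCompact(boolean failWithEof) throws IOException {
        if (failWithEof) {
            throw new EOFException("Premature EOF (simulated)");
        }
        return true;
    }
}
```

A loop that waits on attempts would exit after the failed call, while a loop that waits on successes would keep waiting until a compaction actually completes.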
[jira] [Updated] (HBASE-10285) All for configurable policies in ChaosMonkey
[ https://issues.apache.org/jira/browse/HBASE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-10285: -- Attachment: HBASE-10285.txt Reattached as .txt, +1 from me. All for configurable policies in ChaosMonkey Key: HBASE-10285 URL: https://issues.apache.org/jira/browse/HBASE-10285 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.16 Reporter: Cody Marcel Assignee: Cody Marcel Priority: Minor Fix For: 0.94.16 Attachments: HBASE-10285, HBASE-10285.txt For command line runs of ChaosMonkey, we should be able to pass policies. They are currently hard coded to EVERY_MINUTE_RANDOM_ACTION_POLICY. I have made this policy the default, but now if you supply a policy as an option on the command line, it will work.
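The default-unless-overridden behaviour described for HBASE-10285 could look something like the sketch below. The attached patch is not shown in this thread, so everything here is an assumption: the "-policy" flag name, the PolicyOptionSketch class and the selectPolicy method are hypothetical, and only the default name EVERY_MINUTE_RANDOM_ACTION_POLICY comes from the description.

```java
import java.util.Set;

/** Hypothetical option handling; NOT the actual ChaosMonkey command line code. */
final class PolicyOptionSketch {
    // The only policy name taken from the issue description.
    static final String DEFAULT_POLICY = "EVERY_MINUTE_RANDOM_ACTION_POLICY";

    /**
     * Returns the policy that follows the (assumed) "-policy" flag,
     * or the default when the flag is absent.
     */
    static String selectPolicy(String[] args, Set<String> knownPolicies) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("-policy".equals(args[i])) {
                String requested = args[i + 1];
                if (!knownPolicies.contains(requested)) {
                    throw new IllegalArgumentException("Unknown policy: " + requested);
                }
                return requested;
            }
        }
        return DEFAULT_POLICY;
    }
}
```

Rejecting unknown names up front, rather than silently falling back to the default, is a design choice of this sketch so that a typo on the command line does not go unnoticed.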
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863615#comment-13863615 ] Sergey Shelukhin commented on HBASE-10241: -- There's another issue, HBASE-10227 for the WAL stuff. Mvcc can already be serialized with KV in HFile. Comment in KeyValue.java is a lie :) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed.
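To make the readpoint idea in this description concrete, here is a toy model of the visibility rule a scanner could apply if it carried its readpoint along. It is a sketch under stated assumptions, not HBase code: ReadPointSketch, Cell and visibleAt are hypothetical names, and the rule "a cell is visible when its mvcc number is at or below the readpoint" is the usual MVCC convention assumed here rather than something quoted from the design document.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of readpoint-based visibility; NOT the HBase scanner implementation. */
final class ReadPointSketch {
    /** Hypothetical cell: a value plus the mvcc number it was written with. */
    static final class Cell {
        final String value;
        final long mvcc;

        Cell(String value, long mvcc) {
            this.value = value;
            this.mvcc = mvcc;
        }
    }

    /** Returns the values a scanner holding the given readpoint is allowed to see. */
    static List<String> visibleAt(List<Cell> cells, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.mvcc <= readPoint) {
                visible.add(cell.value);
            }
        }
        return visible;
    }
}
```

If the client stores the readpoint and every cell keeps its mvcc number in the WAL and store files, a different server can apply the same filter after a restart or region move, which is the consistency the three steps above aim for.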
[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863618#comment-13863618 ] Hudson commented on HBASE-10078: ABORTED: Integrated in HBase-0.94-JDK7 #20 (See [https://builds.apache.org/job/HBase-0.94-JDK7/20/]) HBASE-10078 Dynamic Filter - Not using DynamicClassLoader when using FilterList (jxiang: rev 1556027) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/Classes.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestGet.java Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Assignee: Jimmy Xiang Priority: Minor Fix For: 0.94.16 Attachments: 0.94-10078.patch, 0.94-10078_v2.patch, hbase-10078.patch I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 
{noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at 
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load BFilter, it uses URLClassLoader and fails without checking for dynamic jars. I think the issue is related to