[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
rajeshbabu updated HBASE-8667:
Attachment: HBASE-8667_trunk_v6.patch
[~stack] bq. Seems like a hadoop1 incompatibility?
Sorry about this. I had built the tarball with the default hadoop profile (1.1.2), so I didn't observe it. The present patch binds the address directly to the client socket (no change in NetUtils.connect), so there won't be a compatibility issue. I have built with 1.0.4 as well, and it's working fine. Thanks, Stack.
Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. My machine has two interfaces, 1) lo and 2) eth0, and the default hostname interface is lo. I configured the master ipc address to the IP of the eth0 interface and started the master and regionserver on the same machine.
1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo.
2) Since the rpc client is not binding to any IP address, when the RS reports its startup it gets registered with the eth0 IP address (but it should actually register as localhost).
Here are RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691721#comment-13691721 ]
rajeshbabu commented on HBASE-8667:
Thanks [~viralbajaria] for testing the patch.
[~anoop.hbase] bq. Seems NetUtils#connect(Socket socket, SocketAddress endpoint, SocketAddress localAddr, int timeout) not available with hadoop1.
Yes, Anoop. It is not present in hadoop 1.0.4; the latest patch avoids it.
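The comments on HBASE-8667 describe binding the address directly to the client socket instead of relying on the four-argument NetUtils#connect, which hadoop 1.0.4 lacks. A minimal sketch of that idea using only java.net follows. It is not taken from the patch: the class name BoundClientSocket and the method connectFrom are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of HBase or Hadoop.
final class BoundClientSocket {
    private BoundClientSocket() {}

    /**
     * Opens a client socket whose source address is pinned to localAddr
     * before connecting, so the remote side sees that address as the peer.
     * Port 0 lets the operating system choose an ephemeral local port.
     */
    static Socket connectFrom(InetAddress localAddr, InetSocketAddress remote, int timeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            socket.bind(new InetSocketAddress(localAddr, 0));
            socket.connect(remote, timeoutMs);
            return socket;
        } catch (IOException e) {
            // Do not leak the socket if bind or connect fails.
            socket.close();
            throw e;
        }
    }
}
```

With the source address pinned this way, a server that registers callers by the remote address of the incoming connection would see the address the client chose rather than whichever interface the routing table picked.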
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691740#comment-13691740 ]
Hadoop QA commented on HBASE-8783:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589372/HBASE-8783-v1.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6114//console This message is automatically generated. RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. 
{code}
2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager
...
2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}
The Region Server name is used as the memberName, but since the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure will use the wrong name (0.0.0.0). This will cause the snapshot to fail with a TimeoutException, since the master will not receive the expected RS.
{code}
Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915
RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk
...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:6, max:6 ms
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
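The ordering problem described in HBASE-8783 can be illustrated with a small, self-contained sketch. It does not reproduce the attached patch, and every class and method name here (MemberNameOrdering, Server, EagerMember, LazyMember, onMasterHostname) is hypothetical. It only shows why a name captured before the master reports the hostname stays stale, and one possible way to avoid that by resolving the name on use.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; none of these types exist in HBase.
final class MemberNameOrdering {
    private MemberNameOrdering() {}

    /** Stand-in for a region server whose hostname is corrected by the master after startup. */
    static final class Server {
        private volatile String hostname = "0.0.0.0";
        private final int port;
        private final long startcode;

        Server(int port, long startcode) {
            this.port = port;
            this.startcode = startcode;
        }

        /** Called once the master has told the server which hostname to use. */
        void onMasterHostname(String newHostname) {
            this.hostname = newHostname;
        }

        String serverName() {
            return hostname + "," + port + "," + startcode;
        }
    }

    /** Captures the name at construction time, so a later hostname change is missed. */
    static final class EagerMember {
        private final String memberName;

        EagerMember(Server server) {
            this.memberName = server.serverName();
        }

        String memberName() {
            return memberName;
        }
    }

    /** Resolves the name each time it is needed, so it reflects the current hostname. */
    static final class LazyMember {
        private final Supplier<String> nameSource;

        LazyMember(Supplier<String> nameSource) {
            this.nameSource = nameSource;
        }

        String memberName() {
            return nameSource.get();
        }
    }
}
```

Constructing the member only after the hostname is known would be an equally valid alternative to the lazy lookup; the sketch does not claim which approach the committed fix takes.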
[jira] [Updated] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matteo Bertozzi updated HBASE-8783:
Attachment: HBASE-8783-0.94-v1.patch
v1 fixes the javadoc warning
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691760#comment-13691760 ]
Hadoop QA commented on HBASE-8783:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589379/HBASE-8783-0.94-v1.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6116//console
This message is automatically generated.
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691772#comment-13691772 ]
Lars Hofhansl commented on HBASE-8783:
+1. Can we commit today, so that I can roll an RC?
[jira] [Updated] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HBASE-5083:
Attachment: HBASE-5083_trunk.patch
And again
Backup HMaster should have http infoport open with link to the active master
Key: HBASE-5083
URL: https://issues.apache.org/jira/browse/HBASE-5083
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Cody Marcel
Fix For: 0.94.9
Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png
Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master.
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691777#comment-13691777 ]
Hadoop QA commented on HBASE-8667:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589373/HBASE-8667_trunk_v6.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//console This message is automatically generated. Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine. 
[jira] [Updated] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matteo Bertozzi updated HBASE-8783:
Resolution: Fixed
Fix Version/s: 0.98.0
Status: Resolved (was: Patch Available)
Committed to 0.94, 0.95 and trunk. Thanks for the reviews.
[jira] [Updated] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HBASE-8776:
Fix Version/s: (was: 0.94.9) 0.94.10
Pushing to 0.94.10, since we (or at least I) are still discussing.
port HBASE-8723 to 0.94
Key: HBASE-8776
URL: https://issues.apache.org/jira/browse/HBASE-8776
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.94.10
Attachments: HBASE-8776-v0.patch
[jira] [Updated] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HBASE-8667:
Fix Version/s: (was: 0.94.9) 0.94.10
Pushing to 0.94.10
Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.10
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly. My machine has two interfaces, 1) lo and 2) eth0, and the default hostname interface is lo. I configured the master ipc address to the IP of the eth0 interface and started the master and regionserver on the same machine.
1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo.
2) Since the rpc client is not binding to any IP address, when the RS reports its startup it gets registered with the eth0 IP address (but it should actually register as localhost).
Here are RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826) at
[jira] [Commented] (HBASE-8656) Rpc call may not be notified in SecureClient
[ https://issues.apache.org/jira/browse/HBASE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691785#comment-13691785 ] Lars Hofhansl commented on HBASE-8656: -- [~apurtell] Did you get a chance to test this? Patch looks good, in line with the non-secure client. If not, I'll just push to 0.94.10. Rpc call may not be notified in SecureClient Key: HBASE-8656 URL: https://issues.apache.org/jira/browse/HBASE-8656 Project: HBase Issue Type: Bug Components: Client, IPC/RPC, security Affects Versions: 0.94.7 Reporter: cuijianwei Assignee: cuijianwei Fix For: 0.94.9 Attachments: HBASE-8656-0.94-v1.txt In SecureClient.java, rpc responses are processed by receiveResponse(), which looks like:
{code}
try {
  int id = in.readInt(); // try to read an id
  if (LOG.isDebugEnabled())
    LOG.debug(getName() + " got value #" + id);
  Call call = calls.remove(id);
  int state = in.readInt(); // read call status
  if (LOG.isDebugEnabled()) {
    LOG.debug("call #" + id + " state is " + state);
  }
  if (state == Status.SUCCESS.state) {
    Writable value = ReflectionUtils.newInstance(valueClass, conf);
    value.readFields(in); // read value
    if (LOG.isDebugEnabled()) {
      LOG.debug("call #" + id + ", response is:\n" + value.toString());
    }
    // it's possible that this call may have been cleaned up due to a RPC
    // timeout, so check if it still exists before setting the value.
    if (call != null) {
      call.setValue(value);
    }
  } else if (state == Status.ERROR.state) {
    if (call != null) {
      call.setException(new RemoteException(WritableUtils.readString(in),
          WritableUtils.readString(in)));
    }
  } else if (state == Status.FATAL.state) {
    // Close the connection
    markClosed(new RemoteException(WritableUtils.readString(in),
        WritableUtils.readString(in)));
  }
} catch (IOException e) {
  if (e instanceof SocketTimeoutException && remoteId.rpcTimeout > 0) {
    // Clean up open calls but don't treat this as a fatal condition,
    // since we expect certain responses to not make it by the specified
    // {@link ConnectionId#rpcTimeout}.
    closeException = e;
  } else {
    // Since the server did not respond within the default ping interval
    // time, treat this as a fatal condition and close this connection
    markClosed(e);
  }
} finally {
  if (remoteId.rpcTimeout > 0) {
    cleanupCalls(remoteId.rpcTimeout);
  }
}
}
{code}
In the code above, the try block first removes the call from the call map:
{code}
Call call = calls.remove(id);
{code}
There are two cases in which the call is never notified and the invoking thread waits forever. First, if the returned status, read by:
{code}
int state = in.readInt(); // read call status
{code}
is Status.FATAL.state, the code reaches:
{code}
} else if (state == Status.FATAL.state) {
  // Close the connection
  markClosed(new RemoteException(WritableUtils.readString(in),
      WritableUtils.readString(in)));
}
{code}
Here the SecureConnection is marked as closed and every rpc call in this connection's call map is notified with an exception. However, the current rpc call has already been removed from the call map, so it is not notified. Second, after the call has been removed by:
{code}
Call call = calls.remove(id);
{code}
if any exception is thrown before the try block finishes, control passes to the catch and finally blocks, and neither of them notifies the rpc call because it is no longer in the call map. Following receiveResponse() in HBaseClient.java, it may be better to get the rpc call from the call map and remove it only at the end of the try block. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
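The get-then-remove ordering suggested at the end of the HBASE-8656 description can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: CallMapSketch, Call, failAllPending() and the integer status codes are hypothetical stand-ins for SecureClient's connection, its Call class, markClosed() and Status.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a connection's call bookkeeping; not the real SecureClient. */
class CallMapSketch {
    /** A pending RPC call that a caller thread would wait on. */
    static final class Call {
        private boolean done;
        private Object value;
        private IOException error;

        synchronized void setValue(Object v) { value = v; done = true; notifyAll(); }
        synchronized void setException(IOException e) { error = e; done = true; notifyAll(); }
        synchronized boolean isDone() { return done; }
        synchronized Object getValue() { return value; }
        synchronized IOException getError() { return error; }
    }

    // Illustrative status codes (assumption; the real client uses a Status enum).
    static final int SUCCESS = 0, ERROR = 1, FATAL = 2;

    final Map<Integer, Call> calls = new ConcurrentHashMap<>();

    /** Fails every call still registered, as closing the connection would. */
    void failAllPending(IOException cause) {
        for (Call c : calls.values()) {
            c.setException(cause);
        }
        calls.clear();
    }

    /** Looks the call up first and removes it only after it has been notified. */
    void receiveResponse(int id, int state, Object payload) {
        try {
            Call call = calls.get(id); // get, not remove
            if (state == SUCCESS) {
                if (call != null) call.setValue(payload);
                calls.remove(id);
            } else if (state == ERROR) {
                if (call != null) call.setException(new IOException(String.valueOf(payload)));
                calls.remove(id);
            } else if (state == FATAL) {
                // The current call is still in the map, so it is failed with the rest.
                failAllPending(new IOException("fatal: " + payload));
            }
        } catch (RuntimeException e) {
            // Any unexpected failure also reaches the current call.
            failAllPending(new IOException(e));
        }
    }
}
```

Because the call stays registered until it has been completed, both the FATAL branch and the exception path reach it through the same bulk notification that covers the other pending calls.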
[jira] [Updated] (HBASE-8790) NullPointerException throwed when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiong LIU updated HBASE-8790: - Description: The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10(possibly different value on your machine) in hbase-site.xml, the regionserver is continuously attempting connect to master. was: The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. 
The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. NullPointerException throwed when stopping regionserver --- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. 
The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10(possibly different value on your machine) in hbase-site.xml, the regionserver is continuously attempting connect to master. -- This message is
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691800#comment-13691800 ] Hudson commented on HBASE-8783: --- Integrated in HBase-0.94-security #177 (See [https://builds.apache.org/job/HBase-0.94-security/177/]) HBASE-8783 RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name (Revision 1495945) Result = SUCCESS mbertozzi : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureMemberRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureCoordinatorRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/snapshot/RegionServerSnapshotManager.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedureControllers.java RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-0.94-v1.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. {code} 2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ... 2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. 
Was=0.0.0.0, Now=srv-5.test.cloudera.com {code} The region server name is used as the memberName, but because the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure uses the wrong name (0.0.0.0). This causes the snapshot to fail with a TimeoutException, since the master never sees the expected RS. {code} Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915 RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk ... org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:6, max:6 ms {code}
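The ordering problem described above can be avoided by handing the member name over only once the master-assigned server name is known. The sketch below shows that idea under stated assumptions: MemberNameSketch, MemberRpcs, start() and acquiredNode() are hypothetical names for illustration and are not taken from the committed patch.

```java
/** Hypothetical illustration of deferring the member name; not the HBase classes. */
class MemberNameSketch {
    /** Member-side helper that receives its name at start(), not at construction. */
    static final class MemberRpcs {
        private String memberName;

        /** Called after the server has learned the name the master uses for it. */
        void start(String memberName) {
            this.memberName = memberName;
        }

        /** Builds the znode path this member would join for a procedure. */
        String acquiredNode(String acquiredBase, String procedure) {
            if (memberName == null) {
                throw new IllegalStateException("start() has not been called");
            }
            return acquiredBase + "/" + procedure + "/" + memberName;
        }
    }
}
```

With this shape, an early construction of the helper cannot capture a placeholder such as 0.0.0.0, because no name exists until start() is called with the final server name.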
[jira] [Updated] (HBASE-8790) NullPointerException throwed when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiong LIU updated HBASE-8790: - Description: The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10(possibly other values) in hbase-site.xml, the regionserver is continuously attempting connect to master. was: The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. 
The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10(possibly different value on your machine) in hbase-site.xml, the regionserver is continuously attempting connect to master. NullPointerException throwed when stopping regionserver --- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. 
The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting
[jira] [Updated] (HBASE-8790) NullPointerException throwed when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8790: - Attachment: HBase-8790.txt Attached is a trivial fix. rssStub can be null when we hit a ServiceException in tryRegionServerReport and then re-fetch it from createRegionServerStatusStub(). Per its Javadoc ("@return master + port, or null if server has been stopped"), rssStub == null can only happen while the current server is stopped, so a simple fix should be just fine. NullPointerException throwed when stopping regionserver --- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU Attachments: HBase-8790.txt The HBase cluster is freshly started with one regionserver. When we stop HBase, an unhandled NullPointerException is thrown in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. Update: we found that when hbase.client.ipc.pool.type is set to RoundRobinPool and hbase.client.ipc.pool.size is set to 10 (possibly other values) in hbase-site.xml, the regionserver continuously attempts to connect to the master.
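The kind of guard Liang Xie's comment describes could look roughly like the sketch below. It assumes nothing about the contents of HBase-8790.txt: ReportSketch, StatusStub and the Supplier-based factory are hypothetical stand-ins for HRegionServer, its status stub and createRegionServerStatusStub().

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the reporting loop; the real HRegionServer differs. */
class ReportSketch {
    /** Minimal stub interface (assumption for illustration). */
    interface StatusStub {
        void report(String load) throws Exception;
    }

    private final Supplier<StatusStub> stubFactory; // may return null once the server is stopped
    private volatile StatusStub rssStub;

    ReportSketch(Supplier<StatusStub> stubFactory) {
        this.stubFactory = stubFactory;
        this.rssStub = stubFactory.get();
    }

    /** Returns true if a report was sent, false if it was skipped or failed. */
    boolean tryReport(String load) {
        StatusStub stub = rssStub; // read the field once
        if (stub == null) {
            // The factory returned null, which here means the server is stopping.
            return false;
        }
        try {
            stub.report(load);
            return true;
        } catch (Exception e) {
            // Re-create the stub; this may yield null during shutdown.
            rssStub = stubFactory.get();
            return false;
        }
    }
}
```

Reading the field into a local before the null check keeps the check and the call on the same reference, even if another thread replaces the stub in between.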
[jira] [Updated] (HBASE-8790) NullPointerException throwed when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8790: - Assignee: Liang Xie Status: Patch Available (was: Open) NullPointerException throwed when stopping regionserver --- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU Assignee: Liang Xie Attachments: HBase-8790.txt The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10(possibly other values) in hbase-site.xml, the regionserver is continuously attempting connect to master. 
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691808#comment-13691808 ] Hudson commented on HBASE-8783: --- Integrated in HBase-0.94 #1022 (See [https://builds.apache.org/job/HBase-0.94/1022/]) HBASE-8783 RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name (Revision 1495945) Result = SUCCESS mbertozzi : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureMemberRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureCoordinatorRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/snapshot/RegionServerSnapshotManager.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedureControllers.java RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-0.94-v1.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. {code} 2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ... 2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. 
Was=0.0.0.0, Now=srv-5.test.cloudera.com {code} The region server name is used as the memberName, but because the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure uses the wrong name (0.0.0.0). This causes the snapshot to fail with a TimeoutException, since the master never sees the expected RS. {code} Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915 RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk ... org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:6, max:6 ms {code}
[jira] [Updated] (HBASE-8790) NullPointerException thrown when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8790: -- Summary: NullPointerException thrown when stopping regionserver (was: NullPointerException throwed when stopping regionserver) NullPointerException thrown when stopping regionserver -- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU Assignee: Liang Xie Attachments: HBase-8790.txt The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. 
Update: we found that when hbase.client.ipc.pool.type is set to RoundRobinPool and hbase.client.ipc.pool.size is set to 10 (possibly other values) in hbase-site.xml, the regionserver continuously attempts to connect to the master.
[jira] [Commented] (HBASE-8790) NullPointerException thrown when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691828#comment-13691828 ] Ted Yu commented on HBASE-8790: --- Looks good to me. NullPointerException thrown when stopping regionserver -- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU Assignee: Liang Xie Attachments: HBase-8790.txt The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. 
Update: we found that when hbase.client.ipc.pool.type is set to RoundRobinPool and hbase.client.ipc.pool.size is set to 10 (possibly other values) in hbase-site.xml, the regionserver continuously attempts to connect to the master.
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691836#comment-13691836 ] Hadoop QA commented on HBASE-5083: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589380/HBASE-5083_trunk.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6117//console This message is automatically generated. 
Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691837#comment-13691837 ] Hadoop QA commented on HBASE-5083: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589380/HBASE-5083_trunk.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6118//console This message is automatically generated. 
Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691838#comment-13691838 ]

Lars Hofhansl commented on HBASE-5083:
--

I triggered HadoopQA manually. The test is unrelated. I'll break the 100 character line upon commit.

Backup HMaster should have http infoport open with link to the active master

Key: HBASE-5083
URL: https://issues.apache.org/jira/browse/HBASE-5083
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Cody Marcel
Fix For: 0.94.9
Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png

Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master.
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691839#comment-13691839 ]

Lars Hofhansl commented on HBASE-5083:
--

The long lines are in the template. I won't break those, as it is not clear whether all browsers know how to deal with whitespace in URLs.

Backup HMaster should have http infoport open with link to the active master

Key: HBASE-5083
URL: https://issues.apache.org/jira/browse/HBASE-5083
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Cody Marcel
Fix For: 0.94.9
Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png

Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master.
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691841#comment-13691841 ] Lars Hofhansl commented on HBASE-5083: -- +1 Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691857#comment-13691857 ] Lars Hofhansl commented on HBASE-5083: -- Any objections to a commit today? Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8793) Hbase
Michael Czerwiński created HBASE-8793:
-

Summary: Hbase
Key: HBASE-8793
URL: https://issues.apache.org/jira/browse/HBASE-8793
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.94.6
Environment: Description: Ubuntu 12.04.2 LTS
Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0
Reporter: Michael Czerwiński
Priority: Minor

The hbase-regionserver startup script always returns 0 (there is an 'exit 0' at the end of the script). This is wrong behaviour and causes issues when trying to recognise the true status of the service.
Replacing it with 'exit $?' seems to fix the problem. Looking at hbase master, return codes are assigned to a RETVAL variable, which is then used with exit.
Not sure if the problem exists in other versions.

/etc/init.d/hbase-regionserver.orig status
hbase-regionserver is not running.
echo $?
0

After fix:

/etc/init.d/hbase-regionserver status
hbase-regionserver is not running.
echo $?
1
[jira] [Updated] (HBASE-8790) NullPointerException thrown when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiong LIU updated HBASE-8790:
-

Description:
The HBase cluster is a fresh start with one regionserver. When we stop HBase, an unhandled NullPointerException is thrown in the regionserver. The regionserver's log is as follows:

2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions
2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192
2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832)
    at java.lang.Thread.run(Thread.java:662)
2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null
2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020

It seems that after closing user regions, the rssStub is null.

Update: we found that setting hbase.client.ipc.pool.type to RoundRobinPool (or another pool type) and hbase.client.ipc.pool.size to 10 (possibly other values) in hbase-site.xml causes the regionserver to continuously attempt to connect to the master. If we then stop HBase, the above NullPointerException occurs. With hbase.client.ipc.pool.size set to 1, the cluster can be stopped completely.

was:
The HBase cluster is a fresh start with one regionserver. When we stop HBase, an unhandled NullPointerException is thrown in the regionserver.
The regionserver's log is as follows:

2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions
2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192
2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832)
    at java.lang.Thread.run(Thread.java:662)
2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null
2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020

It seems that after closing user regions, the rssStub is null.

Update: we found that setting hbase.client.ipc.pool.type to RoundRobinPool and hbase.client.ipc.pool.size to 10 (possibly other values) in hbase-site.xml causes the regionserver to continuously attempt to connect to the master.

NullPointerException thrown when stopping regionserver
--

Key: HBASE-8790
URL: https://issues.apache.org/jira/browse/HBASE-8790
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.95.1
Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3
Reporter: Xiong LIU
Assignee: Liang Xie
Attachments: HBase-8790.txt

The HBase cluster is a fresh start with one regionserver. When we stop HBase, an unhandled NullPointerException is thrown in the regionserver.
The regionserver's log is as follows:

2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions
2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192
2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832)
    at java.lang.Thread.run(Thread.java:662)
2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED:
[jira] [Updated] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Czerwiński updated HBASE-8793:
--

Summary: Regionserver ubuntu's startup script return code always 0 (was: Hbase )

Regionserver ubuntu's startup script return code always 0
-

Key: HBASE-8793
URL: https://issues.apache.org/jira/browse/HBASE-8793
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.94.6
Environment: Description: Ubuntu 12.04.2 LTS
Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0
Reporter: Michael Czerwiński
Priority: Minor

The hbase-regionserver startup script always returns 0 (there is an 'exit 0' at the end of the script). This is wrong behaviour and causes issues when trying to recognise the true status of the service.
Replacing it with 'exit $?' seems to fix the problem. Looking at hbase master, return codes are assigned to a RETVAL variable, which is then used with exit.
Not sure if the problem exists in other versions.

/etc/init.d/hbase-regionserver.orig status
hbase-regionserver is not running.
echo $?
0

After fix:

/etc/init.d/hbase-regionserver status
hbase-regionserver is not running.
echo $?
1
[jira] [Commented] (HBASE-8790) NullPointerException thrown when stopping regionserver
[ https://issues.apache.org/jira/browse/HBASE-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691875#comment-13691875 ] Hadoop QA commented on HBASE-8790: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589381/HBase-8790.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6119//console This message is automatically generated. NullPointerException thrown when stopping regionserver -- Key: HBASE-8790 URL: https://issues.apache.org/jira/browse/HBASE-8790 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.1 Environment: CentOS 5.9 x86_64, java version 1.6.0_45, CDH4.3 Reporter: Xiong LIU Assignee: Liang Xie Attachments: HBase-8790.txt The Hbase cluster is a fresh start with one regionserver. When we stop hbase, an unhandled NullPointerException is throwed in the regionserver. 
The regionserver's log is as follows: 2013-06-21 10:21:11,284 INFO [regionserver61020] regionserver.HRegionServer: Closing user regions 2013-06-21 10:21:14,288 DEBUG [regionserver61020] regionserver.HRegionServer: Waiting on 1028785192 2013-06-21 10:21:14,290 FATAL [regionserver61020] regionserver.HRegionServer: ABORTING region server HOSTNAME_TEST,61020,1371781086817 : Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:988) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:832) at java.lang.Thread.run(Thread.java:662) 2013-06-21 10:21:14,292 FATAL [regionserver61020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache .hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2013-06-21 10:21:14,293 INFO [regionserver61020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-06-21 10:21:14,293 INFO [regionserver61020] ipc.RpcServer: Stopping server on 61020 It seems that after closing user regions, the rssStub is null. update: we found that if setting hbase.client.ipc.pool.type to
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691877#comment-13691877 ] Hudson commented on HBASE-8783: --- Integrated in hbase-0.95 #263 (See [https://builds.apache.org/job/hbase-0.95/263/]) HBASE-8783 RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name (Revision 1495947) Result = SUCCESS mbertozzi : Files : * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureMemberRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureCoordinatorRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/snapshot/RegionServerSnapshotManager.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedureControllers.java RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-0.94-v1.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. {code} 2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ... 
2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}

The Region Server name is used as the memberName, but since the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure will use the wrong name (0.0.0.0). This will cause the snapshot to fail with a TimeoutException, since the master never sees the RS join under the name it is watching for.

{code}
Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915
RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk
...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:60000, max:60000 ms
{code}
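The ordering problem described in HBASE-8783 can be sketched in a few lines. This is an illustration only, not the committed patch: ServerIdentity, EagerMember and LazyMember are hypothetical classes, and only the znode path layout (/hbase/online-snapshot/acquired/<procedure>/<member>) is taken from the log above. The sketch contrasts a member that copies the server name at construction time with one that looks the name up when it is needed.

```java
import java.util.function.Supplier;

/** Hypothetical holder for the name a server is known by; not an HBase class. */
final class ServerIdentity {
    private volatile String name;

    ServerIdentity(String initialName) {
        this.name = initialName;
    }

    /** Models "Master passed us hostname to use". */
    void rename(String newName) {
        this.name = newName;
    }

    String name() {
        return name;
    }
}

/** Copies the member name once, when the object is constructed. */
final class EagerMember {
    private final String memberName;

    EagerMember(ServerIdentity id) {
        this.memberName = id.name();
    }

    String acquiredNode(String procedure) {
        return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName;
    }
}

/** Resolves the member name each time a node path is built. */
final class LazyMember {
    private final Supplier<String> memberName;

    LazyMember(ServerIdentity id) {
        this.memberName = id::name;
    }

    String acquiredNode(String procedure) {
        return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName.get();
    }
}

class MemberNameDemo {
    public static void main(String[] args) {
        ServerIdentity id = new ServerIdentity("0.0.0.0,60020,1371814996779");
        EagerMember eager = new EagerMember(id);
        LazyMember lazy = new LazyMember(id);

        // The master later tells the server which hostname to use.
        id.rename("srv-5.test.cloudera.com,60020,1371814996779");

        System.out.println(eager.acquiredNode("foo23")); // still built from 0.0.0.0
        System.out.println(lazy.acquiredNode("foo23"));  // built from the new name
    }
}
```

Deferring the lookup is only one way to avoid the stale name; delaying the member's initialization until the final server name is known would have the same effect.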
[jira] [Updated] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-8781:
--

Attachment: HBASE-8781.patch

ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
--

Key: HBASE-8781
URL: https://issues.apache.org/jira/browse/HBASE-8781
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Minor
Fix For: 0.98.0
Attachments: HBASE-8781.patch

{code}
/**
 * Set the new ImmutableBytesWritable to the contents of the passed
 * <code>ibw</code>.
 * @param ibw the value to set this ImmutableBytesWritable to.
 */
public ImmutableBytesWritable(final ImmutableBytesWritable ibw) {
  this(ibw.get(), 0, ibw.getSize());
}
{code}

It should be this(ibw.get(), ibw.getOffset(), ibw.getSize());
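The effect of dropping the offset, as reported in HBASE-8781, can be reproduced without HBase on the classpath. The sketch below uses a hypothetical ByteSlice class as a stand-in for any (bytes, offset, length) wrapper; it is not ImmutableBytesWritable itself, and the method names copyIgnoringOffset and copyWithOffset are invented for the illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical (bytes, offset, length) view over a shared array; not the HBase class. */
final class ByteSlice {
    private final byte[] bytes;
    private final int offset;
    private final int length;

    ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    /** Mirrors the reported bug: the source offset is replaced by 0. */
    static ByteSlice copyIgnoringOffset(ByteSlice src) {
        return new ByteSlice(src.bytes, 0, src.length);
    }

    /** Mirrors the proposed correction: the source offset is carried over. */
    static ByteSlice copyWithOffset(ByteSlice src) {
        return new ByteSlice(src.bytes, src.offset, src.length);
    }

    String asString() {
        return new String(bytes, offset, length, StandardCharsets.US_ASCII);
    }
}

class OffsetCopyDemo {
    public static void main(String[] args) {
        byte[] backing = "rowkey:value".getBytes(StandardCharsets.US_ASCII);
        ByteSlice value = new ByteSlice(backing, 7, 5); // the "value" part

        System.out.println(ByteSlice.copyIgnoringOffset(value).asString()); // rowke
        System.out.println(ByteSlice.copyWithOffset(value).asString());     // value
    }
}
```

The bug stays invisible whenever the source wraps a whole array (offset 0), which is probably why it is easy to miss in tests that only use freshly allocated buffers.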
[jira] [Updated] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-8781:
--

Status: Patch Available (was: Open)

ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
--

Key: HBASE-8781
URL: https://issues.apache.org/jira/browse/HBASE-8781
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Minor
Fix For: 0.98.0
Attachments: HBASE-8781.patch

{code}
/**
 * Set the new ImmutableBytesWritable to the contents of the passed
 * <code>ibw</code>.
 * @param ibw the value to set this ImmutableBytesWritable to.
 */
public ImmutableBytesWritable(final ImmutableBytesWritable ibw) {
  this(ibw.get(), 0, ibw.getSize());
}
{code}

It should be this(ibw.get(), ibw.getOffset(), ibw.getSize());
[jira] [Updated] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-8781:
--

Status: Open (was: Patch Available)

ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
--

Key: HBASE-8781
URL: https://issues.apache.org/jira/browse/HBASE-8781
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Minor
Fix For: 0.98.0
Attachments: HBASE-8781.patch

{code}
/**
 * Set the new ImmutableBytesWritable to the contents of the passed
 * <code>ibw</code>.
 * @param ibw the value to set this ImmutableBytesWritable to.
 */
public ImmutableBytesWritable(final ImmutableBytesWritable ibw) {
  this(ibw.get(), 0, ibw.getSize());
}
{code}

It should be this(ibw.get(), ibw.getOffset(), ibw.getSize());
[jira] [Updated] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-8781:
--

Attachment: (was: HBASE-8781.patch)

ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
--

Key: HBASE-8781
URL: https://issues.apache.org/jira/browse/HBASE-8781
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Minor
Fix For: 0.98.0
Attachments: HBASE-8781.patch

{code}
/**
 * Set the new ImmutableBytesWritable to the contents of the passed
 * <code>ibw</code>.
 * @param ibw the value to set this ImmutableBytesWritable to.
 */
public ImmutableBytesWritable(final ImmutableBytesWritable ibw) {
  this(ibw.get(), 0, ibw.getSize());
}
{code}

It should be this(ibw.get(), ibw.getOffset(), ibw.getSize());
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691903#comment-13691903 ] Hudson commented on HBASE-8783: --- Integrated in hbase-0.95-on-hadoop2 #146 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/146/]) HBASE-8783 RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name (Revision 1495947) Result = FAILURE mbertozzi : Files : * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureMemberRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureCoordinatorRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/snapshot/RegionServerSnapshotManager.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedureControllers.java RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-0.94-v1.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. {code} 2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ... 
2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}
The Region Server Name is used as memberName, but since the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure will use the wrong name (0.0.0.0). This will cause the snapshot to fail with a TimeoutException, since the master will not receive the expected RS name.
{code}
Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915
RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk
...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:6, max:6 ms
{code}
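The ordering problem described in this issue can be sketched in isolation. The classes below are hypothetical and are not HBase code; they only contrast reading the server name once at construction time with looking it up when it is needed. The lazy lookup is one possible way to avoid the stale name and is an assumption, not necessarily what the attached patches do.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a region server whose name changes after the master replies.
class ServerSketch {
  private volatile String serverName = "0.0.0.0,60020,1";

  String getServerName() { return serverName; }

  void setServerNameFromMaster(String name) { serverName = name; }
}

// Hypothetical stand-in for the procedure member; it builds the znode path it would join.
class MemberSketch {
  private final Supplier<String> memberName;

  MemberSketch(Supplier<String> memberName) { this.memberName = memberName; }

  String acquiredPath(String procedure) {
    return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName.get();
  }
}

class InitOrderDemo {
  public static void main(String[] args) {
    ServerSketch rs = new ServerSketch();

    // Eager: the name is read once, before the master has replied.
    String early = rs.getServerName();
    MemberSketch eager = new MemberSketch(() -> early);

    // Lazy: the name is looked up each time it is needed.
    MemberSketch lazy = new MemberSketch(rs::getServerName);

    rs.setServerNameFromMaster("srv-5.test.cloudera.com,60020,1");

    System.out.println(eager.acquiredPath("foo23")); // still ends in 0.0.0.0,60020,1
    System.out.println(lazy.acquiredPath("foo23"));  // ends in srv-5.test.cloudera.com,60020,1
  }
}
```

In the eager case the member joins under a path that the coordinator is not watching, which matches the timeout shown in the logs above.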
[jira] [Commented] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691907#comment-13691907 ]

rajeshbabu commented on HBASE-8781:
---
+1

ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
--
Key: HBASE-8781
URL: https://issues.apache.org/jira/browse/HBASE-8781
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Minor
Fix For: 0.98.0
Attachments: HBASE-8781.patch

{code}
/**
 * Set the new ImmutableBytesWritable to the contents of the passed
 * <code>ibw</code>.
 * @param ibw the value to set this ImmutableBytesWritable to.
 */
public ImmutableBytesWritable(final ImmutableBytesWritable ibw) {
  this(ibw.get(), 0, ibw.getSize());
}
{code}

It should be this(ibw.get(), ibw.getOffset(), ibw.getSize());
[jira] [Updated] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] samar updated HBASE-4360: - Attachment: HBASE-4360_4.patch test fixed Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Attachments: HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, master-status1.png Just something that'd be generally helpful is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think it's too expensive to maintain (one additional update per dead determination).
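As a rough illustration of the bookkeeping the issue asks for, the sketch below records one timestamp per dead determination. DeadServerLog and its methods are hypothetical names chosen for this example; this is not the HBase DeadServer class and not the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: remembers when each server was last determined to be dead.
class DeadServerLog {
  // Insertion-ordered so a status page could list servers in the order they died.
  private final Map<String, Long> timeOfDeath = new LinkedHashMap<>();

  // One additional map update per dead determination.
  synchronized void markDead(String serverName, long nowMillis) {
    timeOfDeath.put(serverName, nowMillis);
  }

  // Returns the recorded time in epoch milliseconds, or null if the server was never marked dead.
  synchronized Long getTimeOfDeath(String serverName) {
    return timeOfDeath.get(serverName);
  }
}

class DeadServerLogDemo {
  public static void main(String[] args) {
    DeadServerLog log = new DeadServerLog();
    log.markDead("rs1.example.org,60020,1369960502544", System.currentTimeMillis());
    System.out.println(log.getTimeOfDeath("rs1.example.org,60020,1369960502544"));
    System.out.println(log.getTimeOfDeath("rs2.example.org,60020,1369960502544")); // prints "null"
  }
}
```

With the timestamp at hand, an operator can jump straight to the matching window in the master and region server logs.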
[jira] [Commented] (HBASE-8782) Thrift2 can not parse values when using framed transport
[ https://issues.apache.org/jira/browse/HBASE-8782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691911#comment-13691911 ] Lars George commented on HBASE-8782: Ah, yeah, that makes sense (somewhat). As your error shows, you get way too much info in the array. But I am still confused as to why getBytes() does work, as it iterates over the internal array too. Is it because Buffer.remaining() is returning the shorter, and therefore appropriate, size? And forget about my idea of caching; thinking about it again, since you do not have any info on the given ByteBuffer, caching is irrelevant. Bummer. bq. I added the below function to HtableInterface, so that I can just pass the ByteBuffer without using the .getBytes function . Do you think this solution is a good idea ? I think you are saying that you added this to ThriftHBaseServiceHandler.java, right? Not HTableInterface. So that you can call getTable() with the ByteBuffer and do the conversion in one place. You still need a few changes, e.g. in checkAndPut(), but it makes the change a little cleaner. If that is what you are saying, then that makes sense. Also, you are saying Thrift1 does the same on all ByteBuffers? If that is the case, then we can do the same here too, although while looking at it, I would hope we find a less costly way to use the ByteBuffer data, i.e. with one copy less. Thrift2 can not parse values when using framed transport Key: HBASE-8782 URL: https://issues.apache.org/jira/browse/HBASE-8782 Project: HBase Issue Type: Bug Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8782.patch ThriftHBaseServiceHandler.java uses .array() on table names and values (family, qualifier in checkAndDelete, etc.), which resulted in incorrect values with framed transport. Replacing .array() with getBytes() fixed this problem.
I've attached the patch. EDIT: updated the patch to cover checkAndPut() and checkAndDelete().
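The difference between .array() and reading only the remaining bytes can be reproduced with plain java.nio, without Thrift. In the sketch below, readableBytes is a hypothetical helper written for this example (it is not the getBytes() method mentioned in the thread), and the "HDR:" prefix merely simulates extra bytes that share the backing array with the value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferSliceDemo {
  // Copies only the readable region [position, limit) without moving the buffer's position.
  static byte[] readableBytes(ByteBuffer buf) {
    byte[] out = new byte[buf.remaining()];
    buf.duplicate().get(out);
    return out;
  }

  public static void main(String[] args) {
    // Simulated frame: 4 unrelated bytes followed by the table name "t1".
    byte[] frame = "HDR:t1".getBytes(StandardCharsets.UTF_8);
    ByteBuffer table = ByteBuffer.wrap(frame, 4, 2);

    // array() exposes the whole backing array, not just the value.
    System.out.println(new String(table.array(), StandardCharsets.UTF_8));        // prints "HDR:t1"
    // Honouring position and remaining() yields only the value.
    System.out.println(new String(readableBytes(table), StandardCharsets.UTF_8)); // prints "t1"
  }
}
```

This also matches the question in the comment above: a copy bounded by remaining() stays within the value even though it reads from the same internal array. It still costs one copy per value.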
[jira] [Commented] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
[ https://issues.apache.org/jira/browse/HBASE-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691933#comment-13691933 ] Hudson commented on HBASE-8783: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #581 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/581/]) HBASE-8783 RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name (Revision 1495946) Result = FAILURE mbertozzi : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureMemberRpcs.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureCoordinatorRpcs.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/snapshot/RegionServerSnapshotManager.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedure.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/procedure/TestZKProcedureControllers.java RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name - Key: HBASE-8783 URL: https://issues.apache.org/jira/browse/HBASE-8783 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8783-0.94-v0.patch, HBASE-8783-0.94-v1.patch, HBASE-8783-v0.patch, HBASE-8783-v1.patch The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName. {code} 2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ... 2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. 
Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}
The Region Server Name is used as memberName, but since the snapshot manager is initialized before the RS receives the server name used by the master, the zkprocedure will use the wrong name (0.0.0.0). This will cause the snapshot to fail with a TimeoutException, since the master will not receive the expected RS name.
{code}
Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915
RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk
...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:6, max:6 ms
{code}
[jira] [Commented] (HBASE-8781) ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW
[ https://issues.apache.org/jira/browse/HBASE-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691936#comment-13691936 ] Hadoop QA commented on HBASE-8781: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589389/HBASE-8781.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6120//console This message is automatically generated. ImmutableBytesWritable constructor with another IBW as param need to consider the offset of the passed IBW -- Key: HBASE-8781 URL: https://issues.apache.org/jira/browse/HBASE-8781 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Anoop Sam John Assignee: Anoop Sam John Priority: Minor Fix For: 0.98.0 Attachments: HBASE-8781.patch {code} /** * Set the new ImmutableBytesWritable to the contents of the passed * codeibw/code. * @param ibw the value to set this ImmutableBytesWritable to. 
*/ public ImmutableBytesWritable(final ImmutableBytesWritable ibw) { this(ibw.get(), 0, ibw.getSize()); } {code} It should be this(ibw.get(), ibw.getOffset(), ibw.getSize()); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691937#comment-13691937 ] Gustavo Anatoly commented on HBASE-8792: Hi Stack, I will fix this error, today. Thanks. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
Stefan Seelmann created HBASE-8794: -- Summary: DependentColumnFilter.toString() throws NullPointerException Key: HBASE-8794 URL: https://issues.apache.org/jira/browse/HBASE-8794 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.95.1, 0.94.8 Reporter: Stefan Seelmann Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 DependentColumnFilter.toString() accesses comparator which can be null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
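The failure mode reported here is easy to reproduce outside HBase. SketchFilter below is a hypothetical, simplified class, not the real DependentColumnFilter; it shows a toString() that dereferences a nullable field next to a null-safe variant. How the attached patches actually guard the access is not shown in this thread, so the guard here is an assumption.

```java
// Hypothetical, simplified filter with an optional comparator.
class SketchFilter {
  private final String family;
  private final String qualifier;
  private final Object comparator; // may legitimately be null

  SketchFilter(String family, String qualifier, Object comparator) {
    this.family = family;
    this.qualifier = qualifier;
    this.comparator = comparator;
  }

  // Unsafe: throws NullPointerException when no comparator was supplied.
  String unsafeToString() {
    return String.format("SketchFilter (%s, %s, %s)", family, qualifier, comparator.toString());
  }

  // Null-safe: substitutes the text "null" when the comparator is absent.
  @Override
  public String toString() {
    return String.format("SketchFilter (%s, %s, %s)", family, qualifier,
        comparator == null ? "null" : comparator.toString());
  }
}

class SketchFilterDemo {
  public static void main(String[] args) {
    SketchFilter filter = new SketchFilter("cf", "q", null);
    System.out.println(filter); // prints "SketchFilter (cf, q, null)"
    try {
      filter.unsafeToString();
    } catch (NullPointerException e) {
      System.out.println("unsafeToString() threw NullPointerException");
    }
  }
}
```

Since toString() is typically called from logging code, an exception there can hide the real problem being logged.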
[jira] [Updated] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Seelmann updated HBASE-8794: --- Attachment: HBASE-8794-trunk.patch HBASE-8794-0.95.patch HBASE-8794-0.94.patch Patches for 0.94, 0.95, and trunk. DependentColumnFilter.toString() throws NullPointerException Key: HBASE-8794 URL: https://issues.apache.org/jira/browse/HBASE-8794 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.8, 0.95.1 Reporter: Stefan Seelmann Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8794-0.94.patch, HBASE-8794-0.95.patch, HBASE-8794-trunk.patch DependentColumnFilter.toString() accesses comparator which can be null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691952#comment-13691952 ] Jean-Marc Spaggiari commented on HBASE-8793: Strange, there is already an exit command in this script... status_of_proc -p ${HBASE_PID_DIR}/hbase-hbase-regionserver.pid ${JAVA_HOME}/bin/java hbase-regionserver exit 0 || exit $? So the exit at the end of the script should not be used. Can you add some traces to your init script to verify why those 2 exits are not reached and why it's using the last exit? Regionserver ubuntu's startup script return code always 0 - Key: HBASE-8793 URL: https://issues.apache.org/jira/browse/HBASE-8793 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.6 Environment: Description:Ubuntu 12.04.2 LTS Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0 Reporter: Michael Czerwiński Priority: Minor hbase-regionserver startup script always returns 0 (exit 0 at the end of the script) this is wrong behaviour which causes issues when trying to recognise true status of the service. Replacing it with 'exit $?' seems to fix the problem, looking at hbase master return codes are assigned to RETVAL variable which is used with exit. Not sure if the problem exist in other versions. /etc/init.d/hbase-regionserver.orig status hbase-regionserver is not running. echo $? After fix: /etc/init.d/hbase-regionserver status hbase-regionserver is not running. echo $? 1
[jira] [Updated] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Seelmann updated HBASE-8794: --- Status: Patch Available (was: Open) DependentColumnFilter.toString() throws NullPointerException Key: HBASE-8794 URL: https://issues.apache.org/jira/browse/HBASE-8794 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.95.1, 0.94.8 Reporter: Stefan Seelmann Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8794-0.94.patch, HBASE-8794-0.95.patch, HBASE-8794-trunk.patch DependentColumnFilter.toString() accesses comparator which can be null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691960#comment-13691960 ] Michael Czerwiński commented on HBASE-8793: --- Well, I think the whole script is different from the one you are using (see below). The problem is that when checking status, the status() function only returns, and it should probably call exit with a return code. Because the return value is not handled in any way, exit 0 (last line) takes place, reporting the wrong service status. The package comes from Cloudera's CDH4. --- CUT HERE --- #! /bin/bash # # Licensed to the Apache Software Foundation (ASF) under one or more # contributor license agreements. See the NOTICE file distributed with # this work for additional information regarding copyright ownership. # The ASF licenses this file to You under the Apache License, Version 2.0 # (the License); you may not use this file except in compliance with # the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an AS IS BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # This file is used to run multiple instances of certain HBase daemons using init scripts. # It replaces the local-regionserver.sh and local-master.sh scripts for Bigtop packages. # By default, this script runs a single daemon normally. If offsets are provided, additional # daemons are run, identified by the offset in log and pid files, and listening on the default # port + the offset.
Offsets can be provided as arguments when invoking init scripts directly: # # /etc/init.d/hbase-regionserver start 1 2 3 4 # # or you can list the offsets to run in /etc/init.d/regionserver_offsets: # #echo regionserver_OFFSETS='1 2 3 4' /etc/default/hbase #sudo service hbase-$HBASE_DAEMON@ start # # Offsets specified on the command-line always override the offsets file. If no offsets are # specified on the command-line when stopping or restarting daemons, all running instances of the # daemon are stopped (regardless of the contents of the offsets file). # chkconfig: 2345 87 13 # description: Summary: HBase is the Hadoop database. Use it when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. # processname: HBase # ### BEGIN INIT INFO # Provides: hbase-regionserver # Required-Start:$network $local_fs $remote_fs # Required-Stop: $remote_fs # Should-Start: $named # Should-Stop: # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Hadoop HBase regionserver daemon ### END INIT INFO . /etc/default/hadoop . /etc/default/hbase # Autodetect JAVA_HOME if not defined . 
/usr/lib/bigtop-utils/bigtop-detect-javahome # Our default HBASE_HOME, HBASE_PID_DIR and HBASE_CONF_DIR export HBASE_HOME=${HBASE_HOME:-/usr/lib/hbase} export HBASE_PID_DIR=${HBASE_PID_DIR:-/var/run/hbase} export HBASE_LOG_DIR=${HBASE_LOG_DIR:-/var/log/hbase} install -d -m 0755 -o hbase -g hbase ${HBASE_PID_DIR} PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin DAEMON_SCRIPT=$HBASE_HOME/bin/hbase-daemon.sh NAME=hbase-regionserver DESC=Hadoop HBase regionserver daemon PID_FILE=$HBASE_PID_DIR/hbase-hbase-regionserver.pid CONF_DIR=/etc/hbase/conf DODTIME=3 # Time to wait for the server to die, in seconds # If this value is set too low you might not # let some servers to die gracefully and # 'restart' will not work UPPERCASE_HBASE_DAEMON=$(echo regionserver | tr '[:lower:]' '[:upper:]') ALL_DAEMONS_RUNNING=0 NO_DAEMONS_RUNNING=1 SOME_OFFSET_DAEMONS_FAILING=2 INVALID_OFFSETS_PROVIDED=3 # These limits are not easily configurable - they are enforced by HBase if [ regionserver == master ] ; then FIRST_PORT=6 FIRST_INFO_PORT=60010 OFFSET_LIMIT=10 elif [ regionserver == regionserver ] ; then FIRST_PORT=60200 FIRST_INFO_PORT=60300 OFFSET_LIMIT=100 fi validate_offsets() { for OFFSET in $1; do if [[ ! $OFFSET =~ ^((0)|([1-9][0-9]{0,2}))$ ]]; then echo ERROR: All offsets must be positive integers (no leading zeros, max $OFFSET_LIMIT) exit $INVALID_OFFSETS_PROVIDED fi if [ ${OFFSET} -lt 0 ] ; then echo ERROR: Cannot start regionserver with negative offset 2 exit $INVALID_OFFSETS_PROVIDED fi if [ ${OFFSET} -ge ${OFFSET_LIMIT} ] ; then echo ERROR: Cannot start regionserver with offset higher than
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691963#comment-13691963 ] Michael Czerwiński commented on HBASE-8793: --- The paste did not work out well, try this: http://pastebin.com/cm9g3mtr Regionserver ubuntu's startup script return code always 0 - Key: HBASE-8793 URL: https://issues.apache.org/jira/browse/HBASE-8793 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.6 Environment: Description:Ubuntu 12.04.2 LTS Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0 Reporter: Michael Czerwiński Priority: Minor hbase-regionserver startup script always returns 0 (exit 0 at the end of the script) this is wrong behaviour which causes issues when trying to recognise true status of the service. Replacing it with 'exit $?' seems to fix the problem, looking at hbase master return codes are assigned to RETVAL variable which is used with exit. Not sure if the problem exist in other versions. /etc/init.d/hbase-regionserver.orig status hbase-regionserver is not running. echo $? After fix: /etc/init.d/hbase-regionserver status hbase-regionserver is not running. echo $? 1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691973#comment-13691973 ] Hadoop QA commented on HBASE-4360: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589392/HBASE-4360_4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6121//console This message is automatically generated. Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Attachments: HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. 
Makes it easier to hunt the logs, and I don't think it's too expensive to maintain (one additional update per dead determination).
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691980#comment-13691980 ] Jean-Marc Spaggiari commented on HBASE-8793: Hi Michael, The version of this script for the Cloudera distribution has been modified a lot. So you will need to report the issue on Cloudera's CDH distribution list and open a defect on that side. This defect (HBASE-8793) will have to be closed since it does not apply to the Apache version. Let me know if you are not on the CDH distribution list and I will point you to the right URLs/directions. Regionserver ubuntu's startup script return code always 0 - Key: HBASE-8793 URL: https://issues.apache.org/jira/browse/HBASE-8793 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.6 Environment: Description:Ubuntu 12.04.2 LTS Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0 Reporter: Michael Czerwiński Priority: Minor hbase-regionserver startup script always returns 0 (exit 0 at the end of the script) this is wrong behaviour which causes issues when trying to recognise true status of the service. Replacing it with 'exit $?' seems to fix the problem, looking at hbase master return codes are assigned to RETVAL variable which is used with exit. Not sure if the problem exist in other versions. /etc/init.d/hbase-regionserver.orig status hbase-regionserver is not running. echo $? After fix: /etc/init.d/hbase-regionserver status hbase-regionserver is not running. echo $? 1
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691985#comment-13691985 ] Michael Czerwiński commented on HBASE-8793: --- I understand, sounds good, if you can drop me a link that would be great, thanks again for your time. Regionserver ubuntu's startup script return code always 0 - Key: HBASE-8793 URL: https://issues.apache.org/jira/browse/HBASE-8793 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.6 Environment: Description:Ubuntu 12.04.2 LTS Hbase: 0.94.6+96-1.cdh4.3.0.p0.13~precise-cdh4.3.0 Reporter: Michael Czerwiński Priority: Minor hbase-regionserver startup script always returns 0 (exit 0 at the end of the script) this is wrong behaviour which causes issues when trying to recognise true status of the service. Replacing it with 'exit $?' seems to fix the problem, looking at hbase master return codes are assigned to RETVAL variable which is used with exit. Not sure if the problem exist in other versions. /etc/init.d/hbase-regionserver.orig status hbase-regionserver is not running. echo $? After fix: /etc/init.d/hbase-regionserver status hbase-regionserver is not running. echo $? 1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-8792: --- Attachment: (was: HBASE-8792.patch) Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692045#comment-13692045 ] Hadoop QA commented on HBASE-8794: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589400/HBASE-8794-trunk.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6122//console This message is automatically generated. DependentColumnFilter.toString() throws NullPointerException Key: HBASE-8794 URL: https://issues.apache.org/jira/browse/HBASE-8794 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.8, 0.95.1 Reporter: Stefan Seelmann Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8794-0.94.patch, HBASE-8794-0.95.patch, HBASE-8794-trunk.patch DependentColumnFilter.toString() accesses comparator which can be null. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-8792: --- Attachment: HBASE-8792.patch Error (patch unexpectedly ends in middle of line) fixed and tested on svn trunk. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692097#comment-13692097 ] Hadoop QA commented on HBASE-8792: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589428/HBASE-8792.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestTableLockManager org.apache.hadoop.hbase.security.access.TestAccessController Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6123//console This message is automatically generated. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-8785) revise zookeeper session timeout setting
[ https://issues.apache.org/jira/browse/HBASE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692103#comment-13692103 ] stack commented on HBASE-8785: -- [~sershe] This issue is not valid? revise zookeeper session timeout setting Key: HBASE-8785 URL: https://issues.apache.org/jira/browse/HBASE-8785 Project: HBase Issue Type: Improvement Affects Versions: 0.95.1 Reporter: Sergey Shelukhin Fix For: 0.95.2 Current ZK session timeout is set to 90sec., and the comment in the doc says: This setting becomes zookeeper's 'maxSessionTimeout'. However, this comment is misleading - it doesn't always become maxSessionTimeout, min(our timeout, maxSessionTimeout) is chosen. Moreover, the default maxSessionTimeout in ZK that I'm looking at is 40s, so this setting doesn't do anything. Additionally, 40s. seems like a lot of time. 1) Should the comment be changed to tell the user to change ZK config if they want higher timeout? 2) Should the setting be revised down? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8768) Improve bulk load performance by moving key value construction from map phase to reduce phase.
[ https://issues.apache.org/jira/browse/HBASE-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8768:
Fix Version/s: (was: 0.98.0) (was: 0.95.2)

Moving out this improvement. If done in time, let's pull it back in (sounds like a nice improvement).

Improve bulk load performance by moving key value construction from map phase to reduce phase.

Key: HBASE-8768
URL: https://issues.apache.org/jira/browse/HBASE-8768
Project: HBase
Issue Type: Improvement
Components: mapreduce, Performance
Reporter: rajeshbabu
Assignee: rajeshbabu

The ImportTSV bulk loading approach uses the MapReduce framework. The existing mapper and reducer classes used by ImportTSV are TsvImporterMapper.java and PutSortReducer.java. The ImportTSV tool parses the tab-separated (by default) values from the input files, and the Mapper class generates a Put object for each row from the KeyValue pairs created from the parsed text. PutSortReducer then uses the partitions based on the regions and sorts the Put objects for each region.

Overheads we can see in the above approach:
==
1) A KeyValue is constructed for each parsed value in the line, adding extra data such as row key, column family and qualifier. This increases the data to be shuffled in the reduce phase by around 5x. The size of the data to be shuffled can be calculated as:
{code}
Data to be shuffled = nl*nt*(rl+cfl+cql+vall+tsl+30)
{code}
If we move KeyValue construction to the reduce phase, the size of the data to be shuffled becomes the following, which is far less than the above:
{code}
Data to be shuffled = nl*nt*vall
{code}
nl - number of lines in the raw file
nt - number of tabs or columns, including the row key
rl - row length, which will be different for each line
cfl - column family length, which will be different for each family
cql - qualifier length
tsl - timestamp length
vall - length of each parsed value
30 - bytes for kv size, number of families etc.

2) On the mapper side we create Put objects by adding all the KeyValues constructed for each line, and in the reducer we again collect the KeyValues from each Put and sort them. Instead we can create and sort the KeyValues directly in the reducer.

Solution:
We can improve bulk load performance by moving the KeyValue construction from the mapper to the reducer, so that the Mapper just sends the raw text for each row to the Reducer. The Reducer then parses the records for each row, and creates and sorts the KeyValue pairs before writing to HFiles.

Conclusion:
===
The above suggestions will improve map phase performance by avoiding KeyValue construction, and reduce phase performance by avoiding excess data to be shuffled.
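The two shuffle-size formulas in HBASE-8768 can be checked with a small worked example. This is only an illustrative sketch: the function names and the workload figures (one million lines, ten columns, the individual field lengths) are assumptions chosen for the arithmetic and do not come from the issue.

```python
def shuffle_bytes_map_side(nl, nt, rl, cfl, cql, vall, tsl, overhead=30):
    """Bytes shuffled when the mapper emits fully constructed KeyValues."""
    return nl * nt * (rl + cfl + cql + vall + tsl + overhead)


def shuffle_bytes_reduce_side(nl, nt, vall):
    """Bytes shuffled when the mapper emits only the raw parsed values."""
    return nl * nt * vall


# Hypothetical workload (assumed numbers, not taken from the issue).
nl, nt = 1_000_000, 10                     # lines, columns per line
rl, cfl, cql, vall, tsl = 16, 2, 8, 20, 8  # field lengths in bytes

before = shuffle_bytes_map_side(nl, nt, rl, cfl, cql, vall, tsl)
after = shuffle_bytes_reduce_side(nl, nt, vall)
ratio = before / after

print(f"map-side KeyValues:    {before} bytes")
print(f"reduce-side KeyValues: {after} bytes")
print(f"reduction factor:      {ratio:.1f}x")
```

With these assumed lengths the shuffle shrinks by a factor of about four, which is in the same range as the "around 5x" quoted in the description. The real factor depends on how long the values are relative to the row key, family and qualifier.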
[jira] [Created] (HBASE-8795) bin/hbase zkcli cannot take arguments anymore
Nicolas Liochon created HBASE-8795:

Summary: bin/hbase zkcli cannot take arguments anymore
Key: HBASE-8795
URL: https://issues.apache.org/jira/browse/HBASE-8795
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.98.0
Reporter: Nicolas Liochon
Priority: Critical

It used to be possible to run commands like:

bin/hbase zkcli stat

We use this kind of invocation in the standard hbase scripts. It was broken by HBASE-8766. Reverting is an easy way to fix it, but it's unlikely to be the right thing to do. Pinging [~enis].
[jira] [Updated] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-8662:

Release Note: With the patch, if hbase.rest.authentication is set to kerberos and security is turned on, the REST server will impersonate the authenticated user when accessing HBase. The RPC layer proxy user settings should be configured properly to allow impersonation.
(was: With the patch, if hbase.rest.authentication is set to kerberos (the only authentication method supported currently), the REST server will impersonate the authenticated user when accessing HBase. The RPC layer proxy user settings should be configured properly.)
Status: Patch Available (was: Open)

[rest] support impersonation

Key: HBASE-8662
URL: https://issues.apache.org/jira/browse/HBASE-8662
Project: HBase
Issue Type: Sub-task
Components: REST, security
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.98.0
Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch, trunk-8662_v4.patch

Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated.
[jira] [Commented] (HBASE-8793) Regionserver ubuntu's startup script return code always 0
[ https://issues.apache.org/jira/browse/HBASE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692139#comment-13692139 ]

Jean-Marc Spaggiari commented on HBASE-8793:

Hi Michael,

Here is the CDH list: https://groups.google.com/a/cloudera.org/forum/#!forum/cdh-user

Always willing to help. Talk to you on the CDH list ;)
[jira] [Commented] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692149#comment-13692149 ]

Lars Hofhansl commented on HBASE-8794:

What good is a DependentColumnFilter when no comparator is passed? Is there a use case for that? Otherwise (in 0.95 and trunk) we should check for that in the constructor and throw an exception when it is unset.

DependentColumnFilter.toString() throws NullPointerException

Key: HBASE-8794
URL: https://issues.apache.org/jira/browse/HBASE-8794
Project: HBase
Issue Type: Bug
Components: Filters
Affects Versions: 0.94.8, 0.95.1
Reporter: Stefan Seelmann
Priority: Minor
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8794-0.94.patch, HBASE-8794-0.95.patch, HBASE-8794-trunk.patch

DependentColumnFilter.toString() accesses comparator which can be null.
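The two options raised in this comment (guard the null in toString(), or reject a missing comparator in the constructor) can be contrasted in a small sketch. It is written in Python as a language-neutral illustration, not HBase's Java code: the class names, fields and output format are all hypothetical and do not mirror the real DependentColumnFilter.

```python
class LenientDependentFilter:
    """Option 1: allow a missing comparator and guard it when rendering."""

    def __init__(self, family, qualifier, comparator=None):
        self.family = family
        self.qualifier = qualifier
        self.comparator = comparator

    def __str__(self):
        # Guard the access that would otherwise fail on a missing comparator.
        comp = "null" if self.comparator is None else str(self.comparator)
        name = type(self).__name__
        return f"{name} ({self.family}, {self.qualifier}, {comp})"


class StrictDependentFilter(LenientDependentFilter):
    """Option 2: refuse to construct a filter without a comparator."""

    def __init__(self, family, qualifier, comparator):
        if comparator is None:
            raise ValueError("comparator must be set")
        super().__init__(family, qualifier, comparator)


print(LenientDependentFilter("cf", "q"))
try:
    StrictDependentFilter("cf", "q", None)
except ValueError as err:
    print(f"rejected: {err}")
```

The strict variant moves the failure to construction time, where the caller's mistake is easier to trace. The lenient variant keeps existing callers working, which may matter for a maintenance branch.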
[jira] [Updated] (HBASE-8794) DependentColumnFilter.toString() throws NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8794: - Fix Version/s: (was: 0.94.9) 0.94.10 DependentColumnFilter.toString() throws NullPointerException Key: HBASE-8794 URL: https://issues.apache.org/jira/browse/HBASE-8794 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.8, 0.95.1 Reporter: Stefan Seelmann Priority: Minor Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8794-0.94.patch, HBASE-8794-0.95.patch, HBASE-8794-trunk.patch DependentColumnFilter.toString() accesses comparator which can be null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Latham updated HBASE-8778:
Attachment: HBASE-8778-0.94.5-v2.patch

Here's an updated patch with a bit of cleanup. We've begun rolling this out to one of our production clusters, and it is showing about a 5x speedup in assignments during the rolling restart.

Region assigments scan table directory making them slow for huge tables

Key: HBASE-8778
URL: https://issues.apache.org/jira/browse/HBASE-8778
Project: HBase
Issue Type: Improvement
Reporter: Dave Latham
Attachments: HBASE-8778-0.94.5.patch, HBASE-8778-0.94.5-v2.patch

On a table with 130k regions it takes about 3 seconds for a region server to open a region once it has been assigned. Watching the threads of a region server running 0.94.5 that is opening many such regions shows the thread opening the region in code like this:
{noformat}
PRI IPC Server handler 4 on 60020 daemon prio=10 tid=0x2aaac07e9000 nid=0x6566 runnable [0x4c46d000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.String.indexOf(String.java:1521)
    at java.net.URI$Parser.scan(URI.java:2912)
    at java.net.URI$Parser.parse(URI.java:3004)
    at java.net.URI.<init>(URI.java:736)
    at org.apache.hadoop.fs.Path.initialize(Path.java:145)
    at org.apache.hadoop.fs.Path.<init>(Path.java:126)
    at org.apache.hadoop.fs.Path.<init>(Path.java:50)
    at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
    at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
    at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
    at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
    at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
{noformat}
To open the region, the region server first loads the latest HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file system at /hbase/tableDir/.tableinfo.sequenceNum. The file with the largest sequenceNum is the current descriptor. This is done so that the current descriptor is updated atomically. However, since the filename is not known in advance, FSTableDescriptors has to do a FileSystem.listStatus operation, which has to list all files in the directory to find it. The directory also contains all the region directories, so in our case it has to load 130k FileStatus objects. Even using a globStatus matching function still transfers all the objects to the client before performing the pattern matching. Furthermore, HDFS by default transfers 1000 directory entries in each RPC call, so it requires 130 round trips to the namenode to fetch all the directory entries.

Consequently, reassigning all the regions of a table (or a constant fraction thereof) requires time proportional to the square of the number of regions. In our case, if a region server fails with 200 such regions, it takes 10+ minutes for them all to be reassigned, after the zk expiration and log splitting.
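The figures in the HBASE-8778 description can be reproduced with a few lines of arithmetic. This is a rough sketch under stated assumptions: the helper names are made up for illustration, the batch size of 1000 and the 3 seconds per open are the numbers quoted in the report, and the total time assumes the opens run one after another.

```python
def listing_rpcs(entries, batch_size=1000):
    """Namenode round trips needed to list a directory of `entries` items."""
    return (entries + batch_size - 1) // batch_size  # ceiling division


def reassignment_cost(regions_in_table, regions_to_open, batch_size=1000):
    """Total RPCs and FileStatus objects for opening `regions_to_open` regions,
    assuming each open lists the whole table directory once."""
    rpcs = regions_to_open * listing_rpcs(regions_in_table, batch_size)
    statuses = regions_to_open * regions_in_table
    return rpcs, statuses


per_open = listing_rpcs(130_000)
rpcs, statuses = reassignment_cost(130_000, 200)
serial_seconds = 200 * 3  # assumption: opens happen serially at ~3 s each

print(f"round trips per region open: {per_open}")
print(f"round trips for 200 opens:   {rpcs}")
print(f"FileStatus objects loaded:   {statuses}")
print(f"serial open time:            {serial_seconds / 60:.0f} minutes")
```

Because the cost per open grows with the table's region count and the number of opens also grows with it, the total is quadratic. That matches the report's observation of 10+ minutes for 200 regions.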
[jira] [Updated] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8778: --- Attachment: (was: HBASE-8778-0.94.5-v2.patch)
[jira] [Updated] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8778: --- Attachment: HBASE-8778-0.94.5-v2.patch
[jira] [Commented] (HBASE-8785) revise zookeeper session timeout setting
[ https://issues.apache.org/jira/browse/HBASE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692165#comment-13692165 ] Sergey Shelukhin commented on HBASE-8785: - That is if maxSessionTimeout defined on ZK, not in our config. I was running integration tests and saw that negotiated timeout is 40s., coming from ZK (tick * 20?) So at least the comment needs to be changed imo. revise zookeeper session timeout setting Key: HBASE-8785 URL: https://issues.apache.org/jira/browse/HBASE-8785 Project: HBase Issue Type: Improvement Affects Versions: 0.95.1 Reporter: Sergey Shelukhin Fix For: 0.95.2 Current ZK session timeout is set to 90sec., and the comment in the doc says: This setting becomes zookeeper's 'maxSessionTimeout'. However, this comment is misleading - it doesn't always become maxSessionTimeout, min(our timeout, maxSessionTimeout) is chosen. Moreover, the default maxSessionTimeout in ZK that I'm looking at is 40s, so this setting doesn't do anything. Additionally, 40s. seems like a lot of time. 1) Should the comment be changed to tell the user to change ZK config if they want higher timeout? 2) Should the setting be revised down? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
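The negotiation described in HBASE-8785 can be sketched as a clamp of the requested timeout between the server's minimum and maximum. The sketch assumes the ZooKeeper server defaults of 2 × tickTime for the minimum and 20 × tickTime for the maximum, plus a tickTime of 2000 ms, which is what would produce the 40 s observed here. The function name is made up for illustration and is not a ZooKeeper or HBase API.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Session timeout the server grants for a client's requested timeout.

    Unset bounds fall back to the assumed server defaults:
    min = 2 * tickTime, max = 20 * tickTime.
    """
    if min_ms is None:
        min_ms = 2 * tick_time_ms
    if max_ms is None:
        max_ms = 20 * tick_time_ms
    return max(min_ms, min(requested_ms, max_ms))


# HBase asks for 90 s, but the server caps it at 20 * 2000 ms = 40 s.
print(negotiated_session_timeout(90_000, 2_000))

# Raising maxSessionTimeout on the ZooKeeper side lets the 90 s through.
print(negotiated_session_timeout(90_000, 2_000, max_ms=120_000))
```

This is why the HBase-side setting alone has no effect above the server's cap: a higher timeout also requires changing the ZooKeeper configuration.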
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692169#comment-13692169 ]

Sergey Shelukhin commented on HBASE-8776:
-----------------------------------------

If the machine disappears for 5 minutes, we hope the recovery can proceed in ~2 minutes, so we only need 2-3 minutes of timeout, hopefully. Agree that the 128s fallback is a little bit too long; maybe the last one should be 64 sec?

port HBASE-8723 to 0.94
-----------------------

Key: HBASE-8776
URL: https://issues.apache.org/jira/browse/HBASE-8776
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.94.10
Attachments: HBASE-8776-v0.patch
[jira] [Commented] (HBASE-8743) upgrade hadoop-23 version to 0.23.7
[ https://issues.apache.org/jira/browse/HBASE-8743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692170#comment-13692170 ]

Francis Liu commented on HBASE-8743:
------------------------------------

Yep, we do. Thanks, guys.

upgrade hadoop-23 version to 0.23.7
-----------------------------------

Key: HBASE-8743
URL: https://issues.apache.org/jira/browse/HBASE-8743
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.8
Reporter: Francis Liu
Assignee: Francis Liu
Fix For: 0.94.9

There's a missing class that's causing compile-time issues when building with security. This got fixed in 0.23.7. No patch needed; we just need to bump up the version in the pom.
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692171#comment-13692171 ]

Sergey Shelukhin commented on HBASE-8060:
-----------------------------------------

ping?

Num compacting KVs diverges from num compacted KVs over time
------------------------------------------------------------

Key: HBASE-8060
URL: https://issues.apache.org/jira/browse/HBASE-8060
Project: HBase
Issue Type: Bug
Components: Compaction, UI
Affects Versions: 0.94.6, 0.95.0, 0.95.2
Reporter: Andrew Purtell
Assignee: Sergey Shelukhin
Attachments: HBASE-8060-v0.patch, screenshot.png

I have been running what amounts to an ingestion test for a day or so. This is an all-in-one cluster launched with './bin/hbase master start' from sources. In the RS stats on the master UI, the num compacting KVs has diverged from num compacted KVs even though compaction has been completed from the perspective of selection; no compaction tasks are running on the RS. I think this could be confusing -- is compaction happening or not? Or maybe I'm misunderstanding what this is supposed to show?
[jira] [Commented] (HBASE-8785) revise zookeeper session timeout setting
[ https://issues.apache.org/jira/browse/HBASE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692172#comment-13692172 ]

stack commented on HBASE-8785:
------------------------------

bq. That is if maxSessionTimeout defined on ZK, not in our config.

Our config is the zk config when we run the ensemble? We bundle 3.4.5 in 0.95/trunk, so I'd think the above pasted code would have an effect? Let's figure this one out (I'm pretty sure I've seen sessions of 20 * tickTime (2s)).
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692174#comment-13692174 ]

Sergey Shelukhin commented on HBASE-8015:
-----------------------------------------

To solve the ambiguity, create should have an FQ overload, imo. Then you always operate on existing tables if they have a dot, so you can just prevent conflicts at create time.

Support for Namespaces
----------------------

Key: HBASE-8015
URL: https://issues.apache.org/jira/browse/HBASE-8015
Project: HBase
Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692177#comment-13692177 ]

stack commented on HBASE-8776:
------------------------------

bq. ...HBase contract is 'any operation will eventually succeed'

Stating the above helps here. That said, 5 minutes is good, ten minutes at a stretch. 40 minutes is abusive; ops won't be able to tell the difference between this and a hung process, I'd say. I'd be good w/ coming down from 128s to 64s for the last one.
[jira] [Commented] (HBASE-8060) Num compacting KVs diverges from num compacted KVs over time
[ https://issues.apache.org/jira/browse/HBASE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692179#comment-13692179 ]

stack commented on HBASE-8060:
------------------------------

+1
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692183#comment-13692183 ]

Sergey Shelukhin commented on HBASE-6295:
-----------------------------------------

The patch looks reasonable, thanks.

Possible performance improvement in client batch operations: presplit and send in background
--------------------------------------------------------------------------------------------

Key: HBASE-6295
URL: https://issues.apache.org/jira/browse/HBASE-6295
Project: HBase
Issue Type: Improvement
Components: Client, Performance
Affects Versions: 0.95.2
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
Labels: noob
Fix For: 0.98.0
Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v15.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch

Today the batch algo is:
{noformat}
for Operation o: List<Op>{
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location

It would be:
{noformat}
for Operation o: List<Op>{
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable.
It's interesting mainly for 'big' writes.
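The second pseudocode loop above could look roughly like the following. This is a minimal sketch under stated assumptions, not the attached patch: PerLocationBatcher, its locator function and its sender callback are hypothetical names, the send is synchronous here whereas the proposal sends in the background, the threshold is treated as "reached" (>=), and error management and retries are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Buffers operations per location and sends a location's buffer once it is full. */
final class PerLocationBatcher<Op> {
    private final int maxLocationSize;
    private final Function<Op, String> locator;          // hypothetical: op -> region server
    private final BiConsumer<String, List<Op>> sender;   // hypothetical: ships one buffer
    private final Map<String, List<Op>> pending = new HashMap<>();

    PerLocationBatcher(int maxLocationSize, Function<Op, String> locator,
                       BiConsumer<String, List<Op>> sender) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
        this.sender = sender;
    }

    /** "get location; add o to location.todolist; send when that list is full". */
    void add(Op op) {
        String location = locator.apply(op);
        List<Op> todo = pending.computeIfAbsent(location, k -> new ArrayList<>());
        todo.add(op);
        if (todo.size() >= maxLocationSize) {
            pending.remove(location);      // "clear location.todolist"
            sender.accept(location, todo); // a real client would not block here
        }
    }

    /** "send remaining": ships whatever is still buffered for each location. */
    void flush() {
        pending.forEach(sender);
        pending.clear();
    }

    public static void main(String[] args) {
        PerLocationBatcher<String> batcher = new PerLocationBatcher<String>(
            2,
            op -> op.substring(0, op.indexOf(':')), // hypothetical "server:row" encoding
            (location, ops) -> System.out.println(location + " <- " + ops));
        for (String op : new String[] {"rs1:a", "rs2:b", "rs1:c", "rs2:d", "rs1:e"}) {
            batcher.add(op);
        }
        batcher.flush();
    }
}
```

The point of the design is that a slow or lightly loaded location no longer delays a location whose buffer is already full; the final "wait" step of the pseudocode would be layered on top once sends are asynchronous.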
[jira] [Commented] (HBASE-8329) Limit compaction speed
[ https://issues.apache.org/jira/browse/HBASE-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692185#comment-13692185 ]

Sergey Shelukhin commented on HBASE-8329:
-----------------------------------------

[~aoxiang] do you think I should commit it? Or do you want to test on a cluster?

Limit compaction speed
----------------------

Key: HBASE-8329
URL: https://issues.apache.org/jira/browse/HBASE-8329
Project: HBase
Issue Type: Improvement
Components: Compaction
Reporter: binlijin
Attachments: HBASE-8329-2-trunk.patch, HBASE-8329-3-trunk.patch, HBASE-8329-4-trunk.patch, HBASE-8329-trunk.patch

There is no speed or resource limit for compaction. I think we should add this feature, especially for when requests burst.
[jira] [Created] (HBASE-8796) Add mention of new builds mailing list to site
stack created HBASE-8796:
-------------------------

Summary: Add mention of new builds mailing list to site
Key: HBASE-8796
URL: https://issues.apache.org/jira/browse/HBASE-8796
Project: HBase
Issue Type: Task
Components: site
Reporter: stack
Assignee: stack

Add mention of the new builds list, a list into which we will dump all build output (rather than emit it into dev).
[jira] [Resolved] (HBASE-8796) Add mention of new builds mailing list to site
[ https://issues.apache.org/jira/browse/HBASE-8796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-8796.
--------------------------

Resolution: Fixed
Fix Version/s: 0.95.2, 0.98.0

Committed to 0.95 and trunk.

Add mention of new builds mailing list to site
----------------------------------------------

Key: HBASE-8796
URL: https://issues.apache.org/jira/browse/HBASE-8796
Project: HBase
Issue Type: Task
Components: site
Reporter: stack
Assignee: stack
Fix For: 0.98.0, 0.95.2

Add mention of new builds list, a list into which will dump all build output (rather than emit into dev).
[jira] [Commented] (HBASE-8662) [rest] support impersonation
[ https://issues.apache.org/jira/browse/HBASE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692191#comment-13692191 ]

Hadoop QA commented on HBASE-8662:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12589110/trunk-8662_v4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
{color:red}-1 lineLengths{color}. The patch introduces lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6124//console

This message is automatically generated.

[rest] support impersonation
----------------------------

Key: HBASE-8662
URL: https://issues.apache.org/jira/browse/HBASE-8662
Project: HBase
Issue Type: Sub-task
Components: REST, security
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.98.0
Attachments: method_doas.patch, secure_rest.patch, trunk-8662.patch, trunk-8662_v2.patch, trunk-8662_v3.patch, trunk-8662_v4.patch

Currently, our client API uses a fixed user: the current user. It should accept a user passed in, if authenticated.
[jira] [Commented] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692206#comment-13692206 ]

Sergey Shelukhin commented on HBASE-8778:
-----------------------------------------

Actually, on an entirely unrelated note, it would be interesting to learn how you run with so many regions (200+ memstores per server?). Are you coming to Hadoop Summit? Was there an HBaseCon talk I missed ;)

Region assigments scan table directory making them slow for huge tables
-----------------------------------------------------------------------

Key: HBASE-8778
URL: https://issues.apache.org/jira/browse/HBASE-8778
Project: HBase
Issue Type: Improvement
Reporter: Dave Latham
Attachments: HBASE-8778-0.94.5.patch, HBASE-8778-0.94.5-v2.patch

On a table with 130k regions it takes about 3 seconds for a region server to open a region once it has been assigned. Watching the threads for a region server running 0.94.5 that is opening many such regions shows the thread opening the region in code like this:
{noformat}
PRI IPC Server handler 4 on 60020 daemon prio=10 tid=0x2aaac07e9000 nid=0x6566 runnable [0x4c46d000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.String.indexOf(String.java:1521)
        at java.net.URI$Parser.scan(URI.java:2912)
        at java.net.URI$Parser.parse(URI.java:3004)
        at java.net.URI.<init>(URI.java:736)
        at org.apache.hadoop.fs.Path.initialize(Path.java:145)
        at org.apache.hadoop.fs.Path.<init>(Path.java:126)
        at org.apache.hadoop.fs.Path.<init>(Path.java:50)
        at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
        at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
        at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
        at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
        at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
        at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
{noformat}
To open the region, the region server first loads the latest HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file system at /hbase/<tableDir>/.tableinfo.<sequenceNum>. The file with the largest sequenceNum is the current descriptor. This is done so that the current descriptor is updated atomically. However, since the filename is not known in advance, FSTableDescriptors has to do a FileSystem.listStatus operation, which has to list all files in the directory to find it. The directory also contains all the region directories, so in our case it has to load 130k FileStatus objects. Even using a globStatus matching function still transfers all the objects to the client before performing the pattern matching. Furthermore, HDFS uses a default of transferring 1000 directory entries in each RPC call, so it requires 130 roundtrips to the namenode to fetch all the directory entries.

Consequently, reassigning all the regions of a table (or a constant fraction thereof) requires time proportional to the square of the number of regions. In our case, if a region server fails with 200 such regions, it takes 10+ minutes for them all to be reassigned, after the zk expiration and log splitting.
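To make the cost described above concrete, here is a minimal, self-contained sketch of the lookup and of the round-trip arithmetic. It works on plain file names rather than on Hadoop's FileSystem API; the class and method names are hypothetical, and the `.tableinfo.` prefix followed by a numeric sequence is taken from the description above.

```java
import java.util.List;
import java.util.Optional;

/** Sketch: pick the current descriptor out of a full table-directory listing. */
final class TableInfoLookupSketch {
    static final String PREFIX = ".tableinfo.";

    /** Scans every directory entry and keeps the name with the largest sequence number. */
    static Optional<String> currentDescriptor(List<String> directoryEntries) {
        String best = null;
        long bestSeq = -1;
        for (String name : directoryEntries) {
            if (!name.startsWith(PREFIX)) {
                continue; // region directories and other files are skipped, but still listed
            }
            long seq;
            try {
                seq = Long.parseLong(name.substring(PREFIX.length()));
            } catch (NumberFormatException e) {
                continue; // not a numeric sequence suffix
            }
            if (seq > bestSeq) {
                bestSeq = seq;
                best = name;
            }
        }
        return Optional.ofNullable(best);
    }

    /** Number of listing RPCs needed when each call returns at most entriesPerRpc entries. */
    static long listingRoundTrips(long entries, long entriesPerRpc) {
        return (entries + entriesPerRpc - 1) / entriesPerRpc;
    }

    public static void main(String[] args) {
        System.out.println(listingRoundTrips(130_000, 1_000)); // prints 130
    }
}
```

Because every region open repeats this scan over all entries, opening a number of regions proportional to N costs on the order of N * N entry visits. With the figures from the report, 200 regions at about 3 seconds each is roughly 600 seconds, consistent with the "10+ minutes" observed.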
[jira] [Commented] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692205#comment-13692205 ]

Sergey Shelukhin commented on HBASE-8778:
-----------------------------------------

Can you please post on RB (https://reviews.apache.org/r/)? The patch is relatively large.
[jira] [Commented] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
[ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692210#comment-13692210 ]

Ian Friedman commented on HBASE-8778:
-------------------------------------

Actually Dave was on the panel at the HBase Operations session at HBaseCon, so if you went to that you might have heard about it. Also FYI, looks like we have something like 258 regions per server nowadays. :)
[jira] [Created] (HBASE-8797) Prevent merging regions from moving during online merge
Jimmy Xiang created HBASE-8797:
-------------------------------

Summary: Prevent merging regions from moving during online merge
Key: HBASE-8797
URL: https://issues.apache.org/jira/browse/HBASE-8797
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

When two regions are merged online, they are closed, but the master doesn't know they should stay closed during the merge. If the master moves them by mistake (for example, when the load balancer kicks in), the merge could be messed up.
[jira] [Commented] (HBASE-8782) Thrift2 can not parse values when using framed transport
[ https://issues.apache.org/jira/browse/HBASE-8782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692214#comment-13692214 ]

Hamed Madani commented on HBASE-8782:
-------------------------------------

Well, from what I understand, ByteBuffer.array() returns the internal array of the buffer, but getBytes() returns a new subset of the internal array. getBytes() extracts this small array by looking at the position and limit variables of the ByteBuffer.

With the binary protocol, ByteBuffer.readBinary() returns a new ByteBuffer with a small internal array (position=0, size=size of our useful data). With framed transport, however, ByteBuffer.readBinary() returns the original *trans_* array, but with new *position* and *limit* variables. So the internal arrays with framed transport are very large, containing all the data in one connection.

As for a solution, my first idea for avoiding the array copy was to modify HTableInterface to accept ByteBuffer as input and separately take care of the other cases in checkAndPut() and checkAndDelete(). However, I can see that means adding to HTableInterface!

I found *org.apache.thrift.TBaseHelper.byteBufferToByteArray* to be a more efficient function for this use case than Bytes.getBytes().
{code}
public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) {
  if (wrapsFullArray(byteBuffer)) {
    return byteBuffer.array();
  }
  byte[] target = new byte[byteBuffer.remaining()];
  byteBufferToByteArray(byteBuffer, target, 0);
  return target;
}

public static boolean wrapsFullArray(ByteBuffer byteBuffer) {
  return byteBuffer.hasArray()
    && byteBuffer.position() == 0
    && byteBuffer.arrayOffset() == 0
    && byteBuffer.remaining() == byteBuffer.capacity();
}

public static int byteBufferToByteArray(ByteBuffer byteBuffer, byte[] target, int offset) {
  int remaining = byteBuffer.remaining();
  System.arraycopy(byteBuffer.array(),
      byteBuffer.arrayOffset() + byteBuffer.position(),
      target,
      offset,
      remaining);
  return remaining;
}
{code}
The above function is more efficient because for the binary protocol it simply returns the inner array with .array(), and for the framed protocol it uses System.arraycopy rather than a for loop to copy the elements. The above function also avoids byteBuffer.duplicate().

If you also think this is a better alternative than getBytes(), I can make a new patch using byteBufferToByteArray() instead of getBytes().

Thrift2 can not parse values when using framed transport
--------------------------------------------------------

Key: HBASE-8782
URL: https://issues.apache.org/jira/browse/HBASE-8782
Project: HBase
Issue Type: Bug
Components: Thrift
Affects Versions: 0.95.1
Reporter: Hamed Madani
Attachments: HBASE_8782.patch

ThriftHBaseServiceHandler.java uses .array() on table names and values (family, qualifier in checkAndDelete, etc.), which resulted in incorrect values with framed transport. Replacing .array() with getBytes() fixed this problem. I've attached the patch.
EDIT: updated the patch to cover checkAndPut(), checkAndDelete()
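The difference between .array() and copying only the readable window can be reproduced with plain java.nio, without Thrift. The sketch below is an illustration under assumptions: the class name, the helper name and the sample frame contents are made up, and the helper assumes an array-backed buffer, as the quoted TBaseHelper code does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch: a ByteBuffer that is a window over a larger shared frame. */
final class ByteBufferSliceSketch {

    /** Copies only [position, limit) of an array-backed buffer; the buffer is not modified. */
    static byte[] remainingBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        System.arraycopy(buf.array(), buf.arrayOffset() + buf.position(), out, 0, out.length);
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a framed transport buffer holding several values back to back.
        byte[] frame = "rowkey|family|qualifier".getBytes(StandardCharsets.US_ASCII);
        // The value of interest is the 6-byte window starting at offset 7.
        ByteBuffer family = ByteBuffer.wrap(frame, 7, 6);

        // array() ignores position and limit and exposes the whole frame.
        System.out.println(new String(family.array(), StandardCharsets.US_ASCII));
        // Honouring position and limit yields just the value.
        System.out.println(new String(remainingBytes(family), StandardCharsets.US_ASCII));
    }
}
```

The first line printed is the whole frame and the second is only `family`, which mirrors the incorrect table names and values reported with framed transport.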
[jira] [Commented] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692216#comment-13692216 ] Hadoop QA commented on HBASE-8792: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589428/HBASE-8792.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6125//console This message is automatically generated. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8776) port HBASE-8723 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692220#comment-13692220 ] Lars Hofhansl commented on HBASE-8776: -- +1 to what Stack said. I'd be happy with 5 mins. port HBASE-8723 to 0.94 --- Key: HBASE-8776 URL: https://issues.apache.org/jira/browse/HBASE-8776 Project: HBase Issue Type: Bug Affects Versions: 0.94.8 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.94.10 Attachments: HBASE-8776-v0.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692227#comment-13692227 ] Lars Hofhansl commented on HBASE-5083: -- Anybody? :) Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8656) Rpc call may not be notified in SecureClient
[ https://issues.apache.org/jira/browse/HBASE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8656: - Fix Version/s: (was: 0.94.9) 0.94.10 Let's push to 0.94.10. As the 0.94 releases are starting to get smaller I'll try to have a release every 4 weeks. Rpc call may not be notified in SecureClient Key: HBASE-8656 URL: https://issues.apache.org/jira/browse/HBASE-8656 Project: HBase Issue Type: Bug Components: Client, IPC/RPC, security Affects Versions: 0.94.7 Reporter: cuijianwei Assignee: cuijianwei Fix For: 0.94.10 Attachments: HBASE-8656-0.94-v1.txt In SecureClient.java, rpc responses will be processed by receiveResponse() which looks like:
{code}
  try {
    int id = in.readInt();                    // try to read an id
    if (LOG.isDebugEnabled())
      LOG.debug(getName() + " got value #" + id);
    Call call = calls.remove(id);
    int state = in.readInt();                 // read call status
    if (LOG.isDebugEnabled()) {
      LOG.debug("call #"+id+" state is " + state);
    }
    if (state == Status.SUCCESS.state) {
      Writable value = ReflectionUtils.newInstance(valueClass, conf);
      value.readFields(in);                   // read value
      if (LOG.isDebugEnabled()) {
        LOG.debug("call #"+id+", response is:\n"+value.toString());
      }
      // it's possible that this call may have been cleaned up due to a RPC
      // timeout, so check if it still exists before setting the value.
      if (call != null) {
        call.setValue(value);
      }
    } else if (state == Status.ERROR.state) {
      if (call != null) {
        call.setException(new RemoteException(WritableUtils.readString(in), WritableUtils
            .readString(in)));
      }
    } else if (state == Status.FATAL.state) {
      // Close the connection
      markClosed(new RemoteException(WritableUtils.readString(in),
          WritableUtils.readString(in)));
    }
  } catch (IOException e) {
    if (e instanceof SocketTimeoutException && remoteId.rpcTimeout > 0) {
      // Clean up open calls but don't treat this as a fatal condition,
      // since we expect certain responses to not make it by the specified
      // {@link ConnectionId#rpcTimeout}.
      closeException = e;
    } else {
      // Since the server did not respond within the default ping interval
      // time, treat this as a fatal condition and close this connection
      markClosed(e);
    }
  } finally {
    if (remoteId.rpcTimeout > 0) {
      cleanupCalls(remoteId.rpcTimeout);
    }
  }
}
{code}
In the code above, the call is first removed from the call map in the try block by: {code} Call call = calls.remove(id); {code} There are two cases in which the call is never notified and the invoking thread waits forever. First, if the status returned by: {code} int state = in.readInt(); // read call status {code} is Status.FATAL.state, the code enters: {code} } else if (state == Status.FATAL.state) { // Close the connection markClosed(new RemoteException(WritableUtils.readString(in), WritableUtils.readString(in))); } {code} Here, the SecureConnection is marked as closed and all rpc calls in the call map of this connection are notified with an exception. However, the current rpc call has already been removed from the call map, so it is not notified. Second, after the call has been removed by: {code} Call call = calls.remove(id); {code} if any exception is thrown before the 'try' block finishes, control passes to the 'catch' and 'finally' blocks, and neither of them notifies the rpc call because it has been removed from the call map. Compared with receiveResponse() in HBaseClient.java, it may be better to get the rpc call from the call map and remove it only at the end of the 'try' block.
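The ordering problem described in HBASE-8656 can be reproduced in isolation. The following is a minimal sketch, not the SecureClient code: CallMapSketch, its Call class, and the failAll() helper are hypothetical stand-ins for the call map, the waiting rpc call, and the "notify every registered call" behaviour that the description attributes to markClosed().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CallMapSketch {

    /** Hypothetical stand-in for an rpc call that a caller thread waits on. */
    public static final class Call {
        private volatile boolean done;
        private volatile Exception error;

        void complete() { done = true; }
        void fail(Exception e) { error = e; done = true; }

        public boolean isDone() { return done; }
        public Exception getError() { return error; }
    }

    private final Map<Integer, Call> calls = new ConcurrentHashMap<>();

    public Call register(int id) {
        Call call = new Call();
        calls.put(id, call);
        return call;
    }

    /** Notifies every call still in the map, as a closed connection would. */
    private void failAll(Exception e) {
        for (Call call : calls.values()) {
            call.fail(e);
        }
        calls.clear();
    }

    /** Problematic ordering: the call leaves the map before the response is handled. */
    public void receiveRemoveFirst(int id, boolean fatal) {
        Call call = calls.remove(id);
        if (fatal) {
            failAll(new Exception("fatal response")); // 'call' is no longer in the map
            return;
        }
        if (call != null) {
            call.complete();
        }
    }

    /** Suggested ordering: look the call up first, remove it only once it is handled. */
    public void receiveRemoveLast(int id, boolean fatal) {
        Call call = calls.get(id);
        if (fatal) {
            failAll(new Exception("fatal response")); // 'call' is still in the map
            return;
        }
        if (call != null) {
            call.complete();
        }
        calls.remove(id);
    }
}
```

With receiveRemoveFirst() and a fatal response, the registered call is never marked done, which corresponds to the invoking thread waiting forever; with receiveRemoveLast() the same call receives the exception.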
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692239#comment-13692239 ] stack commented on HBASE-5083: -- I skimmed. LGTM. Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692245#comment-13692245 ] Jesse Yates edited comment on HBASE-5083 at 6/24/13 6:51 PM: - Looking at the trunk patch, a couple of nits: in hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/BackupMasterStatusTmpl.jamon {quote} +Copyright 2011 The Apache Software Foundation {quote} isn't needed. {quote} + </tr> + <%java> +Arrays.sort(serverNames); +for (ServerName serverName: serverNames) { + </%java> {quote} Spacing looks a little bit off. In HMaster, {quote} +if(master.isActiveMaster()){ + metaLocation = getMetaLocationOrNull(master); + //ServerName metaLocation = master.getCatalogTracker().getMetaLocation(); + servers = master.getServerManager().getOnlineServersList(); + deadServers = master.getServerManager().getDeadServers().copyServerNames(); +} {quote} Spacing is off (everything else looks to be 2 spaces, not 4 (or is that a tab? can't tell just reading the text diff)). {quote} + return (master.getCatalogTracker() == null) ? null : master.getCatalogTracker().getMetaLocation(); {quote} Wish there wasn't a need for the null here and instead a special ServerName that we could use when it's null (increases potential for NullPointerExceptions, makes code a little more brittle, requires more null checks other places, etc.), but just complaining - this is fine and fits in with everything else. Otherwise, it looks fine. +1, if you wouldn't mind fixing the nits on commit. was (Author: jesse_yates): nits: in hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/BackupMasterStatusTmpl.jamon {quote} +Copyright 2011 The Apache Software Foundation {quote} isn't needed. {quote} + </tr> + <%java> +Arrays.sort(serverNames); +for (ServerName serverName: serverNames) { + </%java> {quote} Spacing looks a little bit off. In HMaster, {quote} +if(master.isActiveMaster()){ + metaLocation = getMetaLocationOrNull(master); + //ServerName metaLocation = master.getCatalogTracker().getMetaLocation(); + servers = master.getServerManager().getOnlineServersList(); + deadServers = master.getServerManager().getDeadServers().copyServerNames(); +} {quote} Spacing is off (everything else looks to be 2 spaces, not 4 (or is that a tab? can't tell just reading the text diff)). {quote} + return (master.getCatalogTracker() == null) ? null : master.getCatalogTracker().getMetaLocation(); {quote} Wish there wasn't a need for the null here and instead a special ServerName that we could use when it's null (increases potential for NullPointerExceptions, makes code a little more brittle, requires more null checks other places, etc.), but just complaining - this is fine and fits in with everything else. Otherwise, it looks fine. +1, if you wouldn't mind fixing the nits on commit. Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master.
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692245#comment-13692245 ] Jesse Yates commented on HBASE-5083: nits: in hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/BackupMasterStatusTmpl.jamon {quote} +Copyright 2011 The Apache Software Foundation {quote} isn't needed. {quote} + </tr> + <%java> +Arrays.sort(serverNames); +for (ServerName serverName: serverNames) { + </%java> {quote} Spacing looks a little bit off. In HMaster, {quote} +if(master.isActiveMaster()){ + metaLocation = getMetaLocationOrNull(master); + //ServerName metaLocation = master.getCatalogTracker().getMetaLocation(); + servers = master.getServerManager().getOnlineServersList(); + deadServers = master.getServerManager().getDeadServers().copyServerNames(); +} {quote} Spacing is off (everything else looks to be 2 spaces, not 4 (or is that a tab? can't tell just reading the text diff)). {quote} + return (master.getCatalogTracker() == null) ? null : master.getCatalogTracker().getMetaLocation(); {quote} Wish there wasn't a need for the null here and instead a special ServerName that we could use when it's null (increases potential for NullPointerExceptions, makes code a little more brittle, requires more null checks other places, etc.), but just complaining - this is fine and fits in with everything else. Otherwise, it looks fine. +1, if you wouldn't mind fixing the nits on commit.
Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8532) [Webui] Bootstrap based webui compatibility for IE and also fix some page format issues.
[ https://issues.apache.org/jira/browse/HBASE-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692253#comment-13692253 ] Elliott Clark commented on HBASE-8532: -- +1 Sorry it took so long. I had a harder time getting an old IE than I thought I would. Looks good. Thanks Julian [Webui] Bootstrap based webui compatibility for IE and also fix some page format issues. Key: HBASE-8532 URL: https://issues.apache.org/jira/browse/HBASE-8532 Project: HBase Issue Type: Bug Components: UI Affects Versions: 0.98.0, 0.95.0, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.0, 0.95.2 Attachments: 8532-trunk-0.95-v1.patch, hbase-8532_v0.patch, webui-IE-error-apos;.png, webui-IE-error.png HBASE-7425 brings a bootstrap based webui to hbase. On trunk, Firefox works well, but the layout and style in IE 8/9 look strange due to a compatibility issue. Adding <!DOCTYPE html ...> at the beginning of all jamon html/jsp templates fixes it. It seems HBASE-2110 commented out the DOCTYPE for all .jsp files to make the browser run the pages in Quirks mode (http://en.wikipedia.org/wiki/Quirks_mode) due to a jetty issue at that time? To support webui compatibility across browsers (IE/Firefox/Chrome, etc.), there are some guidelines for choosing whether to render the page in standards mode or quirks mode: http://en.wikipedia.org/wiki/Quirks_mode http://hsivonen.iki.fi/doctype/ According to the document, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> has the most extensive compatibility, even for HTML 5. According to my test, adding this makes the webui work in IE (standards mode), while Firefox does not work well with the styles. It looks like Firefox still needs quirks mode (no DOCTYPE declaration). So just add a conditional DOCTYPE declaration for IE, <!--[if IE]> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <![endif]--> which makes the webui work for both IE and Firefox, and also for Chrome and other browsers.
[jira] [Commented] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692258#comment-13692258 ] Nicolas Liochon commented on HBASE-6295: Committed to trunk and 0.95. I've done a lot of tests, but it's quite easy to break something in this area. So ping me if there is anything suspicious in the next days. Thanks a lot for the reviews, and especially to Jean-Marc for all these tests. Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v15.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch Today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to todo location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
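The per-location variant of the batch loop proposed in HBASE-6295 can be sketched as a small standalone class. This is only an illustration of the pseudocode, not the committed client code: PerLocationBatcher, the String-typed operations and locations, the locator function, and the recorded "sent" list are all hypothetical, sending is synchronous here, and the threshold check uses >= where the pseudocode writes a plain size comparison. Error management and retries, which the description calls the hard part, are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PerLocationBatcher {

    private final int maxLocationSize;
    // Maps an operation to the location (region server) that should receive it.
    private final Function<String, String> locator;
    // One pending list per location, in first-seen order.
    private final Map<String, List<String>> pending = new LinkedHashMap<>();
    // Stands in for the network: records each batch as "location=[ops]".
    private final List<String> sent = new ArrayList<>();

    public PerLocationBatcher(int maxLocationSize, Function<String, String> locator) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
    }

    /** Adds one operation and sends that location's list as soon as it is full. */
    public void add(String op) {
        String location = locator.apply(op);
        List<String> todo = pending.computeIfAbsent(location, k -> new ArrayList<>());
        todo.add(op);
        if (todo.size() >= maxLocationSize) {
            send(location, todo);
            pending.remove(location); // don't wait, the caller keeps adding
        }
    }

    /** Sends whatever is left for every location ("send remaining"). */
    public void flush() {
        for (Map.Entry<String, List<String>> e : pending.entrySet()) {
            send(e.getKey(), e.getValue());
        }
        pending.clear();
    }

    private void send(String location, List<String> ops) {
        sent.add(location + "=" + ops);
    }

    public List<String> sentBatches() {
        return sent;
    }
}
```

A caller would add operations one by one and call flush() at the end; a slow location then only delays its own list rather than the whole batch, which is the point of splitting per location up front.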
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692259#comment-13692259 ] Jesse Yates commented on HBASE-5083: skimmed 0.94 patch, LGTM. Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
[ https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-6295: --- Resolution: Fixed Fix Version/s: 0.95.2 Release Note: The puts are now streamed, i.e. sent asynchronously to the region servers if autoflush is set to false. If a region server is slow or does not respond, its puts are kept in the write buffer while the others are sent to their respective region servers, until the write buffer is full. This feature keeps the semantics of the interface already existing in 0.94 when using autoflush. Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Possible performance improvement in client batch operations: presplit and send in background Key: HBASE-6295 URL: https://issues.apache.org/jira/browse/HBASE-6295 Project: HBase Issue Type: Improvement Components: Client, Performance Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Labels: noob Fix For: 0.98.0, 0.95.2 Attachments: 6295.v11.patch, 6295.v12.patch, 6295.v14.patch, 6295.v15.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch Today the batch algo is:
{noformat}
for Operation o: List<Op> {
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
We could:
- create immediately the final object instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send it when there is enough data for a single location
It would be:
{noformat}
for Operation o: List<Op> {
  get location
  add o to todo location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
}
send remaining
wait
{noformat}
It's not trivial to write if you add error management: the retried list must be shared with the operations added in the todolist. But it's doable. It's interesting mainly for 'big' writes.
[jira] [Updated] (HBASE-8532) [Webui] Bootstrap based webui compatibility for IE and also fix some page format issues.
[ https://issues.apache.org/jira/browse/HBASE-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8532: - Resolution: Fixed Fix Version/s: (was: 0.95.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed trunk and 0.95. Thanks Julian (and E for review). [Webui] Bootstrap based webui compatibility for IE and also fix some page format issues. Key: HBASE-8532 URL: https://issues.apache.org/jira/browse/HBASE-8532 Project: HBase Issue Type: Bug Components: UI Affects Versions: 0.98.0, 0.95.0, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: 8532-trunk-0.95-v1.patch, hbase-8532_v0.patch, webui-IE-error-apos;.png, webui-IE-error.png HBASE-7425 brings a bootstrap based webui to hbase. On trunk, Firefox works well, but the layout and style in IE 8/9 look strange due to a compatibility issue. Adding <!DOCTYPE html ...> at the beginning of all jamon html/jsp templates fixes it. It seems HBASE-2110 commented out the DOCTYPE for all .jsp files to make the browser run the pages in Quirks mode (http://en.wikipedia.org/wiki/Quirks_mode) due to a jetty issue at that time? To support webui compatibility across browsers (IE/Firefox/Chrome, etc.), there are some guidelines for choosing whether to render the page in standards mode or quirks mode: http://en.wikipedia.org/wiki/Quirks_mode http://hsivonen.iki.fi/doctype/ According to the document, <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> has the most extensive compatibility, even for HTML 5. According to my test, adding this makes the webui work in IE (standards mode), while Firefox does not work well with the styles. It looks like Firefox still needs quirks mode (no DOCTYPE declaration). So just add a conditional DOCTYPE declaration for IE, <!--[if IE]> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <![endif]--> which makes the webui work for both IE and Firefox, and also for Chrome and other browsers.
[jira] [Updated] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8792: - Resolution: Fixed Fix Version/s: 0.95.2 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the patch Gustavo. I applied to 0.95 and trunk. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8782) Thrift2 can not parse values when using framed transport
[ https://issues.apache.org/jira/browse/HBASE-8782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692307#comment-13692307 ] Lars George commented on HBASE-8782: bq. As for solution, my first solution to avoid copying the array was to modify HtableInterface to accept ByteBuffer as input and separately take care of other cases in checkAndPut() and checkAndDelete(). However, I can see that means adding to HTableInterface! I do not think you have to do that. getTable() is a function of HTablePool, not the HTableInterface. What you found there seems to exist for exactly the right reasons though, nice find indeed! Yes, please update the patch so that we can look at it in context. Good on you, keep hacking away on it. Thrift2 can not parse values when using framed transport Key: HBASE-8782 URL: https://issues.apache.org/jira/browse/HBASE-8782 Project: HBase Issue Type: Bug Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Attachments: HBASE_8782.patch ThriftHBaseServiceHandler.java use .array() on table names , and values (family , qualifier in checkandDelete , etc) which resulted in incorrect values with framed transport. Replacing .array() with getBytes() fixed this problem. I've attached the patch EDIT: updated the patch to cover checkAndPut(), checkAndDelete() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8792) Organize EventType Java Docs
[ https://issues.apache.org/jira/browse/HBASE-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692324#comment-13692324 ] Gustavo Anatoly commented on HBASE-8792: Hi, Stack. Thankful for your patience. Organize EventType Java Docs Key: HBASE-8792 URL: https://issues.apache.org/jira/browse/HBASE-8792 Project: HBase Issue Type: Task Reporter: Gustavo Anatoly Assignee: Gustavo Anatoly Priority: Trivial Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8792.patch Organize description for declared enums. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8734) HBase side support for BIGTOP-1007 (Introduce a modules system for HBase coprocessor applications)
[ https://issues.apache.org/jira/browse/HBASE-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692337#comment-13692337 ] Andrew Purtell commented on HBASE-8734: --- bq. What about the execution order of the CPs. There are a couple of ways to do this. See subsequent discussion on BIGTOP-1007. Was thinking of modifying the system coprocessor specifications in hbase site files to allow for optional path to jar instead of requiring dropping jars on the master or RS classpath. Can also modify for specifying priority. In other words, make system coprocessor specifications look and function like table coprocessor specifications. HBase side support for BIGTOP-1007 (Introduce a modules system for HBase coprocessor applications) -- Key: HBASE-8734 URL: https://issues.apache.org/jira/browse/HBASE-8734 Project: HBase Issue Type: Improvement Components: Coprocessors Reporter: Andrew Purtell Assignee: Andrew Purtell BIGTOP-1007 proposes a modules system convention (/etc/hbase/modules.d), a common pattern used for example by Apache httpd, for easily installation and removal of HBase coprocessor applications. Will propose any HBase side changes needed here, or close soon if none are required. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692334#comment-13692334 ] Lars Hofhansl commented on HBASE-5083: -- I fixed up the whitespace issues (there were some more). Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, HBASE-5083_trunk.patch, master.png, Trunk_Backup_Master.png, Trunk_Master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira