Re: High Full GC count for Region server
Hi, Can anyone please reply to the above query?

On Tue, Oct 29, 2013 at 10:48 AM, Vimal Jain vkj...@gmail.com wrote:

Hi, Here is my analysis of this problem. Please correct me if I am wrong somewhere. I have assigned 2 GB to the region server process. I think that is sufficient to handle around 9 GB of data. I have not changed many of the parameters, in particular the memstore flush size, which is 128 MB by default in 0.94.7. Also, as per my understanding, each column family has one memstore associated with it, so my memstores are taking 128*3 = 384 MB (I have 3 column families). So I think I should reduce the memstore size to something like 32/64 MB so that data is flushed to disk at a higher frequency than it is currently. This will save some memory. Is there any parameter other than memstore size which affects memory utilization?

Also, I am getting the below exceptions in the data node log and region server log every day. Are they due to long GC pauses?

Data node logs:

hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):Got exception while serving blk_-560908881317618221_58058 to /192.168.20.30:
hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:39413]
hadoop-hadoop-datanode-woody.log:2013-10-29 00:12:13,127 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
hadoop-hadoop-datanode-woody.log:java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:39413]

Region server logs:

hbase-hadoop-regionserver-woody.log:2013-10-29 01:01:16,475 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {processingtimems:15827,call:multi(org.apache.hadoop.hbase.client.MultiAction@2918e464), rpc version=1, client version=29, methodsFingerPrint=-1368823753,client:192.168.20.31:50619,starttimems:1382988660645,queuetimems:0,class:HRegionServer,responsesize:0,method:multi}
hbase-hadoop-regionserver-woody.log:2013-10-29 06:01:27,459 WARN org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): {processingtimems:14745,client:192.168.20.31:50908,timeRange:[0,9223372036854775807],starttimems:1383006672707,responsesize:55,class:HRegionServer,table:event_data,cacheBlocks:true,families:{oinfo:[clubStatus]},row:1752869,queuetimems:1,method:get,totalColumns:1,maxVersions:1}

On Mon, Oct 28, 2013 at 11:55 PM, Asaf Mesika asaf.mes...@gmail.com wrote:

Check through the HDFS UI that your cluster hasn't reached maximum disk capacity.

On Thursday, October 24, 2013, Vimal Jain wrote:

Hi Ted/Jean, Can you please help here?

On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain vkj...@gmail.com wrote:

Hi Ted, Yes, I checked the namenode and datanode logs and found the below exceptions in both logs:

Name node:

java.io.IOException: File /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e could only be replicated to 0 nodes, instead of 1
java.io.IOException: Got blockReceived message from unregistered or dead node blk_-2949905629769882833_52274

Data node:

48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010 remote=/192.168.20.30:36188]
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 39309 bytes

On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu yuzhih...@gmail.com wrote:

bq. java.io.IOException: File /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0 could only be replicated to 0 nodes, instead of 1

Have you checked the Namenode / Datanode logs? Looks like HDFS was not stable.

On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain vkj...@gmail.com wrote:

Hi Jean, Thanks for your reply. I have 8 GB of memory in total, distributed as follows:

Region server - 2 GB
Master, Namenode, Datanode, Secondary Namenode, Zookeeper - 1 GB each
OS - 1 GB

Please let me know if you need more information.

On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Vimal, What are your
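Vimal's proposed fix above (flushing memstores at 32-64 MB instead of 128 MB) would be expressed in hbase-site.xml roughly as follows. This is a sketch assuming 0.94.x property names; the flush-size value is in bytes, and the global memstore limit (a fraction of the region server heap) also caps total memstore usage:

```xml
<!-- Sketch for hbase-site.xml (0.94.x property names assumed). -->
<!-- Flush each memstore at 64 MB instead of the 128 MB default. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>67108864</value>
</property>
<!-- Total memstore usage is additionally capped at this fraction of the RS heap. -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.4</value>
</property>
```

With a 2 GB heap, a 0.4 upper limit corresponds to roughly 800 MB of memstore across all regions, so the three memstores described above fit well within it; the per-region flush size mainly controls how often data reaches disk.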
Re: Online Schema Changes
Is it safe to turn it on for a single-admin cluster? I guess it's off to prevent concurrent changes...

On Tue, Oct 29, 2013 at 5:29 PM, Ted Yu yuzhih...@gmail.com wrote:

In 0.96.0 and 0.94 the feature is disabled by default:

<property>
  <name>hbase.online.schema.update.enable</name>
  <value>false</value>
</property>

On Tue, Oct 29, 2013 at 9:20 AM, Chetan chetan.ka...@gmail.com wrote:

Oops... I meant to say, I saw some post about this feature being disabled by default in 0.94.

On Tue, Oct 29, 2013 at 12:20 PM, Chetan chetan.ka...@gmail.com wrote:

We are currently doing a prototype using 0.94.x. Is this feature supported in 0.94.x? I saw some post about this feature being disabled by default in 0.96.

On Tue, Oct 29, 2013 at 12:15 PM, Ted Yu yuzhih...@gmail.com wrote:

Are you using HBase 0.94.x or 0.96.0? In 0.96.0 online schema change is supported.

On Tue, Oct 29, 2013 at 9:09 AM, Chetan chetan.ka...@gmail.com wrote:

Is it possible to make schema changes to an HBase table without having to explicitly disable the table? (I want to add a new column family to an existing table.)

Thanks,
Chetan
http://about.me/chetan.kadam

--
Adrien Mogenet
http://www.borntosegfault.com
Re: High Full GC count for Region server
The responseTooSlow message is triggered whenever a batch of operations takes more than a configured amount of time. In your case, a multi call with a processing time of 15827 ms will trigger it, so no worry about this one. However, your SocketTimeoutException might be due to long GC pauses. I guess it might also be due to network failures or RS contention (too many requests on this RS, no more IPC slots...).

On Thu, Oct 31, 2013 at 9:52 AM, Vimal Jain vkj...@gmail.com wrote:

Hi, Can anyone please reply to the above query?

On Tue, Oct 29, 2013 at 10:48 AM, Vimal Jain vkj...@gmail.com wrote:

[quoted analysis and data node / region server logs snipped - identical to the message earlier in this thread]
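To confirm whether full-GC pauses actually line up with the SocketTimeoutExceptions above, GC logging can be turned on for the region server JVM. A sketch for conf/hbase-env.sh; the flags are standard HotSpot (Java 6/7 era) options, and the log path is illustrative:

```shell
# Sketch for conf/hbase-env.sh -- standard HotSpot GC-logging flags.
# The log path is illustrative; pick one that exists on your nodes.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:/var/log/hbase/regionserver-gc.log"
```

Pauses in this log that exceed the DataNode write timeout or the ZooKeeper session timeout are the ones that would produce the errors quoted in this thread.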
Re: Online Schema Changes
See this comment: https://issues.apache.org/jira/browse/HBASE-9792?focusedCommentId=13803252&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13803252 which refers to HBASE-9818.

Cheers

On Thu, Oct 31, 2013 at 3:14 AM, Adrien Mogenet adrien.moge...@gmail.com wrote:

Is it safe to turn it on for a single-admin cluster? I guess it's off to prevent concurrent changes...

[earlier quoted messages snipped - identical to the thread above]

--
Adrien Mogenet
http://www.borntosegfault.com
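For Chetan's original question, once hbase.online.schema.update.enable is set to true on the master, a column family can be added from the HBase shell without disabling the table. A sketch with illustrative table and family names; with the property left at its 0.94/0.96.0 default of false, the disable/enable steps remain necessary:

```
# HBase shell sketch; 'mytable' and 'newcf' are illustrative names.

# With hbase.online.schema.update.enable=true:
hbase> alter 'mytable', NAME => 'newcf'

# With the feature disabled (the 0.94 / 0.96.0 default):
hbase> disable 'mytable'
hbase> alter 'mytable', NAME => 'newcf'
hbase> enable 'mytable'
```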
Re: HBase ShutdownHook problem
Is it just me, or does your classpath contain both Hadoop 1.2.1 and Hadoop 2.0.5-alpha jars? You might want to get that cleared up first.

J-D

On Wed, Oct 30, 2013 at 12:48 AM, Salih Kardan karda...@gmail.com wrote:

Sorry for the late reply. Here is the stack trace:

13/10/30 09:46:42 INFO util.ProcessTree: setsid exited with exit code 0
13/10/30 09:46:42 INFO util.VersionInfo: HBase 0.94.10
13/10/30 09:46:42 INFO util.VersionInfo: Subversion https://svn.apache.org/repos/asf/hbase/tags/0.94.10mvn2 -r 1506655
13/10/30 09:46:42 INFO util.VersionInfo: Compiled by lhofhans on Wed Jul 24 11:39:30 PDT 2013
13/10/30 09:46:42 INFO mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1e5ec04b
13/10/30 09:46:42 WARN mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
13/10/30 09:46:42 INFO mapred.IndexCache: IndexCache created with max memory = 10485760
13/10/30 09:46:42 INFO impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
13/10/30 09:46:42 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
13/10/30 09:46:42 INFO http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
13/10/30 09:46:42 INFO http.HttpServer: Jetty bound to port 50060
13/10/30 09:46:42 INFO mortbay.log: jetty-6.1.26
13/10/30 09:46:42 INFO mortbay.log: Extract jar:file:/home/skardan/.m2/repository/org/apache/hadoop/hadoop-core/1.2.1/hadoop-core-1.2.1.jar!/webapps/task to /tmp/Jetty_0_0_0_0_50060_task.2vcltf/webapp
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server environment:host.name=salih
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server environment:java.version=1.7.0_25
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server environment:java.vendor=Oracle Corporation
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
13/10/30 09:46:42 INFO server.ZooKeeperServer: Server
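J-D's suspicion can be checked mechanically: split the classpath on ':' and look for more than one Hadoop version. A sketch; the classpath string below is illustrative, and on a real node you would substitute the output of `hbase classpath`:

```shell
# Illustrative classpath mixing Hadoop 1.2.1 and 2.0.5-alpha, as in this thread.
# On a live node, use: CP=$(hbase classpath)
CP="/opt/hadoop/hadoop-core-1.2.1.jar:/home/skardan/.m2/repository/org/apache/hadoop/hadoop-common/2.0.5-alpha/hadoop-common-2.0.5-alpha.jar:/opt/hbase/lib/hbase-0.94.10.jar"
echo "$CP" | tr ':' '\n' | sed 's|.*/||' | grep -E '^hadoop-(core|common|hdfs)'
# Two different Hadoop versions in the output means a mixed classpath.
```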
Re: 0.96 and Hadoop 2.1.1
Thanks for tracking this down Cosmin. I left a note in https://issues.apache.org/jira/browse/HADOOP-9944 (took me a while to edit my comment to remove as many expletives as possible).

On Thu, Oct 31, 2013 at 2:16 AM, Cosmin Lehene cleh...@adobe.com wrote:

Benoit, this (Unknown out of band call #-2147483647) hints that something "else" is being parsed. CallId has been changed from uint32 to sint32:

f9cc07986d797d4d0731d8774e7a1f4bcf3a1738
Merge -c 1523885 from trunk to branch-2 to fix HADOOP-9944. Fix RpcRequestHeaderProto.callId to be sint32 rather than uint32 since ipc.Client.CONNECTION_CONTEXT_CALL_ID is signed (i.e. -3). Contributed by Arun C. Murthy.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.1-beta@1523887 13f79535-47bb-0310-9956-ffa450edef68

Cosmin

On 29/10/13 01:19, tsuna tsuna...@gmail.com wrote:

Hi there, I have a cluster running vanilla Hadoop 2.1.1 and am trying to deploy HBase 0.96 on top. At first the master was crapping out with this when I was trying to start it:

2013-10-28 16:11:32,778 FATAL [master:r12s1:9102] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException): Unknown out of band call #-2147483647
    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at $Proxy12.setSafeMode(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:188)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at $Proxy12.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:561)
    at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2124)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:994)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:978)
    at org.apache.hadoop.hbase.util.FSUtils.isInSafeMode(FSUtils.java:433)
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:852)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:435)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:146)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:786)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:603)
    at java.lang.Thread.run(Thread.java:636)
2013-10-28 16:11:32,781 INFO [master:r12s1:9102] master.HMaster: Aborting

Earlier posts on the ML suggest copying the hadoop-hdfs and hadoop-common jars from the Hadoop distro, so I did that and replaced the hadoop-hdfs-2.1.0-beta.jar and hadoop-common-2.1.0-beta.jar that came with 0.96 under the lib/ directory with the corresponding 2.1.1 jars. The master is now failing to start with this:

2013-10-28 16:18:22,293 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2773)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:184)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2787)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    at org.apache.hadoop.security.UserGroupInformation.getOSLoginModuleName(UserGroupInformation.java:302)
    at
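The jar swap tsuna describes can be sketched as below. All paths are illustrative (the sketch runs against a throwaway mock layout so it is safe to try), and the exact jar names depend on the Hadoop build. The subsequent ClassNotFoundException for org.apache.hadoop.util.PlatformName suggests that swapping only hadoop-hdfs and hadoop-common leaves other hadoop-* jars at the old version, so the whole hadoop-* set likely needs to stay version-consistent:

```shell
# Sketch of the jar swap, run against a mock layout so it is safe to try.
# On a real install, HBASE_LIB would be <hbase>/lib and HADOOP_HOME your Hadoop root.
tmp=$(mktemp -d)
HBASE_LIB="$tmp/hbase/lib"
HADOOP_HOME="$tmp/hadoop"
mkdir -p "$HBASE_LIB" "$HADOOP_HOME/share/hadoop/hdfs" "$HADOOP_HOME/share/hadoop/common"
touch "$HBASE_LIB/hadoop-hdfs-2.1.0-beta.jar" "$HBASE_LIB/hadoop-common-2.1.0-beta.jar"
touch "$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.1.1-beta.jar"
touch "$HADOOP_HOME/share/hadoop/common/hadoop-common-2.1.1-beta.jar"

# Drop the bundled 2.1.0-beta jars and copy in the cluster's own jars.
rm "$HBASE_LIB"/hadoop-hdfs-2.1.0-beta.jar "$HBASE_LIB"/hadoop-common-2.1.0-beta.jar
cp "$HADOOP_HOME"/share/hadoop/hdfs/hadoop-hdfs-*.jar \
   "$HADOOP_HOME"/share/hadoop/common/hadoop-common-*.jar "$HBASE_LIB"/
ls "$HBASE_LIB"
```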
Column qualifiers with hierarchy and filters
Hi, I'm trying to determine the best way to serialize a sequence of integers/strings that represent a hierarchy for a column qualifier, in a way that stays compatible with ColumnPrefixFilter and BinaryComparator. However, due to the lexicographic sorting of byte arrays, it's awkward to serialize the sequence of values so that it works. What are the typical solutions to this? Do people just zero-pad integers to make sure they sort correctly? Or do I have to implement my own QualifierFilter, which seems expensive since I'd be deserializing every byte array just to compare.

Thanks,
Nasron
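One standard answer is to make each level of the hierarchy order-preserving at the byte level: fixed-width big-endian encoding for integers (the binary analogue of zero-padding) and terminated UTF-8 for strings. A sketch in Python; it assumes non-negative integers and strings without embedded NUL bytes, and the helper name is illustrative:

```python
import struct

def encode_qualifier(parts):
    """Encode a hierarchy of non-negative ints and short strings as a qualifier
    whose lexicographic (byte-wise) order matches the logical order.

    Non-negative ints: 4-byte big-endian, so 2 sorts before 10.
    Strings: UTF-8 plus a 0x00 terminator, so 'ab' sorts before 'abc'
    (breaks if a string itself contains a NUL byte)."""
    out = bytearray()
    for p in parts:
        if isinstance(p, int):
            out += struct.pack(">I", p)  # fixed width, big-endian
        else:
            out += p.encode("utf-8") + b"\x00"
    return bytes(out)

# Byte-wise order now matches numeric order within a level:
q1 = encode_qualifier(["metrics", 2, "cpu"])
q2 = encode_qualifier(["metrics", 10, "cpu"])
assert q1 < q2  # would fail with decimal strings: "10" < "2" lexicographically

# A ColumnPrefixFilter on a parent level can use the encoded prefix directly:
prefix = encode_qualifier(["metrics", 2])
assert q1.startswith(prefix)
```

Because every level is self-delimiting and order-preserving, a ColumnPrefixFilter over the encoded parent levels selects exactly the subtree, and BinaryComparator comparisons agree with the logical order, with no custom QualifierFilter and no deserialization on the server side. Signed integers would additionally need their sign bit flipped before packing so negatives sort below positives.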