Hi Jean-Marc,

Thanks for your reply. I have a one-node cluster (pseudo-distributed mode), so 192.168.20.30 is the only server, and it hosts all 6 processes (NameNode, DataNode, SecondaryNameNode, HMaster, HRegionServer and ZooKeeper).

At the time of this problem, I had given the following heap sizes to these processes (I have 4 GB of memory in total on this box):

NameNode, DataNode, SecondaryNameNode - 1 GB each (default)
HMaster, ZooKeeper - 512 MB each
HRegionServer - 1.5 GB

So the total memory allocation is 5.5 GB, which exceeds physical memory. (I am planning to increase the RAM on this system to 8 GB and change the allocation to 2 GB for the HRegionServer and 1 GB for each of the other 5 processes.) Could that be the problem? I am asking about memory because I see the following in the HMaster log:

*org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad*
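Just to make sure I get the configuration right, here is a minimal sketch of how I understand that planned split would be expressed, assuming the stock conf/hadoop-env.sh and conf/hbase-env.sh from these versions and that ZooKeeper is the quorum peer managed by HBase; the -Xmx values simply mirror the plan above, and the GC log path is only an example:

    # conf/hadoop-env.sh (Hadoop 1.1.2): default heap, in MB, for the HDFS daemons
    export HADOOP_HEAPSIZE=1024

    # conf/hbase-env.sh (HBase 0.94.7): 1 GB default heap, 2 GB for the region server
    # (a later -Xmx in the per-daemon OPTS overrides HBASE_HEAPSIZE)
    export HBASE_HEAPSIZE=1024
    export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx2048m"

    # optional: GC logging to confirm the long pauses the Sleeper warning suggests
    # (hbase-env.sh ships a similar commented-out line; the log path here is just an example)
    export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/gc-hbase.log"

Please correct me if those are not the right knobs for giving each daemon its own heap.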
On Tue, Aug 13, 2013 at 6:42 PM, Jean-Marc Spaggiari <[email protected]> wrote:

> Hi Vimal,
>
> What was your cluster doing at that time? Was it very busy? Looks like one
> server (192.168.20.30:50010) went so busy that it failed to report active
> and closed.
>
> JM
>
> 2013/8/13 Vimal Jain <[email protected]>
>
> > Hi,
> > I am using Hadoop and Hbase in pseudo distributed mode.
> > I am using Hadoop version - 1.1.2 and Hbase version - 0.94.7
> >
> > Recently i found some exception in hadoop and hbase logs.
> > I am not sure what has caused this.
> > Requesting you to please help here.
> >
> > *Exception in Master log :*
> > 2013-07-31 15:51:04,694 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 15:51:04,798 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table
> > org.apache.hadoop.hbase.client.ScannerTimeoutException: 82253ms passed since the last invocation, timeout is currently set to 60000
> >         at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:283)
> >         at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:727)
> >         at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:184)
> >         at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:169)
> >         at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123)
> >         at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134)
> >         at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92)
> >         at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> >         at java.lang.Thread.run(Thread.java:662)
> > Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -8839286818925700393
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2544)
> >         at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> >
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >         at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
> >         at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:143)
> >         at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:42)
> >         at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:164)
> >         at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:274)
> >         ... 8 more
> > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
> > 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at ''
> > 2013-07-31 15:54:42,531 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
> > 2013-07-31 15:54:42,532 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 5 catalog row(s) and gc'd 0 unreferenced parent region(s)
> > 2013-07-31 15:54:42,751 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=1 regions=5 average=5.0 mostloaded=5 leastloaded=5
> >
> > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 25771ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30091ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28613ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 27457ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,362 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34587ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,367 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78600ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-07-31 16:43:23,369 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78562ms instead of 60000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> >
> > *Exception in Region log :*
> > java.io.IOException: Reflection
> >         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:304)
> >         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1375)
> >         at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1319)
> >         at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1480)
> >         at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1271)
> >         at java.lang.Thread.run(Thread.java:662)
> > Caused by: java.lang.reflect.InvocationTargetException
> >         at sun.reflect.GeneratedMethodAccessor156.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:302)
> >         ... 5 more
> > Caused by: java.io.IOException: DFSOutputStream is closed
> >         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3669)
> >         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> >         at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:995)
> >         ... 9 more
> > 2013-07-31 15:50:37,761 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog
> >
> > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> > 2013-07-31 19:23:38,289 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/hbase.xyz.com,60020,1370497170634/hbase.xyz.com%2C60020%2C1370497170634.1375265949987 retrying...
> >
> > *Exception in data node :-*
> > 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> >         at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> >         at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> >
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1107)
> >         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
> >         at sun.proxy.$Proxy5.blockReceived(Unknown Source)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527)
> >         at java.lang.Thread.run(Thread.java:662)
> >
> > *Exception in name node :-*
> > 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010
> > 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.lang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> > java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188)
> >         at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069)
> >         at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> >
> > --
> > Thanks and Regards,
> > Vimal Jain
>

--
Thanks and Regards,
Vimal Jain
