Sam,

There is no formula for deciding how much memory to give the datanode and tasktracker; the formulas you will find are for deciding how many task slots to configure on a machine. In my prior experience, we gave 512 MB each to the datanode and the tasktracker.
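For reference, these are the knobs I mean. This is only a minimal sketch using the stock Hadoop 1.x names; the values are examples, not a recommendation for your cluster:

    # hadoop-env.sh: heap for each Hadoop daemon started by the
    # bin/hadoop scripts (namenode, datanode, tasktracker, ...).
    # The default is 1000 MB; per-daemon overrides such as
    # HADOOP_DATANODE_OPTS also exist.
    export HADOOP_HEAPSIZE=512

    <!-- mapred-site.xml: slots and per-task heap on each tasktracker -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>2</value>  <!-- map slots on this node -->
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>  <!-- reduce slots on this node -->
    </property>
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx200m</value>  <!-- heap for each task JVM -->
    </property>

With example values like these, a fully busy worker node would need roughly 512 MB (datanode) + 512 MB (tasktracker) + 3 x 200 MB (task JVMs), about 1.6 GB on top of the OS and file cache. So the sizing exercise is really picking slot counts and heaps that fit the RAM you have, not computing a daemon minimum.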
On Mon, May 13, 2013 at 11:18 AM, sam liu <[email protected]> wrote:

> For node3, the memory is:
>              total       used       free     shared    buffers     cached
> Mem:          3834       3666        167          0        187       1136
> -/+ buffers/cache:       2342       1491
> Swap:         8196          0       8196
>
> For a 3-node cluster like mine, what is the required minimum free/available
> memory for the datanode process and tasktracker process, without running
> any map/reduce task?
> Any formula to determine it?
>
>
> 2013/5/13 Rishi Yadav <[email protected]>
>
>> Can you tell the specs of node3? Even on a test/demo cluster, anything
>> below 4 GB of RAM makes the node almost inaccessible, in my experience.
>>
>>
>> On Sun, May 12, 2013 at 8:25 PM, sam liu <[email protected]> wrote:
>>
>>> Got some exceptions on node3:
>>>
>>> 1. datanode log:
>>> 2013-04-17 11:13:44,719 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> writeBlock blk_2478755809192724446_1477 received exception
>>> java.net.SocketTimeoutException: 63000 millis timeout while waiting for
>>> channel to be ready for read. ch :
>>> java.nio.channels.SocketChannel[connected local=/9.50.102.80:58371
>>> remote=/9.50.102.79:50010]
>>> 2013-04-17 11:13:44,721 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(9.50.102.80:50010,
>>> storageID=DS-2038715921-9.50.102.80-50010-1366091297051, infoPort=50075,
>>> ipcPort=50020):DataXceiver
>>> java.net.SocketTimeoutException: 63000 millis timeout while waiting for
>>> channel to be ready for read. ch :
>>> java.nio.channels.SocketChannel[connected local=/9.50.102.80:58371
>>> remote=/9.50.102.79:50010]
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:116)
>>>     at java.io.DataInputStream.readShort(DataInputStream.java:306)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:359)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:112)
>>>     at java.lang.Thread.run(Thread.java:738)
>>> 2013-04-17 11:13:44,818 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> Receiving block blk_8413378381769505032_1477 src: /9.50.102.81:35279
>>> dest: /9.50.102.80:50010
>>>
>>> 2. tasktracker log:
>>> 2013-04-23 11:48:26,783 INFO org.apache.hadoop.mapred.UserLogCleaner:
>>> Deleting user log path job_201304152248_0011
>>> 2013-04-30 14:48:15,506 ERROR org.apache.hadoop.mapred.TaskTracker:
>>> Caught exception: java.io.IOException: Call to node1/9.50.102.81:9001
>>> failed on local exception: java.io.IOException: Connection reset by peer
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1112)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>>>     at org.apache.hadoop.mapred.$Proxy2.heartbeat(Unknown Source)
>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:2008)
>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1802)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2654)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3909)
>>> Caused by: java.io.IOException: Connection reset by peer
>>>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:33)
>>>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:210)
>>>     at sun.nio.ch.IOUtil.read(IOUtil.java:183)
>>>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:257)
>>>     at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>>>     at java.io.FilterInputStream.read(FilterInputStream.java:127)
>>>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:361)
>>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:229)
>>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:248)
>>>     at java.io.DataInputStream.readInt(DataInputStream.java:381)
>>>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
>>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
>>>
>>> 2013-04-30 14:48:15,517 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Resending 'status' to 'node1' with reponseId '-12904
>>> 2013-04-30 14:48:16,404 INFO org.apache.hadoop.mapred.TaskTracker:
>>> SHUTDOWN_MSG:
>>>
>>>
>>>
>>> 2013/5/13 Rishi Yadav <[email protected]>
>>>
>>>> Do you get any error when trying to connect to the cluster, something
>>>> like 'tried n times' or 'replicated 0 times'?
>>>>
>>>>
>>>> On Sun, May 12, 2013 at 7:28 PM, sam liu <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I set up a cluster with 3 nodes, and after that I did not submit any
>>>>> job on it. But after a few days I found the cluster is unhealthy:
>>>>> - No result returned after issuing the command 'hadoop dfs -ls /' or
>>>>> 'hadoop dfsadmin -report' for a while
>>>>> - The page at 'http://namenode:50070' could not be opened as
>>>>> expected...
>>>>> - ...
>>>>>
>>>>> I did not find any useful info in the logs, but found the available
>>>>> memory on the cluster nodes was very low at that time:
>>>>> - node1 (NN, JT, DN, TT): 158 MB of memory available
>>>>> - node2 (DN, TT): 75 MB of memory available
>>>>> - node3 (DN, TT): 174 MB of memory available
>>>>>
>>>>> I guess the issue with my cluster is caused by a lack of memory, and
>>>>> my questions are:
>>>>> - Without running jobs, what are the minimum memory requirements for
>>>>> the datanode and namenode?
>>>>> - How to determine the minimum memory for the datanode and namenode?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Sam Liu


--
Nitin Pawar
