Shashank,

> Hi,
>
> Setup Info:
> I have a 2-node Hadoop (0.20.2) cluster on Linux boxes.
> HW info: 16 CPUs (hyperthreaded)
> RAM: 32 GB
>
> I am trying to configure capacity scheduling. I want to use the memory
> management provided by the capacity scheduler, but I am facing a few
> issues. I have added hadoop-0.20.2-capacity-scheduler.jar in lib, and
> have also set 'mapred.jobtracker.taskScheduler' in hadoop-site.xml.
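That part sounds right. For reference, enabling the scheduler amounts to
roughly the following in hadoop-site.xml. This is a minimal sketch only:
'default' is the stock queue name, and per-queue capacities then go in
capacity-scheduler.xml.

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default</value>
  <description>Comma-separated list of queues served by the
  JobTracker.</description>
</property>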
First things first - the memory management implementation in the capacity
scheduler has seen significant improvements in Hadoop 0.21. Specifically, the
implementation in Hadoop 0.20 could cause a high degree of cluster
under-utilization; this was fixed by MAPREDUCE-516 and subsequent JIRAs in
Hadoop 0.21.

> I have added the below in the capacity-scheduler.xml file, but I get an
> error:
>
> <property>
>   <name>mapred.tasktracker.vmem.reserved</name>
>   <value>26624m</value>
>   <description>A number, in bytes, that represents an offset. The total
>   VMEM on the machine, minus this offset, is the VMEM node-limit for all
>   tasks, and their descendants, spawned by the TT.
>   </description>
> </property>
> <property>
>   <name>mapred.task.default.maxvmem</name>
>   <value>512k</value>
>   <description>A number, in bytes, that represents the default VMEM
>   task-limit associated with a task. Unless overridden by a job's
>   setting, this number defines the VMEM task-limit.
>   </description>
> </property>
> <property>
>   <name>mapred.task.limit.maxvmem</name>
>   <value>4096m</value>
>   <description>A number, in bytes, that represents the upper VMEM
>   task-limit associated with a task. Users, when specifying a VMEM
>   task-limit for their tasks, should not specify a limit which exceeds
>   this amount.
>   </description>
> </property>
> <property>
>   <name>mapred.tasktracker.pmem.reserved</name>
>   <value>26624m</value>
>   <description>Physical Memory
>   </description>
> </property>

IIRC, these parameters were removed and new ones were introduced in their
place. Trunk's documentation has since been updated with the exact list of
these parameters, their descriptions and usage - though I fear the parameter
names may have changed between Hadoop 0.20 and trunk. Your best bet is to try
the parameters listed at http://bit.ly/97SDz2.
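To make that concrete, the newer-style configuration looks roughly like the
below. Treat this as a sketch only: the parameter names follow the trunk
documentation, and the values are illustrative guesses for a 32 GB box, not
recommendations - please verify both against the docs linked above.

<!-- Cluster-wide slot sizes (illustrative values, not recommendations) -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>2048</value>
  <description>Virtual memory, in MB, of a single map slot.</description>
</property>
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>2048</value>
  <description>Virtual memory, in MB, of a single reduce slot.</description>
</property>
<property>
  <name>mapred.cluster.max.map.memory.mb</name>
  <value>4096</value>
  <description>Upper limit, in MB, that a job may request per map
  task.</description>
</property>
<property>
  <name>mapred.cluster.max.reduce.memory.mb</name>
  <value>4096</value>
  <description>Upper limit, in MB, that a job may request per reduce
  task.</description>
</property>

Incidentally, the old parameters you pasted are documented as raw byte
counts, so suffixed values like '26624m' and '512k' would likely fail to
parse even on a release that still honored those names; the newer parameters
take plain MB values instead.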
> Error:
> 2010-06-25 08:02:06,026 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
> start task tracker because java.io.IOException: Call to
> node1.hadoopcluster.com/192.168.1.241:9001 failed on local exception:
> java.io.IOException: Connection reset by peer
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
>     at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:514)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:934)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
> Caused by: java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:33)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:207)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
>     at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>     at java.io.FilterInputStream.read(FilterInputStream.java:128)
>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:230)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:249)
>     at java.io.DataInputStream.readInt(DataInputStream.java:382)
>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>
> Questions:
> How do I fix this issue?
> Is there a step-by-step guide for configuring capacity scheduling?
>
> Let me know if you need more information about the configuration.
>
> Thanks and Regards,
> -Shashank