Shashank,

> Hi,
>
> Setup Info:
> I have a 2-node Hadoop (0.20.2) cluster on Linux boxes.
> HW info: 16 CPUs (hyperthreaded)
> RAM: 32 GB
>
> I am trying to configure capacity scheduling. I want to use the memory
> management provided by the capacity scheduler, but I am facing a few
> issues. I have added hadoop-0.20.2-capacity-scheduler.jar in lib, and
> have also set 'mapred.jobtracker.taskScheduler' in hadoop-site.xml.
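That part sounds right. For reference, enabling the scheduler amounts to
roughly the following in hadoop-site.xml. This is a minimal sketch only:
'default' is the stock queue name, and per-queue capacities then go in
capacity-scheduler.xml.

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default</value>
  <description>Comma-separated list of queues served by the
  JobTracker.</description>
</property>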
First things first - the memory management implementation in the capacity
scheduler has seen significant improvements in Hadoop 0.21. Specifically, the
implementation in Hadoop 0.20 could cause a high degree of cluster
under-utilization; this was fixed by MAPREDUCE-516 and subsequent JIRAs in
Hadoop 0.21.

> I have added the below in the capacity-scheduler.xml file, but I get an
> error:
>
> <property>
>   <name>mapred.tasktracker.vmem.reserved</name>
>   <value>26624m</value>
>   <description>A number, in bytes, that represents an offset. The total
>   VMEM on the machine, minus this offset, is the VMEM node-limit for all
>   tasks, and their descendants, spawned by the TT.
>   </description>
> </property>
> <property>
>   <name>mapred.task.default.maxvmem</name>
>   <value>512k</value>
>   <description>A number, in bytes, that represents the default VMEM
>   task-limit associated with a task. Unless overridden by a job's
>   setting, this number defines the VMEM task-limit.
>   </description>
> </property>
> <property>
>   <name>mapred.task.limit.maxvmem</name>
>   <value>4096m</value>
>   <description>A number, in bytes, that represents the upper VMEM
>   task-limit associated with a task. Users, when specifying a VMEM
>   task-limit for their tasks, should not specify a limit which exceeds
>   this amount.
>   </description>
> </property>
> <property>
>   <name>mapred.tasktracker.pmem.reserved</name>
>   <value>26624m</value>
>   <description>Physical Memory
>   </description>
> </property>

IIRC, these parameters were removed and new ones were introduced in their
place. Trunk's documentation has since been updated with the exact list of
these parameters, their descriptions and usage - though I fear the parameter
names may have changed between Hadoop 0.20 and trunk. Your best bet is to try
the parameters listed at http://bit.ly/97SDz2.
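To make that concrete, the newer-style configuration looks roughly like the
below. Treat this as a sketch only: the parameter names follow the trunk
documentation, and the values are illustrative guesses for a 32 GB box, not
recommendations - please verify both against the docs linked above.

<!-- Cluster-wide slot sizes (illustrative values, not recommendations) -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>2048</value>
  <description>Virtual memory, in MB, of a single map slot.</description>
</property>
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>2048</value>
  <description>Virtual memory, in MB, of a single reduce slot.</description>
</property>
<property>
  <name>mapred.cluster.max.map.memory.mb</name>
  <value>4096</value>
  <description>Upper limit, in MB, that a job may request per map
  task.</description>
</property>
<property>
  <name>mapred.cluster.max.reduce.memory.mb</name>
  <value>4096</value>
  <description>Upper limit, in MB, that a job may request per reduce
  task.</description>
</property>

Incidentally, the old parameters you pasted are documented as raw byte
counts, so suffixed values like '26624m' and '512k' would likely fail to
parse even on a release that still honored those names; the newer parameters
take plain MB values instead.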
> Error:
> 2010-06-25 08:02:06,026 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
> start task tracker because java.io.IOException: Call to
> node1.hadoopcluster.com/192.168.1.241:9001 failed on local exception:
> java.io.IOException: Connection reset by peer
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>     at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
>     at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:514)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:934)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
> Caused by: java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:33)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:207)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
>     at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>     at java.io.FilterInputStream.read(FilterInputStream.java:128)
>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:230)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:249)
>     at java.io.DataInputStream.readInt(DataInputStream.java:382)
>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>
> Questions:
> How do I fix this issue?
> Is there a step-by-step guide for configuring capacity scheduling?
>
> Let me know if you need more information about the configuration.
>
> Thanks and Regards,
> -Shashank