Hi Arun,
Thanks a lot for the clarification.
As I understand it, in *yarn-site.xml* I can first set the min. and max.
container size GLOBALLY for all the nodes in the cluster through:
* {yarn.scheduler.minimum-allocation-mb}
* {yarn.scheduler.maximum-allocation-mb}
Then, on each node, I can set the NM memory using:
* {yarn.nodemanager.resource.memory-mb}
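A minimal sketch of how that split could look, with the two scheduler limits shared by every node and only the NM memory differing per host. The host names nm1/nm2, the 40 GB/20 GB sizes, the 1024/8192 MB limits, and the helper render_yarn_site are illustrative assumptions, not part of Hadoop:

```python
import xml.etree.ElementTree as ET


def render_yarn_site(settings):
    """Render a dict of property names/values as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Cluster-wide scheduler limits (example values, assumed for illustration).
cluster_wide = {
    "yarn.scheduler.minimum-allocation-mb": 1024,
    "yarn.scheduler.maximum-allocation-mb": 8192,
}

# Per-node NodeManager memory in MB (hypothetical hosts and sizes).
per_node_memory_mb = {"nm1": 40 * 1024, "nm2": 20 * 1024}

# One yarn-site.xml per host: shared limits plus that host's own NM memory.
yarn_site_by_node = {
    host: render_yarn_site({**cluster_wide, "yarn.nodemanager.resource.memory-mb": mb})
    for host, mb in per_node_memory_mb.items()
}

for host, xml_text in yarn_site_by_node.items():
    print(host, xml_text)
```

Only the last property differs between the generated files; everything else stays identical across the cluster.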
My 2nd doubt is whether we can run Mapper\Reducer tasks with different memory
options on each of the slave nodes.
That is, can we change the following properties in *mapred-site.xml* on each of
the slave nodes, so that the memory options for the M\R tasks match each
machine's capacity?
* {mapreduce.map.memory.mb}
* {mapreduce.map.java.opts}
* {mapreduce.reduce.memory.mb}
* {mapreduce.reduce.java.opts}
Thanks,
-Nirmal
From: Arun C Murthy [mailto:[email protected]]
Sent: Thursday, January 16, 2014 7:43 PM
To: [email protected]
Subject: Re: Doubts: Deployment and Configuration of YARN cluster
No, the cluster does not need to be homogeneous; you can set the resources
available on each node to be different.
E.g. Node A: 10G, Node B: 12G.
Now, if the min. container size is 1G, the RM can allocate up to 10 containers
on Node A and up to 12 containers on Node B.
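A rough sketch of that arithmetic, assuming every container is exactly the minimum size (real allocations depend on what each application requests). The helper name and the dictionary are made up for illustration:

```python
def max_containers(node_memory_mb, min_allocation_mb):
    """Upper bound on containers per node if every container is the minimum size."""
    return node_memory_mb // min_allocation_mb


# Per-node yarn.nodemanager.resource.memory-mb, using the 10G and 12G from the example.
node_memory_mb = {"Node A": 10 * 1024, "Node B": 12 * 1024}

# Cluster-wide yarn.scheduler.minimum-allocation-mb, using the 1G from the example.
min_allocation_mb = 1024

containers = {
    name: max_containers(mb, min_allocation_mb)
    for name, mb in node_memory_mb.items()
}
print(containers)  # {'Node A': 10, 'Node B': 12}
```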
hth,
Arun
On Jan 15, 2014, at 11:03 PM, Nirmal Kumar
<[email protected]> wrote:
Hi German,
I went through the links for memory configuration settings/best practices.
They assume the cluster is homogeneous, i.e. the same RAM size on all the nodes.
Also, in the YARN whitepaper (Section 3.2, Page 6) I see:
This resource model serves current applications well
in homogeneous environments, but we expect it to
evolve over time as the ecosystem matures and new requirements
emerge.
Does that mean that in YARN, in order to configure processing capacity such as
Container Size, No. of Containers, and No. of Mappers\Reducers, the cluster has
to be homogeneous?
What if I have a *heterogeneous cluster* with varying RAM, disks, and cores?
Thanks,
-Nirmal
From: Nirmal Kumar
Sent: Wednesday, January 15, 2014 8:22 PM
To: [email protected]
Subject: RE: Doubts: Deployment and Configuration of YARN cluster
Thanks a lot German.
Will go through the links and see if they answer my questions\doubts.
-Nirmal
From: German Florez-Larrahondo [mailto:[email protected]]
Sent: Wednesday, January 15, 2014 7:20 PM
To: [email protected]
Subject: RE: Doubts: Deployment and Configuration of YARN cluster
Nirmal
-A good summary of memory configuration settings/best practices can be found at
the link below. Note that in YARN, the way you configure resource limits
dictates the number of containers on each node and in the cluster:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
-A good intro to YARN configuration is this:
http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce_11.html
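For reference, the sizing heuristic in the first link, as best I recall it (please verify against the page itself), takes the number of containers per node as min(2 * cores, 1.8 * disks, available RAM / minimum container size) and the RAM per container as max(minimum container size, available RAM / containers). A rough sketch with hypothetical function names and example hardware numbers:

```python
def containers_per_node(cores, disks, available_ram_mb, min_container_mb):
    """Recalled heuristic: min(2*cores, 1.8*disks, available RAM / min container size)."""
    return int(min(2 * cores, 1.8 * disks, available_ram_mb / min_container_mb))


def ram_per_container_mb(available_ram_mb, containers, min_container_mb):
    """Recalled heuristic: max(min container size, available RAM / containers)."""
    return max(min_container_mb, available_ram_mb // containers)


# Example node (assumed values): 12 cores, 12 disks, 42 GB left for YARN,
# and a 2 GB minimum container size.
available_ram_mb = 42 * 1024
containers = containers_per_node(12, 12, available_ram_mb, 2048)
ram_mb = ram_per_container_mb(available_ram_mb, containers, 2048)
print(containers, ram_mb)  # 21 2048
```

The same calculation can be repeated per node with that node's own cores, disks, and RAM, so nothing in it requires identical hardware.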
Regards
.g
From: Nirmal Kumar [mailto:[email protected]]
Sent: Wednesday, January 15, 2014 7:22 AM
To: [email protected]
Subject: Doubts: Deployment and Configuration of YARN cluster
All,
I am new to YARN and have some doubts regarding the deployment and
configuration of YARN on a cluster.
As per my understanding, to deploy Hadoop 2.x using YARN on a cluster we need to
distribute the files below to all the slave nodes in the cluster:
* conf/core-site.xml
* conf/hdfs-site.xml
* conf/yarn-site.xml
* conf/mapred-site.xml
Also, we ONLY need to change the following file on each slave node:
* conf/hdfs-site.xml
The {dfs.datanode.data.dir} value needs to be set there.
Do we need to change any other config file on the slave nodes?
Can I set a different {yarn.nodemanager.resource.memory-mb} for each NM running
on the slave nodes?
I ask because I might have a *heterogeneous environment*, i.e. different nodes
with different memory and cores. For NM1 I might have 40GB of memory and for
another, say, 20GB.
Also,
{mapreduce.map.memory.mb} specifies the *memory limit* of the container
requested for each map task.
{mapreduce.map.java.opts} specifies the *max. heap space* of the
task's JVM. If you exceed the max heap size, the JVM throws an OOM.
{mapreduce.reduce.memory.mb}
{mapreduce.reduce.java.opts}
Are the above properties applicable to all the Map\Reduce tasks (from different
MapReduce applications) in general, running on different slave nodes?
Or can I change these for a particular slave node? E.g. say on
SlaveNode1 I run the map task with 4GB and on SlaveNode2 I run the map
task with 8GB. Same with the reduce task.
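To make the relationship between the two kinds of property concrete, here is a small sketch that derives a *.java.opts heap setting from a *.memory.mb value, so the JVM heap stays below the task's memory limit and leaves room for non-heap overhead. The 0.8 fraction is a commonly used rule of thumb that I am assuming here, not a Hadoop requirement, and the helper name and the 4GB/8GB values are only illustrative:

```python
def java_opts_for(memory_mb, heap_fraction=0.8):
    """Build a -Xmx option that keeps the JVM heap below the task memory limit."""
    heap_mb = int(memory_mb * heap_fraction)
    return f"-Xmx{heap_mb}m"


# Task memory limits in MB (example values only).
job_settings = {
    "mapreduce.map.memory.mb": 4096,
    "mapreduce.reduce.memory.mb": 8192,
}

# Derive the matching heap options from the limits above.
job_settings["mapreduce.map.java.opts"] = java_opts_for(job_settings["mapreduce.map.memory.mb"])
job_settings["mapreduce.reduce.java.opts"] = java_opts_for(job_settings["mapreduce.reduce.memory.mb"])

for name, value in job_settings.items():
    print(name, value)
```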
I need to understand how to *configure processing capacity* in the cluster,
e.g. Container Size, No. of Containers, No. of Mappers\Reducers.
Thanks,
-Nirmal
________________________________
NOTE: This message may contain information that is confidential, proprietary,
privileged or otherwise protected by law. The message is intended solely for
the named addressee. If received in error, please destroy and notify the
sender. Any use of this email is prohibited when received in error. Impetus
does not represent, warrant and/or guarantee, that the integrity of this
communication has been maintained nor that the communication is free of errors,
virus, interception or interference.
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/