Hey, moving this to the CDH-users (cdh-u...@cloudera.org) list, as it's CDH4 packaging/deployment specific. (You can subscribe to it via https://groups.google.com/a/cloudera.org/group/cdh-user.) BCC'd common-user and CC'd you and Marcos too.
MR jobs read mapred-site.xml to determine which 'cluster'/'framework' they need to use. Hence, you will need to follow these deployment instructions when using MR2 on YARN as your choice of MR:
https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v2+%28YARN%29+on+a+Cluster#DeployingMapReducev2%28YARN%29onaCluster-Step1

(Or, if you use CM4, just visit the YARN service in it and ask it to deploy configs automatically to all hosts, or get a client config bundle for self-deployment. Applying those would do this automatically and remove the pain. :))

Can you try this and let us know if it works, Anil?

On Thu, Jun 14, 2012 at 2:56 AM, anil gupta <anilgupt...@gmail.com> wrote:
> Hi Marcos,
>
> Sorry, I forgot to mention that the Job History Server is installed and running,
> and AFAIK the ResourceManager is responsible for running MR jobs. The HistoryServer
> is only used to get info about MR jobs.
>
> Thanks,
> Anil
>
> On Wed, Jun 13, 2012 at 2:04 PM, Marcos Ortiz <mlor...@uci.cu> wrote:
>
>> According to the CDH 4 official documentation, you should install a
>> JobHistory server for your MRv2 (YARN) cluster:
>> https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v2+%28YARN%29+on+a+Cluster
>>
>> How to configure the HistoryServer:
>> https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v2+%28YARN%29+on+a+Cluster#DeployingMapReducev2%28YARN%29onaCluster-Step3
>>
>> On 06/13/2012 03:16 PM, anil gupta wrote:
>>
>>> Hi All,
>>>
>>> I am using CDH4 to run an HBase cluster on CentOS 6.0. I have 5
>>> nodes in my cluster (2 admin nodes and 3 DNs).
>>> My ResourceManager is up and running and shows that all three DNs are
>>> running the NodeManager. HDFS is also working fine and showing 3 DNs.
>>> But when I fire the pi example job, it starts to run in local mode.
>>> Here is the console output:
>>>
>>> sudo -u hdfs yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 1000000000
>>> Number of Maps = 10
>>> Samples per Map = 1000000000
>>> Wrote input for Map #0
>>> Wrote input for Map #1
>>> Wrote input for Map #2
>>> Wrote input for Map #3
>>> Wrote input for Map #4
>>> Wrote input for Map #5
>>> Wrote input for Map #6
>>> Wrote input for Map #7
>>> Wrote input for Map #8
>>> Wrote input for Map #9
>>> Starting Job
>>> 12/06/13 12:03:27 WARN conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
>>> 12/06/13 12:03:27 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>>> 12/06/13 12:03:27 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>>> 12/06/13 12:03:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>> 12/06/13 12:03:28 INFO mapred.FileInputFormat: Total input paths to process : 10
>>> 12/06/13 12:03:29 INFO mapred.JobClient: Running job: job_local_0001
>>> 12/06/13 12:03:29 INFO mapred.LocalJobRunner: OutputCommitter set in config null
>>> 12/06/13 12:03:29 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
>>> 12/06/13 12:03:29 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
>>> 12/06/13 12:03:29 INFO util.ProcessTree: setsid exited with exit code 0
>>> 12/06/13 12:03:29 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3d46e381
>>> 12/06/13 12:03:29 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
>>> 12/06/13 12:03:29 INFO mapred.MapTask: numReduceTasks: 1
>>> 12/06/13 12:03:29 INFO mapred.MapTask: io.sort.mb = 100
>>> 12/06/13 12:03:30 INFO mapred.MapTask: data buffer = 79691776/99614720
>>> 12/06/13 12:03:30 INFO mapred.MapTask: record buffer = 262144/327680
>>> 12/06/13 12:03:30 INFO mapred.JobClient:  map 0% reduce 0%
>>> 12/06/13 12:03:35 INFO mapred.LocalJobRunner: Generated 95735000 samples.
>>> 12/06/13 12:03:36 INFO mapred.JobClient:  map 100% reduce 0%
>>> 12/06/13 12:03:38 INFO mapred.LocalJobRunner: Generated 151872000 samples.
>>>
>>> Here is the content of yarn-site.xml:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>yarn.nodemanager.aux-services</name>
>>>     <value>mapreduce.shuffle</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>>>     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.log-aggregation-enable</name>
>>>     <value>true</value>
>>>   </property>
>>>
>>>   <property>
>>>     <description>List of directories to store localized files in.</description>
>>>     <name>yarn.nodemanager.local-dirs</name>
>>>     <value>/disk/yarn/local</value>
>>>   </property>
>>>
>>>   <property>
>>>     <description>Where to store container logs.</description>
>>>     <name>yarn.nodemanager.log-dirs</name>
>>>     <value>/disk/yarn/logs</value>
>>>   </property>
>>>
>>>   <property>
>>>     <description>Where to aggregate logs to.</description>
>>>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>>     <value>/var/log/hadoop-yarn/apps</value>
>>>   </property>
>>>
>>>   <property>
>>>     <description>Classpath for typical applications.</description>
>>>     <name>yarn.application.classpath</name>
>>>     <value>
>>>       $HADOOP_CONF_DIR,
>>>       $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
>>>       $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
>>>       $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
>>>       $YARN_HOME/*,$YARN_HOME/lib/*
>>>     </value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.resourcemanager.resource-tracker.address</name>
>>>     <value>ihub-an-g1:8025</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.resourcemanager.address</name>
>>>     <value>ihub-an-g1:8040</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.resourcemanager.scheduler.address</name>
>>>     <value>ihub-an-g1:8030</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.resourcemanager.admin.address</name>
>>>     <value>ihub-an-g1:8141</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>yarn.resourcemanager.webapp.address</name>
>>>     <value>ihub-an-g1:8088</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>mapreduce.jobhistory.intermediate-done-dir</name>
>>>     <value>/disk/mapred/jobhistory/intermediate/done</value>
>>>   </property>
>>>
>>>   <property>
>>>     <name>mapreduce.jobhistory.done-dir</name>
>>>     <value>/disk/mapred/jobhistory/done</value>
>>>   </property>
>>> </configuration>
>>>
>>> Can anyone tell me what the problem is here? I appreciate your help.
>>>
>>> Thanks,
>>> Anil Gupta
>>
>> --
>> Marcos Luis Ortíz Valmaseda
>> Data Engineer && Sr. System Administrator at UCI
>> http://marcosluis2186.posterous.com
>> http://www.linkedin.com/in/marcosluis2186
>> Twitter: @marcosluis2186
>
> --
> Thanks & Regards,
> Anil Gupta

--
Harsh J
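
[Editor's note: the tell-tale lines in the output above are "Running job: job_local_0001" and the mapred.LocalJobRunner messages, which mean the JobClient fell back to the local framework rather than submitting to the ResourceManager. The deployment step Harsh links configures this via mapred-site.xml. As a minimal sketch (file layout and paths are an assumption against a typical CDH4 install; the property name itself is the standard Hadoop 2 one), the fragment would look like:]

```xml
<!-- mapred-site.xml (client side): without mapreduce.framework.name set
     to "yarn", the JobClient defaults to the LocalJobRunner, which is
     exactly what produces job_local_0001 in the console output above. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

[After deploying this to the client hosts and re-running the pi job, the job ID should take the cluster form job_<timestamp>_NNNN instead of job_local_0001, and the job should appear in the ResourceManager web UI (ihub-an-g1:8088 in the yarn-site.xml above).]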