Re: Implementing and running an applicationmaster

Rob Blah Thu, 05 Dec 2013 04:36:25 -0800

Hi

There is a way, but it's not an easy one. You would have to modify the
container request code in MR_AM. As each container in MapReduce gets the
same amount of memory, OOM shouldn't be a problem, since the in-task
"buffers" can be spilled to disk. I am no MapReduce (code) specialist, but I
would start by finding MR_Driver.class and MR_AM.class. Then change the
Driver.class to launch your own class, Custom_MR_AM (C_MR_AM). C_MR_AM will
be a copy of MR_AM with the container request code changed, so that you can
allocate N containers with X memory and M containers with Y memory.
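
To make the "N containers with X memory and M containers with Y memory" idea
concrete, here is a rough sketch of the request plan such a C_MR_AM might
build. It is plain Java with no Hadoop classes, so ContainerPlan, Request and
plan() are hypothetical names for illustration, not part of MR_AM or YARN.
It also assumes each memory size gets its own priority, since as far as I know
the Hadoop 2.x scheduler expects requests at one priority to have the same
size.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java model of the container request plan a customised MR AM
 * (the hypothetical "C_MR_AM") would build. No Hadoop classes are used.
 */
class ContainerPlan {

    /** One container request: memory in MB plus a YARN-style priority. */
    static final class Request {
        final int memoryMb;
        final int priority;

        Request(int memoryMb, int priority) {
            this.memoryMb = memoryMb;
            this.priority = priority;
        }
    }

    /**
     * Builds n requests of xMb followed by m requests of yMb.
     * Assumption: each size is given its own priority (1 and 2 here).
     */
    static List<Request> plan(int n, int xMb, int m, int yMb) {
        if (n < 0 || m < 0 || xMb <= 0 || yMb <= 0) {
            throw new IllegalArgumentException(
                "counts must be >= 0 and memory must be > 0");
        }
        List<Request> requests = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            requests.add(new Request(xMb, 1));
        }
        for (int i = 0; i < m; i++) {
            requests.add(new Request(yMb, 2));
        }
        return requests;
    }

    /** Total memory the plan would ask the ResourceManager for, in MB. */
    static int totalMemoryMb(List<Request> requests) {
        int total = 0;
        for (Request r : requests) {
            total += r.memoryMb;
        }
        return total;
    }

    public static void main(String[] args) {
        // The case from this thread: two 100 MB and two 200 MB containers.
        List<Request> requests = plan(2, 100, 2, 200);
        for (Request r : requests) {
            // A real AM would hand each request to YARN at this point
            // instead of printing it.
            System.out.println("request " + r.memoryMb
                + " MB at priority " + r.priority);
        }
        System.out.println("total " + totalMemoryMb(requests) + " MB");
    }
}
```

In a real AM, each Request would be turned into a YARN container request (for
example through the AMRMClient library) inside the copied MR_AM code. That
part is the hard bit and is not shown here.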


The hadoop-mapreduce-examples.jar is just a bunch of HelloWorld jobs, there
so that a new user can pick up and "learn" MR quickly.

Maybe some real MR specialist can give you better advice than me.

regards
tmp


2013/12/5 Yue Wang <[email protected]>

> Hi,
>
> Thank you for your answer. Now I understand the connection between the two
> ways.
>
> I asked this question because I want to benefit from the YARN
> architecture.
> If I understood correctly, I can let my ApplicationMaster request
> containers more flexibly. For example, I can request two containers with
> 100MB memory and two containers with 200MB memory for my mappers on YARN.
> However, I cannot do that on MRv1.
>
> So if I execute a WordCount program by typing "yarn jar
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount
> wordcount/ wc-output/", such flexibility is gone.
>
> Is there a way to let my ApplicationMaster execute WordCount on HDFS on
> containers?
>
>
> Thanks!
>
>
> On Thu, Dec 5, 2013 at 4:28 AM, Rob Blah <[email protected]> wrote:
>
>> Hi
>>
>> If I understood you correctly, you would like to run your AM with the YARN
>> Client from the shell, as opposed to running the Driver like in MRv1. But
>> it's the same thing (more or less). In the example you provided
>> (org.apache.hadoop.yarn.applications.DistributedShell) the Client.class is
>> the "driver". However since distributed-shell is a "simple" application you
>> do not need a lot of configuration (setting fields in Configuration.class,
>> I/O formats etc.). The same goes for any other application. As for the
>> second example (org.apache.hadoop.examples.WordCount) MapReduce AM requires
>> certain configuration, thus you have to do it the "old-way". The main
>> difference would be: MR -> end-user-config -> driver, DS -> driver (but you
>> still can create your own end-user-config). Hope this answers your question
>> and that I understood it correctly.
>>
>> regards
>> tmp
>>
>>
>> 2013/12/5 Yue Wang <[email protected]>
>>
>>> Hi,
>>>
>>> I took a look at the code and found some examples on the web.
>>> One example is: http://wiki.opf-labs.org/display/SP/Resource+management
>>>
>>> It seems that users can run simple shell commands using Client of YARN.
>>> But when it comes to a practical MapReduce example like WordCount,
>>> people still run commands in the old way as in MRv1.
>>>
>>> How can I run WordCount using Client and ApplicationMaster of YARN so
>>> that I can request resources flexibly?
>>>
>>>
>>> Thanks!
>>>
>>>
>>> On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah <[email protected]> wrote:
>>>
>>>> Hi
>>>>
>>>> Follow the example provided in
>>>> Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
>>>>
>>>> regards
>>>> tmp
>>>>
>>>>
>>>> 2013/12/1 Yue Wang <[email protected]>
>>>>
>>>>> Hi,
>>>>>
>>>>> I found the page (
>>>>> http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html)
>>>>> and know how to write an ApplicationMaster.
>>>>>
>>>>> However, is there a complete example showing how to run this
>>>>> ApplicationMaster with a real Hadoop Program (e.g. WordCount) on YARN?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>> Yue
>>>>>
>>>>
>>>>
>>>
>>
>
