I am using MR and know the job.setJar command - I can add all dependencies to the jar in the lib directory, but I was wondering whether Hadoop would copy a jar from my local machine to the cluster. Also, if I ran multiple jobs with the same jar, would the jar be copied N times? (I typically chain 5 map-reduce jobs.)
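For what it's worth, here is a minimal driver sketch of that chaining pattern (class names and paths are hypothetical, and it assumes hadoop-client on the classpath and a reachable cluster). On each submit(), the MR client copies the job jar - including anything bundled in its lib/ directory - to that job's HDFS staging directory, so a chain of five jobs does upload the jar five times:

```java
// Sketch only: requires the Hadoop client JARs and cluster config on the
// classpath; ChainDriver and the input/output paths are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path(args[0]);
        for (int i = 0; i < 5; i++) {              // five chained MR jobs
            Path out = new Path(args[1] + "/stage" + i);
            Job job = Job.getInstance(conf, "stage-" + i);
            // The jar containing this class (with its lib/ dependencies)
            // is uploaded to the job's HDFS staging dir on every submit.
            job.setJarByClass(ChainDriver.class);
            FileInputFormat.addInputPath(job, in);
            FileOutputFormat.setOutputPath(job, out);
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
            in = out;                              // output feeds the next stage
        }
    }
}
```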
On Fri, Apr 25, 2014 at 10:08 AM, Oleg Zhurakousky <[email protected]> wrote:

> Are you talking about MR or a plain YARN application? In MR you typically
> use one of the job.setJar* methods. That aside, you may have more than
> your app JAR (dependencies), so you can copy the dependencies to every
> Hadoop node's classpath (e.g., a shared dir).
>
> Oleg
>
> On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <[email protected]> wrote:
>
>> So if I create a Hadoop jar file with referenced libraries in the lib
>> directory, do I need to move it to HDFS, or can it sit on my local
>> machine? If I move it to HDFS, where does it live - which is to say, how
>> do I specify the path?
>>
>> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <[email protected]> wrote:
>>
>>> Yes, if you are running MR.
>>>
>>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <[email protected]> wrote:
>>>
>>>> Thank you for your answer.
>>>>
>>>> 1) I am using YARN.
>>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>>> works - do I need mapred-site.xml as well?
>>>>
>>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <[email protected]> wrote:
>>>>
>>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>>> To answer your question: yes, it's possible and simple. All you need
>>>>> to do is have the Hadoop JARs on the classpath, with the relevant
>>>>> configuration files on that same classpath pointing to the Hadoop
>>>>> cluster. Most often people simply copy core-site.xml, yarn-site.xml,
>>>>> etc. from the actual cluster to the application classpath, and then
>>>>> you can run it straight from the IDE.
>>>>>
>>>>> Not a Windows user, so not sure about the second part of the question.
>>>>>
>>>>> Cheers
>>>>> Oleg
>>>>>
>>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <[email protected]> wrote:
>>>>>
>>>>>> Assume I have a machine on the same network as a Hadoop 2 cluster,
>>>>>> but separate from it.
>>>>>>
>>>>>> My understanding is that by setting certain elements of the config
>>>>>> file, or local xml files, to point to the cluster, I can launch a
>>>>>> job without having to log into the cluster, move my jar to HDFS, and
>>>>>> start the job from the cluster's Hadoop machine.
>>>>>>
>>>>>> Does this work?
>>>>>> What parameters need I set?
>>>>>> Where is the jar file?
>>>>>> What issues would I see if the machine is running Windows with
>>>>>> Cygwin installed?

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
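To make the "what parameters need I set" part concrete, here is a sketch of the client-side settings the copied core-site.xml / yarn-site.xml / mapred-site.xml would supply; hostnames and ports are hypothetical, and the same values can also be set programmatically on the job's Configuration. Note that mapred-site.xml matters too: without mapreduce.framework.name=yarn, the client defaults to running the job locally rather than submitting it to the cluster.

```java
// Sketch: minimal client-side configuration for remote job submission.
// Requires the Hadoop client JARs; hostnames below are hypothetical.
import org.apache.hadoop.conf.Configuration;

public class RemoteClusterConf {
    public static Configuration create() {
        Configuration conf = new Configuration();
        // normally supplied by core-site.xml copied from the cluster
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        // normally supplied by yarn-site.xml
        conf.set("yarn.resourcemanager.hostname", "rm.example.com");
        // normally supplied by mapred-site.xml; the default is "local",
        // which runs the job in-process instead of on the cluster
        conf.set("mapreduce.framework.name", "yarn");
        return conf;
    }
}
```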
