I am using MR and know the job.setJar command - I can add all dependencies to the jar in the lib directory, but I was wondering whether Hadoop would copy a jar from my local machine to the cluster. Also, if I ran multiple jobs with the same jar, would the jar be copied N times? (I typically chain 5 map-reduce jobs.)
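For what it's worth, here is a minimal driver sketch of that chaining pattern (class names and paths are hypothetical, and it assumes hadoop-client on the classpath and a reachable cluster). On each submit(), the MR client copies the job jar - including anything bundled in its lib/ directory - to that job's HDFS staging directory, so a chain of five jobs does upload the jar five times:

```java
// Sketch only: requires the Hadoop client JARs and cluster config on the
// classpath; ChainDriver and the input/output paths are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path(args[0]);
        for (int i = 0; i < 5; i++) {              // five chained MR jobs
            Path out = new Path(args[1] + "/stage" + i);
            Job job = Job.getInstance(conf, "stage-" + i);
            // The jar containing this class (with its lib/ dependencies)
            // is uploaded to the job's HDFS staging dir on every submit.
            job.setJarByClass(ChainDriver.class);
            FileInputFormat.addInputPath(job, in);
            FileOutputFormat.setOutputPath(job, out);
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
            in = out;                              // output feeds the next stage
        }
    }
}
```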
On Fri, Apr 25, 2014 at 10:08 AM, Oleg Zhurakousky <[email protected]> wrote:

> Are you talking about MR or a plain YARN application? In MR you typically
> use one of the job.setJar* methods. That aside, you may have more than
> your app JAR (dependencies), so you can copy the dependencies to every
> Hadoop node's classpath (e.g., a shared dir).
>
> Oleg
>
> On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <[email protected]> wrote:
>
>> So if I create a Hadoop jar file with referenced libraries in the lib
>> directory, do I need to move it to HDFS, or can it sit on my local
>> machine? If I move it to HDFS, where does it live - which is to say, how
>> do I specify the path?
>>
>> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <[email protected]> wrote:
>>
>>> Yes, if you are running MR.
>>>
>>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <[email protected]> wrote:
>>>
>>>> Thank you for your answer.
>>>>
>>>> 1) I am using YARN.
>>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>>> works - do I need mapred-site.xml as well?
>>>>
>>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <[email protected]> wrote:
>>>>
>>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>>> To answer your question: yes, it's possible and simple. All you need
>>>>> to do is have the Hadoop JARs on the classpath, with the relevant
>>>>> configuration files on that same classpath pointing to the Hadoop
>>>>> cluster. Most often people simply copy core-site.xml, yarn-site.xml,
>>>>> etc. from the actual cluster to the application classpath, and then
>>>>> you can run it straight from the IDE.
>>>>>
>>>>> Not a Windows user, so not sure about the second part of the question.
>>>>>
>>>>> Cheers
>>>>> Oleg
>>>>>
>>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <[email protected]> wrote:
>>>>>
>>>>>> Assume I have a machine on the same network as a Hadoop 2 cluster,
>>>>>> but separate from it.
>>>>>>
>>>>>> My understanding is that by setting certain elements of the config
>>>>>> file, or local xml files, to point to the cluster, I can launch a
>>>>>> job without having to log into the cluster, move my jar to HDFS, and
>>>>>> start the job from the cluster's Hadoop machine.
>>>>>>
>>>>>> Does this work?
>>>>>> What parameters need I set?
>>>>>> Where is the jar file?
>>>>>> What issues would I see if the machine is running Windows with
>>>>>> Cygwin installed?

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
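To make the "what parameters need I set" part concrete, here is a sketch of the client-side settings the copied core-site.xml / yarn-site.xml / mapred-site.xml would supply; hostnames and ports are hypothetical, and the same values can also be set programmatically on the job's Configuration. Note that mapred-site.xml matters too: without mapreduce.framework.name=yarn, the client defaults to running the job locally rather than submitting it to the cluster.

```java
// Sketch: minimal client-side configuration for remote job submission.
// Requires the Hadoop client JARs; hostnames below are hypothetical.
import org.apache.hadoop.conf.Configuration;

public class RemoteClusterConf {
    public static Configuration create() {
        Configuration conf = new Configuration();
        // normally supplied by core-site.xml copied from the cluster
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        // normally supplied by yarn-site.xml
        conf.set("yarn.resourcemanager.hostname", "rm.example.com");
        // normally supplied by mapred-site.xml; the default is "local",
        // which runs the job in-process instead of on the cluster
        conf.set("mapreduce.framework.name", "yarn");
        return conf;
    }
}
```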
