/lib is definitely the way to go. But adding gobs and gobs of stuff there makes jobs start slowly because you have to propagate a multi-megabyte blob to lots of worker nodes.
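To make the trade-off concrete, here is a minimal sketch of the layout being discussed: job classes at the root of the jar and dependency jars under lib/. It relies only on the fact that a jar is a zip archive. The function names and the example file names are hypothetical and are not part of Hadoop. The returned size is the blob that has to be shipped out for every job.

```python
import os
import zipfile


def build_job_jar(jar_path, class_files, dependency_jars):
    """Write a job jar: classes at the given archive paths, dependencies under lib/.

    class_files maps archive names (e.g. "com/example/MyJob.class") to files on disk.
    dependency_jars is a list of jar files on disk to bundle under lib/.
    Returns the size of the resulting jar in bytes.
    """
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as jar:
        for arcname, src in class_files.items():
            jar.write(src, arcname)
        for src in dependency_jars:
            # Every bundled dependency adds to what each job submission must copy.
            jar.write(src, "lib/" + os.path.basename(src))
    return os.path.getsize(jar_path)


def bundled_libs(jar_path):
    """List the dependency jars bundled under lib/ in an existing job jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.startswith("lib/") and name.endswith(".jar")
        )
```

Checking the returned size before and after adding a dependency is a quick way to see how much each bundled jar costs at submission time. The resulting file would then be submitted with the *hadoop jar* command as usual.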
I would consider adding universally used jars to the hadoop class path on every node, but I would also expect to face configuration management nightmares (small ones, though) from doing this.

On 1/7/08 11:50 AM, "Lars George" <[EMAIL PROTECTED]> wrote:

> Arun,
>
> Ah yes, I see it now in JobClient. OK, then how are the required aux
> libs handled? I assume a /lib inside the job jar is the only way to go?
>
> I saw the discussion on the Wiki about adding Hbase permanently to the
> HADOOP_CLASSPATH, but then I also have to deploy the Lucene jar files,
> Xerces etc. I guess it is better if I add everything non-Hadoop into the
> job jar's lib directory?
>
> Thanks again for the help,
> Lars
>
>
> Arun C Murthy wrote:
>> On Mon, Jan 07, 2008 at 08:24:36AM -0800, Lars George wrote:
>>
>>> Hi,
>>>
>>> Maybe someone here can help me with a rather noob question. Where do I
>>> have to put my custom jar to run it as a map/reduce job? Anywhere and
>>> then specifying the HADOOP_CLASSPATH variable in hadoop-env.sh?
>>>
>>
>> Once you have your jar and submit it for your job via the *hadoop jar*
>> command the framework takes care of distributing the software for nodes on
>> which your maps/reduces are scheduled:
>> $ hadoop jar <custom_jar> <custom_args>
>>
>> The detail is that the framework copies your jar from the submission node to
>> the HDFS and then copies it onto the execution node.
>>
>> Does http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#Usage
>> help?
>>
>> Arun
>>
>>> Also, since I am using the Hadoop API already from our server code, it
>>> seems natural to launch jobs from within our code. Are there any issue
>>> with that? I assume I have to copy the jar files first and make them
>>> available as per my question above, but then I am ready to start it from
>>> my own code?
>>>
>>> I have read most Wiki entries and while the actual workings are
>>> described quite nicely, I could not find an answer to the questions
>>> above. The demos are already in place and can be started as is without
>>> the need of making them available.
>>>
>>> Again, I apologize for being a noobie.
>>>
>>> Lars