Hi

To distribute application-specific jars or files you can do it with the 'hadoop jar' command itself, for example:

    hadoop jar sample.jar com.test.Samples.Application -files file1.txt,file2.csv -libjars custom_connector.jar,json_util.jar input_dir output_dir

(Note that -files and -libjars take comma-separated lists with no spaces.)

But this distribution happens every time the job is run. If the job runs frequently, there are many jars to distribute, or multiple jobs depend on the same jars, then rather than shipping the jars on every submission it is better to pre-distribute them across your nodes and include them in the classpath of every node.

AFAIK you don't use 'hadoop job' to submit your MR job. It is used for managing your job (setting priorities, killing it, monitoring status etc.) once your job is registered with the JobTracker, i.e. for running jobs.
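One caveat: -files and -libjars are generic options parsed by GenericOptionsParser, so they are only honoured if your driver implements Tool and is launched through ToolRunner. A minimal sketch of such a driver (the class name, job name, and mapper/reducer defaults here are placeholders, not your actual code):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class Application extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() already reflects whatever -files/-libjars set,
            // because ToolRunner ran GenericOptionsParser before run().
            Job job = new Job(getConf(), "sample");
            job.setJarByClass(Application.class);
            // set your mapper/reducer here; identity is used by default
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner strips the generic options (-files, -libjars, -D ...)
            // and passes only the remaining args (input_dir, output_dir) to run().
            System.exit(ToolRunner.run(new Configuration(), new Application(), args));
        }
    }

If your main() builds the Job directly from a fresh Configuration instead of going through ToolRunner, the -libjars option is silently ignored, which is a common source of ClassNotFoundException on the task nodes.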
Hope it helps!

Regards
Bejoy.K.S

On Wed, Nov 16, 2011 at 12:09 PM, Something Something <mailinglist...@gmail.com> wrote:

> Until now we were manually copying our jars to all machines in a Hadoop
> cluster. This worked while our cluster was small, but now our cluster is
> getting bigger. What's the best way to start a Hadoop job that
> automatically distributes the jar to all machines in the cluster?
>
> I read the doc at:
> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>
> Would -libjars do the trick? But we need to use 'hadoop job' for that,
> right? Until now, we were using 'hadoop jar' to start all our jobs.
>
> Needless to say, we are getting our feet wet with Hadoop, so appreciate
> your help with our dumb questions.
>
> Thanks.
>
> PS: We use Pig a lot, which automatically does this, so there must be a
> clean way to do it.