Hi Praveen,

to submit a job to the cluster, you just need to have a core-site.xml on your classpath (or load it explicitly into your Configuration object) that looks (at least) like this:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${name:port of namenode}</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>${name:port of jobtracker}</value>
  </property>
</configuration>

If you want to wait for each job's completion, you can use job.waitForCompletion(true) rather than job.submit(). (A complete driver sketch along these lines appears below the quoted thread.)

Good luck,
henning

On Mon, 2010-11-22 at 23:40 +0100, praveen.pe...@nokia.com wrote:
> Hi,
> Thanks for your reply. In my case I have a driver that calls multiple
> jobs one after the other. I am using the following code to submit each
> job, but it uses the local Hadoop jar files that are on the classpath.
> It's not submitting the job to the Hadoop cluster. I thought I would
> need to specify where the master Hadoop is located on the remote
> machine. An example command I use from the command line is as follows,
> but I need to do the same from my Java program.
>
> $ hadoop-0.20.2/bin/hadoop jar \
>     /home/ppeddi/dev/Merchandising/RelevancyEngine/relevancy-core/dist/Relevancy4.jar \
>     -i raw-downloads-input-10K -o reco-patterns-output-10K-1S -k 100 \
>     -method mapreduce -g 500 -regex '[\ ]' -s 5
>
> I hope I made the question clear now.
>
> Praveen
>
> ______________________________________________________________________
> From: ext Henning Blohm [mailto:henning.bl...@zfabrik.de]
> Sent: Monday, November 22, 2010 5:07 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Starting a Hadoop job programmatically
>
> Hi Praveen,
>
> we do. We are using the "new" org.apache.hadoop.mapreduce.* API in
> Hadoop 0.20.2.
>
> Essentially the flow is:
>
> //----
> // assuming all config is on the class path
> Configuration config = new Configuration();
> Job job = new Job(config, "some job name");
>
> // set in/out types
> job.setInputFormatClass(...);
> job.setOutputFormatClass(...);
> job.setMapOutputKeyClass(...);
> job.setMapOutputValueClass(...);
> job.setOutputKeyClass(...);
> job.setOutputValueClass(...);
>
> // set implementations as required
> job.setMapperClass(<your mapper implementation class object>);
> job.setCombinerClass(<your combiner implementation class object>);
> job.setReducerClass(<your reducer implementation class object>);
>
> // set the jar... this is often the tricky part!
> job.setJarByClass(<some class that is in the job jar and not
> elsewhere higher up on the class path>);
>
> job.submit();
> //----
>
> Hope I didn't forget anything.
>
> Note: You need to give Hadoop something it can launch in a JVM that
> has no more than the Hadoop jars and whatever else you configured
> statically in your hadoop-env.sh script.
>
> Can you describe your scenario in more detail?
>
> Henning
>
> On Monday, 2010-11-22 at 22:39 +0100, praveen.pe...@nokia.com wrote:
>
> > Hi all,
> > I am trying to figure out how I can start a Hadoop job programmatically
> > from my Java application running in an app server. I was able to run
> > my map reduce job using the hadoop command from the Hadoop master
> > machine, but my goal is to run the same job from my Java program
> > (running on a different machine than the master). I googled and could
> > not find a solution for this. All the examples I have seen so far use
> > hadoop from the command line to start a job.
> > 1. Has anyone invoked a Hadoop job from a Java application?
> > 2. If so, could someone provide some sample code?
> >
> > Thanks
> > Praveen

Henning Blohm
ZFabrik Software KG
henning.bl...@zfabrik.de
www.z2-environment.eu
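For reference, here is a minimal, compilable driver sketch that puts the pieces of this thread together against the Hadoop 0.20.2 mapreduce API: it sets the namenode and jobtracker programmatically (equivalent to having the XML above on the classpath), ships the job jar via setJarByClass, and waits for completion so a follow-up job can be chained. The class names, the word-count mapper/reducer, and the host:port and path values are illustrative placeholders, not something taken from the original thread; adjust them for your cluster.

//----
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteJobDriver {

    // Placeholder mapper: emits (word, 1) for every token in a line.
    public static class MyMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (token.length() == 0) {
                    continue;
                }
                word.set(token);
                context.write(word, ONE);
            }
        }
    }

    // Placeholder reducer: sums the counts per word.
    public static class MyReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same effect as core-site.xml / mapred-site.xml on the classpath;
        // replace the host:port values with those of your cluster.
        conf.set("fs.default.name", "hdfs://namenode-host:9000");
        conf.set("mapred.job.tracker", "jobtracker-host:9001");

        Job job = new Job(conf, "remote word count");
        // Any class that lives in the job jar (and nowhere higher up on the
        // classpath) will do; Hadoop ships that jar to the cluster.
        job.setJarByClass(RemoteJobDriver.class);
        job.setMapperClass(MyMapper.class);
        job.setCombinerClass(MyReducer.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("input-dir"));
        FileOutputFormat.setOutputPath(job, new Path("output-dir"));

        // waitForCompletion(true) blocks (printing progress) until the job
        // finishes, so the next Job in a chain can be submitted afterwards.
        if (!job.waitForCompletion(true)) {
            System.exit(1);
        }
        // ... configure and run the next Job here in the same way ...
    }
}
//----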