Hi, thanks for your reply. In my case I have a driver that calls multiple jobs one after the other. I am using the following code to submit each job, but it uses the local Hadoop jar files that are on the classpath. It's not submitting the job to the Hadoop cluster. I thought I would need to specify where the Hadoop master is located on the remote machine. An example command I use from the command line is as follows, but I need to do the same from my Java program.
$ hadoop-0.20.2/bin/hadoop jar /home/ppeddi/dev/Merchandising/RelevancyEngine/relevancy-core/dist/Relevancy4.jar -i raw-downloads-input-10K -o reco-patterns-output-10K-1S -k 100 -method mapreduce -g 500 -regex '[\ ]' -s 5

I hope I made the question clear now.

Praveen

________________________________
From: ext Henning Blohm [mailto:henning.bl...@zfabrik.de]
Sent: Monday, November 22, 2010 5:07 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Starting a Hadoop job programtically

Hi Praveen,

we do. We are using the "new" org.apache.hadoop.mapreduce.* API in Hadoop 0.20.2. Essentially the flow is:

//----
// assuming all config is on the class path
Configuration config = new Configuration();
Job job = new Job(config, "some job name");

// set in/out types
job.setInputFormatClass(...);
job.setOutputFormatClass(...);
job.setMapOutputKeyClass(...);
job.setMapOutputValueClass(...);
job.setOutputKeyClass(...);
job.setOutputValueClass(...);

// set implementations as required
job.setMapperClass(<your mapper implementation class object>);
job.setCombinerClass(<your combiner implementation class object>);
job.setReducerClass(<your reducer implementation class object>);

// set the jar... this is often the tricky part!
job.setJarByClass(<some class that is in the job jar and not elsewhere higher up on the class path>);

job.submit();
//----

Hope I didn't forget anything.

Note: You need to give Hadoop something it can launch in a JVM that has no more than the Hadoop jars and whatever else you configured statically in your hadoop-env.sh script.

Can you describe your scenario in more detail?

Henning

On Monday, 22.11.2010, at 22:39 +0100, praveen.pe...@nokia.com wrote:

Hi all,

I am trying to figure out how I can start a Hadoop job programmatically from my Java application running in an app server. I was able to run my MapReduce job using the hadoop command from the Hadoop master machine, but my goal is to run the same job from my Java program (running on a different machine than the master).
I googled and could not find a solution for this. All the examples I have seen so far use hadoop from the command line to start a job.

1. Has anyone invoked a Hadoop job from a Java application?
2. If so, could someone provide some sample code?

Thanks,
Praveen

Henning Blohm
ZFabrik Software KG
henning.bl...@zfabrik.de
www.z2-environment.eu
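Putting the thread's pieces together, below is a minimal sketch of a driver that submits to a remote 0.20.x cluster instead of the local runner. This is an illustration, not code from the thread: the host names, ports, job name, and the `RemoteJobDriver`/`MyMapper`/`MyReducer` classes are placeholders. The two keys `fs.default.name` and `mapred.job.tracker` are the 0.20.x configuration names; when they are left at their defaults (no cluster config on the classpath), the job runs in-process via the local job runner, which matches the symptom described above.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteJobDriver {

    // Trivial stand-ins; in practice these are the real mapper/reducer
    // classes packaged inside the job jar.
    public static class MyMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit each input line with a count of 1.
            context.write(value, new IntWritable(1));
        }
    }

    public static class MyReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                              Context context)
                throws IOException, InterruptedException {
            // Sum the counts per key.
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Point the client at the remote cluster instead of the local
        // defaults. "master" and the ports are placeholders; use the
        // values from the cluster's core-site.xml and mapred-site.xml.
        config.set("fs.default.name", "hdfs://master:9000");
        config.set("mapred.job.tracker", "master:9001");

        Job job = new Job(config, "example job");
        // Ships the jar containing this class to the cluster.
        job.setJarByClass(RemoteJobDriver.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // waitForCompletion(true) blocks and reports progress, which makes
        // it easy for a driver to chain the next job once this one ends.
        if (!job.waitForCompletion(true)) {
            throw new RuntimeException("Job failed: " + job.getJobName());
        }
    }
}
```

An alternative to hard-coding the two keys is to put the cluster's core-site.xml and mapred-site.xml on the client's classpath, in which case `new Configuration()` picks them up automatically, as Henning's "assuming all config is on the class path" comment assumes.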