Hi Praveen,

  Yes, we do. We are using the "new" org.apache.hadoop.mapreduce.* API
in Hadoop 0.20.2.

  Essentially the flow is:

  //----
  // assuming all cluster config is on the class path
  // (imports: org.apache.hadoop.conf.Configuration,
  //  org.apache.hadoop.mapreduce.Job)
  Configuration config = new Configuration();
  Job job = new Job(config, "some job name");

  // set in/out formats and key/value types
  job.setInputFormatClass(...);
  job.setOutputFormatClass(...);
  job.setMapOutputKeyClass(...);
  job.setMapOutputValueClass(...);
  job.setOutputKeyClass(...);
  job.setOutputValueClass(...);

  // set implementations as required
  job.setMapperClass(<your mapper implementation class object>);
  job.setCombinerClass(<your combiner implementation class object>);
  job.setReducerClass(<your reducer implementation class object>);

  // set the jar... this is often the tricky part! pass a class that
  // lives in the job jar itself and not elsewhere higher up on the
  // class path
  job.setJarByClass(<some class from the job jar>);

  // submit() returns immediately; waitForCompletion(true) would block
  job.submit();
  //----

Hope I didn't forget anything.  
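
For concreteness, here is what the template might look like with the
blanks filled in. This is just a sketch: MyMapper and MyReducer are
hypothetical classes from your own job jar, and the Text/IntWritable
types, formats, and paths are example choices, not requirements.

  //----
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
  import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

  public class JobLauncher {
    public static void main(String[] args) throws Exception {
      Configuration config = new Configuration();
      Job job = new Job(config, "word count");

      job.setInputFormatClass(TextInputFormat.class);
      job.setOutputFormatClass(TextOutputFormat.class);
      job.setMapOutputKeyClass(Text.class);
      job.setMapOutputValueClass(IntWritable.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);

      job.setMapperClass(MyMapper.class);     // hypothetical
      job.setCombinerClass(MyReducer.class);  // hypothetical
      job.setReducerClass(MyReducer.class);   // hypothetical

      // the new API still needs input/output paths set explicitly
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));

      // JobLauncher lives in the job jar, so this resolves the jar
      job.setJarByClass(JobLauncher.class);

      job.submit();
    }
  }
  //----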

Note: You need to give Hadoop something it can launch in a JVM that has
nothing on its class path but the Hadoop jars and whatever else you
configured statically in your hadoop-env.sh script.
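
If the cluster config is not on the app server's class path, you can
also point the Configuration at the cluster programmatically. A minimal
sketch, assuming the Hadoop 0.20.x property names and a master host
reachable from the app server (host name and ports are examples only):

  //----
  Configuration config = new Configuration();
  config.set("fs.default.name", "hdfs://master:9000");  // NameNode
  config.set("mapred.job.tracker", "master:9001");      // JobTracker
  Job job = new Job(config, "job submitted from the app server");
  //----

The rest of the setup is the same as above; setJarByClass() still
matters, because Hadoop needs to find and ship your job jar to the
cluster.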

Can you describe your scenario in more detail?

Henning


On Monday, 22.11.2010, at 22:39 +0100, praveen.pe...@nokia.com wrote:
> Hi all,
> I am trying to figure out how I can start a Hadoop job
> programmatically from my Java application running in an app server. I
> was able to run my map-reduce job using the hadoop command from the
> Hadoop master machine, but my goal is to run the same job from my
> Java program (running on a different machine than the master). I
> googled and could not find a solution for this. All the examples I
> have seen so far use hadoop from the command line to start a job.
>  
> 1. Has anyone called Hadoop job invocation from a Java application?
> 2. If so, could someone provide some sample code.
>  
> Thanks
> Praveen
>  

Henning Blohm
ZFabrik Software KG
henning.bl...@zfabrik.de
www.z2-environment.eu

