Take a look at Yahoo's Oozie. It's fairly trivial to build a workflow for a MapReduce job and submit it via the web service for processing, and it's also a lot easier than using ProcessBuilder.
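For reference, a minimal Oozie workflow with a single map-reduce action might look roughly like the sketch below. The node names, the mapper class, and the parameterized ${jobTracker}/${nameNode} values are illustrative placeholders, not taken from this thread:

```xml
<workflow-app xmlns="uri:oozie:workflow:0.1" name="mr-wf">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.mapper.class</name>
          <value>package1.MyMapper</value>
        </property>
        <!-- reducer, input/output paths, etc. go here as further properties -->
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>MR failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

The workflow is then submitted over Oozie's HTTP interface, so the web application never has to fork a hadoop process itself.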
Jon.

On 24 Jun 2011, at 22:47, Andre Reiter <a.rei...@web.de> wrote:

> Hi Doug,
>
> thanks a lot for your reply
> the point is clear how to create a job instance and to configure it using
> TableMapReduceUtil.initTableMapperJob
> actually our job is working just perfectly, even the third party libs are
> simple to import using TableMapReduceUtil.addDependencyJars
>
> the problem is about starting the MR job...
>
> at the moment we do it this way:
> - set HADOOP_CLASSPATH with hbase, zookeeper, and all third party jars
> - execute "./bin/hadoop jar /tmp/map_reduce_v1.jar package1.MRDriver1"
>
> that works like a charm; the question is now, how to start the job from our
> web application running on tomcat?
>
> one option may be to fork a new process, like this:
> ProcessBuilder pb = new ProcessBuilder("/opt/hadoop/bin/hadoop", "jar",
>     "/tmp/map_reduce_v1.jar", "package1.MRDriver1");
> ...
> // configure ProcessBuilder
> Process p = pb.start();
>
> but this does not seem very elegant to us... does it?
>
> so how to start a job from a running app, in the same process, without forking?
>
> andre
>
>
> Doug Meil wrote:
>>
>> Hi there-
>>
>> Take a look at this for starters...
>>
>> http://hbase.apache.org/book.html#mapreduce
>>
>> if you do job.waitForCompletion(true); it will execute synchronously. If
>> you do job.waitForCompletion(false) it will fire and forget. A simple
>> pattern is to spin off a thread where it executes job.waitForCompletion(true)
>> and then you can pick up the results.
>>
>> -----Original Message-----
>> From: Andre Reiter [mailto:a.rei...@web.de]
>> Sent: Friday, June 24, 2011 12:41 AM
>> To: user@hbase.apache.org
>> Subject: Re: Running MapReduce from a web application
>>
>> Hi everybody,
>>
>> no suggestions about these questions?
>> how to submit an MR job out of my application, and not manually from a shell
>> using ./bin/hadoop jar ... ?
>>
>> best regards
>> andre
>>
>>
>> Andre Reiter wrote:
>>> now i would like to start MR jobs from my web application running on
>>> tomcat, is there an elegant way to do it?
>>>
>>> the second question: at the moment i use TextOutputFormat as the
>>> output format, which creates a file in the specified dfs directory:
>>> part-r-00000, so i can read it using ./bin/hadoop fs -cat
>>> /tmp/requests/part-r-00000 on the shell
>>>
>>> how can i get the path to this output file after my job is finished, to
>>> process it further... is there another way to collect the results of a MR
>>> job? a text file is good for humans, but IMHO parsing a text file for
>>> results is not the preferable way...
>>>
>>> thanks in advance
>>> andre
>>>
>>> PS:
>>> versions:
>>> - Linux version 2.6.26-2-amd64 (Debian 2.6.26-25lenny1)
>>> - hadoop-0.20.2-CDH3B4
>>> - hbase-0.90.1-CDH3B4
>>> - zookeeper-3.3.2-CDH3B4
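Doug's pattern from the thread (spin off a thread that runs job.waitForCompletion(true), then pick up the result) can be sketched as below. The Hadoop-specific calls are shown only as comments, since they require the cluster configuration and the hbase/hadoop jars on the classpath; the placeholder return value stands in for the job's success flag:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Main {
    public static void main(String[] args) throws Exception {
        // Run the job from a worker thread so the Tomcat request
        // thread is not blocked while the MR job executes.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Boolean> result = pool.submit(() -> {
            // Job job = new Job(conf, "my-mr-job");
            // TableMapReduceUtil.initTableMapperJob(...);   // as in the thread
            // return job.waitForCompletion(true);           // blocks this thread only
            return true;  // placeholder for the job's success flag
        });

        // Later (e.g. on a status request) collect the outcome; get()
        // blocks only if the job has not finished yet.
        Boolean ok = result.get();
        // After success, the output could be read back via the HDFS API, e.g.
        // FileSystem.get(conf).open(new Path("/tmp/requests/part-r-00000"))
        // instead of parsing it from the shell.
        System.out.println("job succeeded: " + ok);
        pool.shutdown();
    }
}
```

This keeps everything in the same JVM as the web application, with no forked hadoop process; the trade-off is that the application itself must carry the job's jars and configuration.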