Have you looked at the PigServer class?

http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/PigServer.html

Have you looked at the Job class in Hadoop?

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html

I assume you'll want to do this asynchronously?
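
If so, here is a minimal sketch of one way to keep the servlet's request thread from blocking while the job runs. It uses only java.util.concurrent. The class AsyncJobRunner and its methods are hypothetical names of my own, not part of Hadoop, Pig, or the Servlet API. The Callable you pass in is a placeholder for whatever actually launches the job and collects its output.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (an assumption for illustration): runs long jobs on a
// small background pool so the calling thread can return right away.
class AsyncJobRunner {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Queue the work and return immediately with an id the client can poll.
    public String submit(Callable<String> work) {
        String id = "job-" + nextId.incrementAndGet();
        jobs.put(id, pool.submit(work));
        return id;
    }

    // True once the work has finished, whether it succeeded or failed.
    public boolean isDone(String id) {
        Future<String> f = jobs.get(id);
        return f != null && f.isDone();
    }

    // Blocks until the work finishes; a failure inside the work surfaces
    // as an ExecutionException.
    public String result(String id) throws Exception {
        Future<String> f = jobs.get(id);
        if (f == null) {
            throw new IllegalArgumentException("unknown job: " + id);
        }
        return f.get();
    }

    // Stop accepting new work; already submitted jobs still run to completion.
    public void shutdown() {
        pool.shutdown();
    }
}
```

One servlet request would call submit() and hand the id back to the browser. A later request would check isDone(id) and, once it returns true, call result(id) to render the output.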

On Jul 1, 2013, at 7:31 PM, Huy Pham <[email protected]> wrote:

> I have a tomcat server with several servlets, a mapreduce job (written using 
> hadoop), and pig installed, all sitting in the same cluster as hadoop.
> 
> Now I need my servlet to be able to execute a mapreduce program (or a pig 
> script), and display the results returned by the mapreduce program. Is there 
> any way to make a servlet execute a mapreduce job and get back the results?
> 
> ++ I think it is possible to make my servlet execute a mapreduce job (or a 
> pig script) by simply calling exec or ProcessBuilder. If I am wrong, please 
> correct me here.
> 
> ++ However, a mapreduce job (or a pig script) produces results in HDFS, which 
> is where I am unsure about how to get back the results and feed them back to 
> the servlet. One solution, which seems to be amateur and inefficient to me, 
> is to use ProcessBuilder (or exec) again to copy results from HDFS to local, 
> and read results from there.
> 
> Would very much appreciate any suggestion you might share.
