Have you looked at the PigServer class? http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/PigServer.html
Have you looked at the Job class in Hadoop? http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html

I assume you'll want to do this asynchronously?

On Jul 1, 2013, at 7:31 PM, Huy Pham <[email protected]> wrote:

> I have a tomcat server, have several servlets, a mapreduce job (written using
> hadoop), also have pig installed, all sit in the same cluster as where hadoop
> is.
>
> Now I need my servlet to be able to execute a mapreduce program (or a pig
> script), and display the results returned by the mapreduce program. Is there
> anyway to make a servlet to execute a mapreduce job and get back the results?
>
> ++ I think it is possible to make my servlet execute a mapreduce job (or a
> pig script) by simply calling exec or ProcessBuilder. If I am wrong, please
> correct me here.
>
> ++ However, a mapreduce job (or a pig script) produces results in HDFS, which
> is where I am unsure about how to get back the results and feed them back to
> the servlet. One solution, which seems to be amateur and inefficient to me,
> is to use ProcessBuilder (or exec) again to copy results from HDFS to local,
> and read results from there.
>
> Would very much appreciate any suggestion you might share.
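
On the asynchronous point: a MapReduce job or Pig script usually runs far longer than an HTTP request should block, so one common pattern is for the servlet to start the work on a background thread, hand the client an id, and let the client poll for the outcome. Below is a minimal sketch of that pattern using only the JDK. The class `AsyncJobRegistry` and its methods are hypothetical names made up for illustration, not part of Hadoop, Pig, or the servlet API. The `Callable` is a stand-in: in a real servlet it would presumably drive the job in-process (for example with `Job.waitForCompletion(true)`) and then read the output files straight from HDFS with `FileSystem.open(...)` instead of copying them to local disk first.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: lets a servlet start long-running work and return at once. */
class AsyncJobRegistry {
    // Daemon threads so a forgotten shutdown() does not keep the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "job-runner");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();

    /** Starts the task in the background and returns an id the client can poll with. */
    String submit(Callable<String> task) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, pool.submit(task));
        return id;
    }

    /** Non-blocking: "UNKNOWN", "RUNNING", "FAILED", or the task's result text. */
    String poll(String id) {
        Future<String> f = jobs.get(id);
        if (f == null) return "UNKNOWN";
        if (!f.isDone()) return "RUNNING";
        return awaitResult(id);
    }

    /** Blocking: waits for the task, then returns its result text or "FAILED". */
    String awaitResult(String id) {
        Future<String> f = jobs.get(id);
        if (f == null) return "UNKNOWN";
        try {
            return f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "FAILED";
        } catch (ExecutionException e) {
            return "FAILED"; // the task itself threw
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

A "start" servlet would call `submit(...)` and return the id; a "status" servlet would call `poll(id)` on each request and render the text once it is no longer `"RUNNING"`. Mixing status strings and results in one return value keeps the sketch short; a real implementation would likely separate them.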
