I have a Tomcat server running several servlets, and a MapReduce job (written 
with Hadoop). Pig is also installed. Everything sits on the same cluster as 
Hadoop.

Now I need one of my servlets to execute the MapReduce program (or a Pig 
script) and display the results it returns. Is there any way to make a servlet 
execute a MapReduce job and get back the results?

++ I think it is possible to make my servlet execute a MapReduce job (or a Pig 
script) by simply calling Runtime.exec or using ProcessBuilder. If I am wrong, 
please correct me here.
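
To make this concrete, here is a minimal sketch of what I have in mind. The 
class name JobLauncher and the method run are hypothetical names made up for 
illustration, and the sketch assumes the hadoop (or pig) launcher is on the 
PATH of the user running Tomcat. It starts the command, collects whatever it 
prints, and fails if the exit code is non-zero.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop or Tomcat): runs an external
// command such as the hadoop or pig launcher and captures its console output.
final class JobLauncher {

    // Runs the command, waits for it to finish, and returns the lines it
    // printed. stderr is merged into stdout so the child cannot block on a
    // full, unread error stream.
    static List<String> run(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        Process process = pb.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(),
                        StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }

        // Only reached after the output stream is closed, i.e. the child is
        // done writing; waitFor then returns its exit code.
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command failed with exit code "
                    + exitCode + ": " + command);
        }
        return lines;
    }
}
```

From a servlet I would then call something like 
JobLauncher.run(Arrays.asList("hadoop", "jar", "/path/to/myjob.jar", "MyJob", 
"/input", "/output")), where the jar path, class name, and HDFS directories 
are placeholders. Note that this blocks the request thread until the job 
finishes, which may be too long for an interactive page.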

++ However, a MapReduce job (or a Pig script) writes its results to HDFS, and 
I am unsure how to get those results back to the servlet. One solution, which 
seems amateurish and inefficient to me, is to use ProcessBuilder (or exec) 
again to copy the results from HDFS to the local filesystem and read them 
from there.
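
For reference, the copy-then-read approach I am describing would look roughly 
like the sketch below. HdfsResultFetcher and its methods are hypothetical 
names, and the sketch assumes the hadoop command is on the PATH. It uses 
"hadoop fs -getmerge", which concatenates the part files in an HDFS output 
directory into a single local file, and then reads that file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: copies a job's output directory from HDFS to one
// local file via the hadoop CLI, then reads that file into memory.
final class HdfsResultFetcher {

    // Builds the CLI call; -getmerge merges all files under the HDFS
    // directory into the given local destination file.
    static List<String> buildFetchCommand(String hdfsOutputDir, String localFile) {
        return Arrays.asList("hadoop", "fs", "-getmerge", hdfsOutputDir, localFile);
    }

    // Runs the copy and returns the merged result lines. Assumes the result
    // is small enough to hold in memory.
    static List<String> fetch(String hdfsOutputDir, Path localFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                buildFetchCommand(hdfsOutputDir, localFile.toString()))
                .inheritIO() // child output goes to the server's console/log
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("hadoop fs -getmerge failed with exit code "
                    + exitCode);
        }
        return Files.readAllLines(localFile, StandardCharsets.UTF_8);
    }
}
```

This costs an extra process launch and a temporary local file per request, 
which is the part that feels inefficient to me.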

Would very much appreciate any suggestion you might share.
