Hi Huy,

On Thu, Jun 25, 2009 at 6:02 PM, Huy Phan <dac...@gmail.com> wrote:
> I'm wondering if there's any performance killer in this approach. I posted
> the question to the IRC channel and someone told me that there may be a
> bottleneck.
There may be communication errors that block your MapReduce job while it posts its output data, so I think it's better to do this after the job is done.

> I wonder if there is any way to spawn a process directly from Hadoop after
> all the MapReduce tasks finish?

How do you submit your jobs? You can make job submission block by calling job.waitForCompletion(true) in your main driver class. Then the two steps run synchronously: the driver only continues (and can spawn your follow-up process) after the job has finished.

--
Zhong Wang
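P.S. A minimal sketch of the pattern above. The Hadoop-specific part is shown in comments; runJob() is a hypothetical stand-in for job.waitForCompletion(true), and the echo command stands in for whatever post-processing you want to spawn:

```java
import java.io.IOException;

public class PostJobLauncher {
    public static void main(String[] args)
            throws IOException, InterruptedException {
        // In a real driver you would build an
        // org.apache.hadoop.mapreduce.Job and call
        // boolean success = job.waitForCompletion(true);
        // which blocks until all map and reduce tasks finish.
        boolean success = runJob();

        if (success) {
            // Only reached after the job is done, so it is safe
            // to spawn the follow-up process here.
            Process p = new ProcessBuilder("echo", "post-processing output")
                    .inheritIO()
                    .start();
            int exitCode = p.waitFor();
            System.out.println("post-process exit code: " + exitCode);
        }
    }

    // Hypothetical placeholder for job.waitForCompletion(true).
    private static boolean runJob() {
        return true;
    }
}
```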