Hi Zhou,

I looked at the source code; it seems it is the JobTracker that initiates the setup and cleanup tasks. And why do you think the setup and cleanup phases consume a lot of time? The time cost actually depends on the OutputCommitter.
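For instance, the default FileOutputCommitter spends its setup/cleanup creating and then promoting or deleting a temporary output directory on HDFS. A committer that does no such work makes those phases nearly free. Here is a minimal sketch against the 0.20-era org.apache.hadoop.mapreduce API (the class name NoOpOutputCommitter is my own, not something shipped with Hadoop); you would return it from your OutputFormat's getOutputCommitter():

import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class NoOpOutputCommitter extends OutputCommitter {
  @Override
  public void setupJob(JobContext jobContext) throws IOException {
    // nothing to prepare: no temporary output directory is created
  }

  @Override
  public void cleanupJob(JobContext jobContext) throws IOException {
    // nothing to promote or delete, so the cleanup task finishes quickly
  }

  @Override
  public void setupTask(TaskAttemptContext taskContext) throws IOException {
    // no per-task side files to set up
  }

  @Override
  public boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException {
    return false;  // no per-task output to commit
  }

  @Override
  public void commitTask(TaskAttemptContext taskContext) throws IOException { }

  @Override
  public void abortTask(TaskAttemptContext taskContext) throws IOException { }
}

So the setup/cleanup phases themselves are mostly a fixed scheduling overhead; how long they actually run is up to whatever the job's committer does in setupJob()/cleanupJob().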
On Thu, Mar 11, 2010 at 11:04 AM, Min Zhou <coderp...@gmail.com> wrote:
> Hi all,
>
> Why do Hadoop jobs need setup and cleanup phases, which consume a lot of
> time? Why couldn't we handle it the way a distributed RDBMS does, with a
> master process coordinating all slave nodes through sockets?
> I think that would save plenty of time if there were no setups and
> cleanups. What's Hadoop's philosophy on this?
>
> Thanks,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com

--
Best Regards
Jeff Zhang