From our test of hadoop-0.20.1 on 10 nodes, we find that the setup period gets longer as more jobs are submitted. I don't know why a map task is needed for setup. Why can't the JobTracker, or a single thread, take over this work?
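
To make the point quoted below about the OutputCommitter concrete, here is a minimal plain-Java sketch of the job lifecycle as I understand it: a setup task runs the committer's job-level setup hook, the map tasks run, then a cleanup task runs the job-level cleanup hook. The names `Committer`, `TempDirCommitter`, `NoOpCommitter` and `runJob` are hypothetical stand-ins for illustration, not Hadoop classes; the real hooks are `setupJob` and `cleanupJob` on `OutputCommitter`. The sketch only models the ordering and says nothing about scheduling delay.

```java
import java.util.ArrayList;
import java.util.List;

public class SetupCleanupSketch {
    /** Hypothetical stand-in for the job-level hooks of Hadoop's OutputCommitter. */
    interface Committer {
        void setupJob(List<String> log);
        void cleanupJob(List<String> log);
    }

    /** Models a committer that prepares and later removes a temporary output directory. */
    static class TempDirCommitter implements Committer {
        public void setupJob(List<String> log) {
            log.add("setupJob: create temporary output dir");
        }
        public void cleanupJob(List<String> log) {
            log.add("cleanupJob: delete temporary output dir");
        }
    }

    /** Models a committer with nothing to prepare or remove. */
    static class NoOpCommitter implements Committer {
        public void setupJob(List<String> log) { }
        public void cleanupJob(List<String> log) { }
    }

    /** Runs the modeled lifecycle and returns the recorded events in order. */
    static List<String> runJob(Committer committer, int numMaps) {
        List<String> log = new ArrayList<>();
        committer.setupJob(log);            // what the setup task would invoke
        for (int i = 0; i < numMaps; i++) {
            log.add("map " + i);            // placeholder for the real map work
        }
        committer.cleanupJob(log);          // what the cleanup task would invoke
        return log;
    }

    public static void main(String[] args) {
        for (String line : runJob(new TempDirCommitter(), 2)) {
            System.out.println(line);
        }
    }
}
```

In this model the work done inside setup and cleanup is entirely up to the committer, so a committer with empty hooks adds no work of its own. Any delay that remains would come from how the setup and cleanup tasks are scheduled, which is the part I am asking about.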

2010/3/11 Jeff Zhang <zjf...@gmail.com>

> Hi Zhou,
>
> I looked at the source code, and it seems it is the JobTracker that
> initiates the setup and cleanup tasks.
> And why do you think the setup and cleanup phases consume a lot of time?
> Actually, the time cost depends on the OutputCommitter.
>
> On Thu, Mar 11, 2010 at 11:04 AM, Min Zhou <coderp...@gmail.com> wrote:
>
> > Hi all,
> >
> > Why do Hadoop jobs need setup and cleanup phases, which consume a
> > lot of time? Why can't we achieve this the way a distributed RDBMS
> > does, where a master process coordinates all slave nodes through sockets?
> > I think it would save plenty of time if there were no setups and
> > cleanups. What's Hadoop's philosophy on this?
> >
> > Thanks,
> > Min
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode-based virtual machines.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
>
> --
> Best Regards
>
> Jeff Zhang