Packaging the job and its configuration and shipping them to the JobTracker and the various nodes also adds a few seconds of overhead.
On Thu, Apr 8, 2010 at 10:37 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> By default, hadoop will create a new JVM process for each task, which will
> be the major cost in my opinion. You can customize the configuration to let
> the tasktracker reuse the JVM and eliminate that overhead to some extent.
>
> On Thu, Apr 8, 2010 at 8:55 PM, Aleksandar Stupar <
> stupar.aleksan...@yahoo.com> wrote:
>
>> Hi all,
>>
>> As I understand it, hadoop is mainly used for tasks that take a long
>> time to execute. I'm considering using hadoop for a task
>> whose lower bound in distributed execution is around 5 to 10
>> seconds. I'm wondering what the overhead of using hadoop
>> would be.
>>
>> Does anyone have an idea? Any link where I can find this out?
>>
>> Thanks,
>> Aleksandar.
>
> --
> Best Regards
>
> Jeff Zhang
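For reference, the JVM reuse Jeff mentions is controlled (in the classic MapReduce framework of that era) by the mapred.job.reuse.jvm.num.tasks property; a sketch of the entry, which can go in mapred-site.xml or be set per job:

```xml
<!-- mapred-site.xml (or per-job configuration) -->
<!-- Maximum number of tasks a single JVM may run in sequence for a job.
     The default is 1 (a fresh JVM per task); -1 means reuse without limit. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
```

Whether this helps much for 5-10 second tasks would need measuring, since job submission and scheduling overhead remain either way.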