This is quite expected on Hadoop. The overhead of small tasks is very high - it's been improved a bit over time but it's still a minimum of around 24 seconds even for a no-op job.
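
As a rough illustration of that fixed cost, here is a toy model comparing a local run with a job that pays the startup overhead before any work begins. Only the ~24 second figure comes from the note above; the processing rate, the slot count, and the function names are made-up assumptions for the sake of the arithmetic, and the model ignores per-task scheduling, shuffle, and I/O costs.

```python
# Toy cost model: fixed job overhead vs. parallel speedup.
# Only JOB_OVERHEAD_S reflects the figure quoted above; the other
# constants are illustrative assumptions, not measurements.

JOB_OVERHEAD_S = 24.0   # approximate fixed cost of submitting any job
LOCAL_RATE = 1_000.0    # records/second in a single local process (assumed)
CLUSTER_SLOTS = 20      # number of parallel task slots (assumed)


def local_seconds(records):
    """Time to process everything in one local process, no job overhead."""
    return records / LOCAL_RATE


def cluster_seconds(records):
    """Time on the cluster: fixed overhead plus perfectly parallel work."""
    return JOB_OVERHEAD_S + records / (LOCAL_RATE * CLUSTER_SLOTS)


def break_even_records():
    """Input size at which both approaches take the same time.

    Solves records / r = overhead + records / (r * n) for records.
    """
    return JOB_OVERHEAD_S * LOCAL_RATE * CLUSTER_SLOTS / (CLUSTER_SLOTS - 1)


if __name__ == "__main__":
    for n in (3_000, 100_000, 10_000_000):
        print(n, round(local_seconds(n), 2), round(cluster_seconds(n), 2))
    print("break-even records:", round(break_even_records()))
```

Under these assumed numbers the cluster only wins once the input is large enough that the parallel speedup outweighs the fixed overhead; below that point the overhead dominates the wall-clock time.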
Hadoop scales up but does not scale down to minuscule datasets.

-Todd

On Mon, Sep 27, 2010 at 12:10 PM, Pete Tyler <[email protected]> wrote:

> Thanks for the offer, much appreciated I have a very simple mapreduce job
> on a pseudo distributed system. I have a very small amount of persisted
> data.
>
> Running locally the mapreduce job runs very quickly, less than three
> seconds.
>
> When I run the job against the pseudo distributed hadoop, still on the same
> machine, as the client then I see the following,
> - the map and reduce classes run very quickly, a matter of mills in total
>   ... sweet
> - the client, blocks waiting for the job to finish for about 20 seconds ...
>   very slow
>
> I'm trying to understand why I have this 20 second overhead and what I can
> do about it.
>
> My map and reduce classes are in my Hadoop classpath.
>
> On Sep 27, 2010, at 11:32 AM, Jean-Daniel Cryans <[email protected]>
> wrote:
>
> > Using 0.21.0 may reveal newer bugs rather than fixing your older ones.
> > Maybe we can help you debugging 0.20.2, what are you seeing?
> >
> > J-D
> >
> > On Sun, Sep 26, 2010 at 7:03 PM, Pete Tyler <[email protected]> wrote:
> >> I believe my issue is within Hadoop, not HBase, and as such I was hoping
> >> to run with the very latest version of Hadoop before putting in serious
> >> debugging time. If I am reading the docs correctly it looks like there is
> >> currently no version if HBase that allows me to use Hadoop 0.21.
> >>
> >> On Sep 25, 2010, at 3:27 PM, Ryan Rawson <[email protected]> wrote:
> >>
> >>> No You can not. I would recommend against it as well... if you want an
> >>> upgrade check out cdh3b2 or hadoop-append branch which has patches for
> >>> data durability. The RC of 89 is very usable as well.
> >>> On Sep 25, 2010 1:35 PM, "Pete Tyler" <[email protected]> wrote:
> >>>>
> >>>> Apologies if I've missed this information elsewhere but I'm unclear if I
> >>>> can upgrade to Hadoop 0.21 while still running HBase 0.20.6.
> >>>> --
> >>>> View this message in context:
> >>>> http://old.nabble.com/Can-I-run-HBase-0.20.6-on-Hadoop-0.21--tp29808199p29808199.html
> >>>> Sent from the HBase User mailing list archive at Nabble.com.
> >>>>
> >>
>

--
Todd Lipcon
Software Engineer, Cloudera
