Re: HDFS on trunk is now quite slow

Todd Lipcon Wed, 06 Jul 2011 10:20:40 -0700

On Wed, Jul 6, 2011 at 10:16 AM, Eric Payne <er...@yahoo-inc.com> wrote:


> No, it's not that easy to set up the environment. I have 10 nodes, 1 is the
> NN and 9 are running MiniDFSCluster to simulate about 100 datanodes each.
> There are a lot of specific configurations that need to be set, as well as
> the HDFS-1875 patch to MiniDFSCluster.
>
> Once the environment is set up, the
> org.apache.hadoop.fs.loadGenerator.LoadGenerator test program
> (hadoop-hdfs-test.jar) is run on the client with 6 threads that randomly
> create, write, read, and delete 'simulated' files.
>

Gotcha. Another possibility is the protocol-buffer based data pipeline, which
is in trunk but which I don't think has been merged into yahoo-merge. I only
measured a ~3% increase in CPU usage in my tests for that, which wouldn't
explain what you're seeing. But you could try selectively reverting that
patch as well to see if it's the cause.

-Todd


> > -----Original Message-----
> > From: Todd Lipcon [mailto:t...@cloudera.com]
> > Sent: Wednesday, July 06, 2011 11:12 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: HDFS on trunk is now quite slow
> >
> > On Wed, Jul 6, 2011 at 9:00 AM, Eric Payne <er...@yahoo-inc.com> wrote:
> >
> > > I will attempt to recreate the tests on 20.203.
> > >
> > > Currently, I'm comparing trunk against branches/MR-279/, and trunk is
> > > many times slower. I have run several tests (45 or 50) with different
> > > variables, and they all seem to be slower on trunk.
> > >
> > > Just for example, in one test here are my findings:
> > >
> > > Operation                       Trunk    branches/MR-279/
> > > ------------------------------  -----    ----------------
> > > Average operations per second:     24      200
> > > Average open execution time:     41ms      5ms
> > > Average deletion time:           43ms      5ms
> > > Average creation time:           47ms      9ms
> > > Average write close time:       658ms    100ms
> > >
> > >
> > Seems pretty bad. Which test case is it that you're running? Something
> > that's easy for others to reproduce? How many concurrent threads access
> > the NN? If you jstack the NN do you see some particular lock causing lots
> > of contention?
> >
> > -Todd
> >
> >
> > > > -----Original Message-----
> > > > From: Todd Lipcon [mailto:t...@cloudera.com]
> > > > Sent: Wednesday, July 06, 2011 10:26 AM
> > > > To: hdfs-dev@hadoop.apache.org
> > > > Subject: Re: HDFS on trunk is now quite slow
> > > >
> > > > On Wed, Jul 6, 2011 at 6:54 AM, Eric Payne <er...@yahoo-inc.com> wrote:
> > > >
> > > > > Thanks Todd.
> > > > >
> > > > > Yes, the stress test is NN-only. The simulated datanodes (using
> > > > > MiniDFSCluster) don't read or write actual data, only log the
> > > > > metadata.
> > > > >
> > > > > So, it sounds like the slowdown on the NN is to be expected, correct?
> > > > > The race condition I was experiencing before is no longer happening,
> > > > > so the benefit of correct locking has resulted in an acceptable
> > > > > slowdown on the namenode. Is that correct?
> > > > >
> > > >
> > > > How does the slowdown compare to 0.20.203 for example? We may have made
> > > > the locking _too_ coarse -- ie overcompensated for the bug.
> > > >
> > > > -Todd
> > > >
> > > >
> > > > >
> > > > > Thanks,
> > > > > -Eric
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Todd Lipcon [mailto:t...@cloudera.com]
> > > > > > Sent: Friday, July 01, 2011 7:49 PM
> > > > > > To: hdfs-dev@hadoop.apache.org
> > > > > > Subject: Re: HDFS on trunk is now quite slow
> > > > > >
> > > > > > My guess is HDFS-988 caused the slowdown by coarsening some locking
> > > > > > that was previously incorrect. Your stress test is NN-only (metadata
> > > > > > ops), not an I/O benchmark, right? I/O should be faster in trunk
> > > > > > than ever before.
> > > > > >
> > > > > > -Todd
> > > > > >
> > > > > > On Fri, Jul 1, 2011 at 8:23 AM, Eric Payne <eric.payne1...@yahoo.com> wrote:
> > > > > >
> > > > > > > Hi gang,
> > > > > > >
> > > > > > > I ran some stress tests on the latest HDFS trunk yesterday, and the
> > > > > > > performance is a lot slower (sometimes 10 times slower) when compared
> > > > > > > with the HDFS in MR-279. The HDFS in MR-279 is slightly behind trunk.
> > > > > > > The stability of HDFS trunk seems to be better than HDFS MR-279, but
> > > > > > > I'm not sure if the slowness is just avoiding the race conditions or
> > > > > > > if they are actually fixed in trunk.
> > > > > > >
> > > > > > > At this point, I'm not sure what is causing this performance
> > > > > > > disparity. I notice that Block management has recently undergone
> > > > > > > significant changes in trunk. It has some new locking and it is now in
> > > > > > > its own package. Could this be part of the cause?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > -Eric
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Todd Lipcon
> > > > > > Software Engineer, Cloudera
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Todd Lipcon
> > > > Software Engineer, Cloudera
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera
