To clarify, these utilization % numbers were taken while a job that was I/O
bound was running on some number of clients? The server side was/is not CPU
bound, right?

When you LVM'd the two RAIDs together, did you specify the number of stripes
and the stripe width of the logical volume? Specifically, did you use the
--stripes and --stripesize options to lvcreate, or neither? Based on the
behavior you're seeing, I would expect that you did not.
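
For what it's worth, a striped LV would be created with something along these
lines (just a sketch -- the VG name, LV name, and size are placeholders, not
your actual layout):

  lvcreate --stripes 2 --stripesize 1024 -L 8T -n pvfs_data vg_pvfs

--stripesize is in KB there, so 1024 = 1MB stripes. Keep in mind that striping
across two PVs of different sizes caps the striped capacity at roughly twice
the free space on the smaller PV.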

I know you originally said you were getting 30MB/s when doing a dd with a 1MB
block size. Could you run that same test now in a directory with the stripe
size set to 1MB, as I mentioned in previous e-mails?
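
For the dd itself, something like this would do (the mount point and file
name are placeholders for wherever that 1MB-stripe directory ends up):

  dd if=/dev/zero of=/mnt/pvfs2/stripe1m/testfile bs=1M count=1024 conv=fsync

and then read it back with if=/mnt/pvfs2/stripe1m/testfile of=/dev/null for
the read direction. conv=fsync just makes dd flush before it reports the
rate, so the write figure isn't inflated by client-side caching.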

What's the network latency between a compute node and a PVFS server when you
ping it? I would expect something in the ballpark of:
rtt min/avg/max/mdev = 0.126/0.159/0.178/0.019 ms
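
(That's just the summary line you get at the end of something like

  ping -c 20 <pvfs-server-hostname>

run from one of the compute nodes -- the hostname is obviously a placeholder.)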

Michael

On Tue, Oct 11, 2011 at 2:33 PM, Jim Kusznir <[email protected]> wrote:

> I finally did manage to do this, and the results were a bit
> interesting.  First, the highest amount I saw in the %utilization
> column was 16% on one server, and that was only there for 1
> measurement period.  Typical maximums were 7%.
>
> The interesting part was that my second server was rarely over 1%, my
> first server was 4-7% and my 3rd server was 5-9%.
>
> The other interesting part was where the I/O was principally
> happening.  Originally, I had 8TB of 750GB SATA disks (in a hardware
> raid-6), and then I added a second RAID-6 of 2TB disks which has the
> majority of the disk space.  The two are lvm'ed together.  So far,
> nearly all the %utilization numbers were showing up on the 750GB
> disks.
>
> I have been running xfs_fsr to get the fragmentation down.  My 3rd
> node is still at 17%; the first node is at 5%, and the 2nd node is at
> 0.7%.  I've put in a cron job to run xfs_fsr for 4 hours each Sunday
> night starting at midnight (when my cluster is usually idle anyway) to
> try to improve/manage that.  I'm not sure if there is actually a causal
> relationship here, but the load% seems to follow the frag% (higher frag,
> higher load).
>
> Still, the fact that it peaks out so low has me questioning what's going
> on...
>
> Watching it a bit longer into another workload, and I do see %use
> spike up to 35%, but network I/O (as measured by bwm-ng) still peaks
> at 8MB/s on pure gig-e (which should be capable of 90MB/s).
>
> --Jim
>
> On Thu, Oct 6, 2011 at 1:36 PM, Emmanuel Florac <[email protected]>
> wrote:
> > On Wed, 5 Oct 2011 08:44:11 -0700, you wrote:
> >
> >>  I don't
> >> know how to watch actual IOPS or other more direct metrics.
> >
> > Use the iostat command, something like
> >
> > iostat -mx 4
> >
> > you'll have a very detailed report on disk activity. The percentage of
> > usage (last column to the right) might be interesting. Let it run for a
> > while and see if there's a pattern.
> >
> > --
> > ------------------------------------------------------------------------
> > Emmanuel Florac     |   Direction technique
> >                    |   Intellique
> >                    |   <[email protected]>
> >                    |   +33 1 78 94 84 02
> > ------------------------------------------------------------------------
> >
>
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
