To answer Phil's question: yes, restarting IOR alone is enough; PVFS does not 
need to be restarted. For the rest, I'll run some experiments when I have the 
chance and get back to you. 

Thanks all 

Matthieu 

----- Original Message -----

> From: "Becky Ligon" <[email protected]>
> To: "Matthieu Dorier" <[email protected]>
> Cc: "Rob Latham" <[email protected]>, "pvfs2-users"
> <[email protected]>, "ofs-support"
> <[email protected]>
> Sent: Tuesday, April 2, 2013 17:22:17
> Subject: Re: [Pvfs2-users] Strange performance behavior with IOR

> Matthieu:

> Are you seeing 100% CPU utilization on the client? We have seen this
> with the client core (which you are not using) on a multicore system;
> however, both the client core and the PVFS interface use the same
> request structures, etc.

> Becky

> On Tue, Apr 2, 2013 at 11:11 AM, Becky Ligon <[email protected]>
> wrote:

> > Matthieu:
> >
> > I have asked Phil Carns to help you, since he is more familiar with
> > the benchmark and MPI-IO. I think Rob Latham or Rob Ross may be
> > helping too. I will continue to look at your data in the meantime.
> >
> > Becky
> >
> > Phil/Rob:
> >
> > Thanks so much for helping Matthieu. I am digging into the matter,
> > but MPI is still new to me and I'm not familiar with the PVFS
> > interface that accompanies ROMIO.
> >
> > Becky
> >
> > P.S. Can we keep this on the pvfs2-users list so I can see how
> > things progress?
> >
> > On Tue, Apr 2, 2013 at 10:47 AM, Matthieu Dorier
> > <[email protected]> wrote:

> > > Hi Rob and Phil,
> > >
> > > This thread moved to the ofs-support mailing list (probably
> > > because the first person to answer was part of that team), but I
> > > didn't get much of an answer to my problem, so I'll summarize here
> > > what I have done.
> > >
> > > First, to answer Phil: the PVFS config file is attached, and here
> > > is the script file used for IOR:

> > > IOR START
> > > testFile = pvfs2:/mnt/pvfs2/testfileA
> > > filePerProc=0
> > > api=MPIIO
> > > repetitions=100
> > > verbose=2
> > > blockSize=4m
> > > transferSize=4m
> > > collective=1
> > > writeFile=1
> > > interTestDelay=60
> > > readFile=0
> > > RUN
> > > IOR STOP

> > > Besides the tests described in my first mail, I also ran the same
> > > experiments on another cluster, with TCP over IB and then over
> > > Ethernet, with 336 and 672 clients, and with 2, 4, and 8 storage
> > > servers. The behavior appears in every case.

> > > I benchmarked the local disk attached to the storage servers and
> > > got 42 MB/s, so the high throughput of over 2 GB/s obviously
> > > benefits from some caching mechanism, and the periodic behavior
> > > observed at high output frequency could be explained by that. Yet
> > > this does not explain why the performance decreases over time
> > > overall.
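> > >
> > > For what it's worth, here is a minimal sketch of how one could take
> > > the page cache out of such a measurement using O_DIRECT (the path
> > > /tmp/testfile and the sizes are placeholders, not what I actually
> > > ran):
> > >
> > > /* direct_write.c: time writes that bypass the page cache (sketch;
> > >  * path and sizes are hypothetical) */
> > > #define _GNU_SOURCE
> > > #include <fcntl.h>
> > > #include <stdio.h>
> > > #include <stdlib.h>
> > > #include <string.h>
> > > #include <time.h>
> > > #include <unistd.h>
> > >
> > > int main(void)
> > > {
> > >     const size_t blk = 4 << 20;  /* 4 MB, matching the IOR transfer size */
> > >     const int nblks = 256;       /* 1 GB total */
> > >     void *buf;
> > >     if (posix_memalign(&buf, 4096, blk)) /* O_DIRECT needs aligned buffers */
> > >         return 1;
> > >     memset(buf, 0xab, blk);
> > >     int fd = open("/tmp/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
> > >     if (fd < 0) { perror("open"); return 1; }
> > >     struct timespec t0, t1;
> > >     clock_gettime(CLOCK_MONOTONIC, &t0);
> > >     for (int i = 0; i < nblks; i++)
> > >         if (write(fd, buf, blk) != (ssize_t)blk) { perror("write"); return 1; }
> > >     fsync(fd);
> > >     clock_gettime(CLOCK_MONOTONIC, &t1);
> > >     double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
> > >     printf("%.1f MB/s\n", nblks * (blk / 1048576.0) / s);
> > >     close(fd);
> > >     return 0;
> > > }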

> > > I attach a set of graphs summarizing the experiments (the x axis
> > > is the iteration number and the y axis is the aggregate throughput
> > > obtained for that iteration; 100 consecutive iterations are
> > > performed).
> > >
> > > It seems that the performance follows the law D = a*T + b, where D
> > > is the duration of the write, T is the wallclock time since the
> > > beginning of the experiment, and a and b are constants. Since each
> > > iteration writes a fixed amount of data, this would mean the
> > > measured throughput decays roughly as 1/(a*T + b) over time.

> > > When I stop IOR and immediately restart it, I get the good
> > > performance back; the new instance does not continue at the
> > > reduced performance the previous one finished with.

> > > I also thought it could come from the fact that the same file is
> > > rewritten at every iteration, so I tried the multiFile=1 option to
> > > write one new file per iteration instead, but this didn't help.

> > > Last thing I can mention: I'm using MPICH 3.0.2, compiled with
> > > PVFS support.

> > > Matthieu

> > > ----- Original Message -----
> > >
> > > > From: "Rob Latham" <[email protected]>
> > > > To: "Matthieu Dorier" <[email protected]>
> > > > Cc: "pvfs2-users" <[email protected]>
> > > > Sent: Tuesday, April 2, 2013 15:57:54
> > > > Subject: Re: [Pvfs2-users] Strange performance behavior with IOR

> > > > On Sat, Mar 23, 2013 at 03:31:22PM +0100, Matthieu Dorier wrote:
> > > > > I've installed PVFS (OrangeFS 2.8.7) on a small cluster (2 PVFS
> > > > > nodes, 28 compute nodes of 24 cores each, everything connected
> > > > > through InfiniBand but using an IP stack on top of it, so the
> > > > > protocol for PVFS is TCP), and I witness some strange
> > > > > performance behavior with IOR (using ROMIO compiled against
> > > > > PVFS, no kernel support):
> > > >
> > > > > IOR is started on 336 processes (14 nodes), writing 4 MB per
> > > > > process into a single shared file using MPI-I/O (4 MB transfer
> > > > > size as well). It completes 100 iterations.
> > > >
> > > > OK, so you have one PVFS client per core, and all of them are
> > > > talking to two servers.
> > > >
> > > > > First, every time I start an instance of IOR, the first I/O
> > > > > operation is extremely slow. I'm guessing this is because
> > > > > ROMIO has to initialize everything, get the list of PVFS
> > > > > servers, etc. Is there a way to speed this up?
> > > >
> > > > ROMIO isn't doing a whole lot here, but there is one thing
> > > > different about ROMIO's first call vs. the Nth call: on the
> > > > first call (the first time any pvfs2 file is opened or deleted),
> > > > ROMIO calls the function PVFS_util_init_defaults().
> > > >
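> > > > Roughly, the pattern looks like this (a minimal sketch, not the
> > > > actual ROMIO code; the helper name and flag are made up, and only
> > > > PVFS_util_init_defaults() and PVFS_perror() are real PVFS calls):
> > > >
> > > > #include <pvfs2.h>  /* PVFS_util_init_defaults(), PVFS_perror() */
> > > >
> > > > static int pvfs_initialized = 0;  /* hypothetical once-only flag */
> > > >
> > > > /* Runs once before the first PVFS operation: this is where the
> > > >  * tab file is read and the server configuration is fetched,
> > > >  * which is the expensive part of the first I/O call. */
> > > > static int lazy_pvfs_init(void)
> > > > {
> > > >     if (!pvfs_initialized) {
> > > >         int ret = PVFS_util_init_defaults();
> > > >         if (ret < 0) {
> > > >             PVFS_perror("PVFS_util_init_defaults", ret);
> > > >             return ret;
> > > >         }
> > > >         pvfs_initialized = 1;
> > > >     }
> > > >     return 0;
> > > > }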
> > > > If you have 336 clients banging away on just two servers, I bet
> > > > that could explain some slowness. In the old days, the PVFS
> > > > server had to service these requests one at a time.
> > > >
> > > > I don't think this restriction has been relaxed. Since it is a
> > > > read-only operation, though, it sure seems like one could just
> > > > have the servers shovel out pvfs2 configuration information as
> > > > fast as possible.
> > > >
> > > > > Then, I set some delay between each iteration, to better
> > > > > reflect the behavior of an actual scientific application.
> > > >
> > > > Fun! This is kind of like what MADNESS does: it "computes" by
> > > > sleeping for a bit. I think Phil's questions will help us
> > > > understand the highly variable performance.
> > > >
> > > > Can you experiment with IOR's collective I/O? By default,
> > > > collective I/O will select one client per node as an "I/O
> > > > aggregator". The IOR workload will not benefit from ROMIO's
> > > > two-phase optimization, but you've got 336 clients banging away
> > > > on two servers. When I last studied PVFS scalability, 100x more
> > > > clients than servers wasn't a big deal, but 5-6 years ago nodes
> > > > did not have 24-way parallelism.
> > > >
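> > > > For example, here is a minimal sketch of steering the
> > > > aggregators through ROMIO hints (the values, like cb_nodes=14
> > > > for one aggregator per node, are illustrations rather than tuned
> > > > recommendations):
> > > >
> > > > #include <mpi.h>
> > > >
> > > > int main(int argc, char **argv)
> > > > {
> > > >     MPI_Info info;
> > > >     MPI_File fh;
> > > >     MPI_Init(&argc, &argv);
> > > >     MPI_Info_create(&info);
> > > >     /* one I/O aggregator per node: 14 nodes in this run */
> > > >     MPI_Info_set(info, "cb_nodes", "14");
> > > >     /* force ROMIO's two-phase collective buffering on writes */
> > > >     MPI_Info_set(info, "romio_cb_write", "enable");
> > > >     MPI_File_open(MPI_COMM_WORLD, "pvfs2:/mnt/pvfs2/testfileA",
> > > >                   MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
> > > >     /* ... MPI_File_write_all() calls, as IOR's collective mode does ... */
> > > >     MPI_File_close(&fh);
> > > >     MPI_Info_free(&info);
> > > >     MPI_Finalize();
> > > >     return 0;
> > > > }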
> > > > ==rob
> > > >
> > > > --
> > > > Rob Latham
> > > > Mathematics and Computer Science Division
> > > > Argonne National Lab, IL USA


> --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
