On Tue, Apr 02, 2013 at 04:47:08PM +0200, Matthieu Dorier wrote:
> Hi Rob and Phil,
>
> This thread moved to the ofs-support mailing list (probably because the
> first person to answer was part of that team), but I didn't get much of an
> answer to my problem, so I'll try to summarize here what I have done.
>
> First, to answer Phil: the PVFS config file is attached, and here is the
> script file used for IOR:
>
> IOR START
> testFile = pvfs2:/mnt/pvfs2/testfileA
> filePerProc=0
> api=MPIIO
> repetitions=100
> verbose=2
> blockSize=4m
> transferSize=4m
> collective=1
> writeFile=1
> interTestDelay=60
> readFile=0
> RUN
> IOR STOP
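For reference, this script asks for a single shared file (filePerProc=0)
written through MPI-IO with collective 4 MiB transfers (blockSize =
transferSize = 4m, collective=1). A minimal sketch of the call pattern that
configuration implies -- not IOR's actual source; the file name and sizes
come from the script above, and the layout is simplified to one transfer
per rank:

/* Sketch of the MPI-IO pattern implied by api=MPIIO, filePerProc=0,
 * collective=1: every rank writes one 4 MiB block into a single shared
 * file with a collective write.  Illustrative only -- not IOR source. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int xfer = 4 * 1024 * 1024;          /* transferSize = 4m */
    char *buf = malloc(xfer);
    memset(buf, 'x', xfer);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/mnt/pvfs2/testfileA",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* collective=1 -> the collective MPI_File_write_at_all (two-phase I/O);
     * collective=0 would use the independent MPI_File_write_at instead.  */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * xfer, buf, xfer, MPI_BYTE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}

In this sketch, disabling collective I/O amounts to swapping the single
write call for its independent counterpart; IOR's collective setting
controls the same choice.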
OH! You are already using collective I/O. Huh. Then I wonder what would
happen if you turned it off?

> I attach a set of graphs summarizing the experiments (the x axis is the
> iteration number and the y axis is the aggregate throughput obtained for
> that iteration; 100 consecutive iterations are performed).
>
> It seems that the performance follows the law D = a*T + b, where D is the
> duration of the write, T is the wallclock time since the beginning of the
> experiment, and "a" and "b" are constants.
>
> When I stop IOR and immediately restart it, I get the good performance
> back; it does not continue at the reduced performance the previous
> instance finished with.

Huh. Fascinating. That is probably why no one has observed this --
typically we benchmark PVFS by running multiple IOR instances.

So, what could be going on at the client? This smells like a select() or
poll() getting ever-larger arrays of file descriptors to manage. What else
on the client might grow or get slower over time?

Could the ever-larger array of file descriptors have something to do with
TCP over IB? I guess not, since you already experimented on a pure
Ethernet environment.

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
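To put the select()/poll() hunch in concrete terms: both calls scan their
entire descriptor array on every invocation, so if the client accumulates
descriptors (or any per-call list that only grows) at a steady rate, every
write pays a cost proportional to how long the process has been running --
exactly the D = a*T + b shape above. A toy sketch of that mechanism in
isolation (purely illustrative, nothing PVFS-specific; the descriptor
counts are arbitrary):

/* Toy illustration: poll() scans its whole descriptor array on every call,
 * so per-call cost grows linearly with the number of descriptors, even
 * when none of them ever becomes ready.  Not PVFS client code -- just the
 * effect in isolation. */
#include <poll.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_FDS 480   /* stays under a typical 1024 open-file limit */
#define STEP     60   /* descriptors added per "iteration"          */
#define CALLS 10000   /* poll() calls timed per measurement         */

int main(void)
{
    struct pollfd fds[MAX_FDS];
    int nfds = 0;

    while (nfds + STEP <= MAX_FDS) {
        /* Grow the descriptor set, the way a leaking client might. */
        for (int i = 0; i < STEP; i++) {
            int p[2];
            if (pipe(p) != 0) { perror("pipe"); return 1; }
            /* Leave the write end open but idle so the read end is never
             * readable: poll() must scan it every time for nothing. */
            fds[nfds].fd = p[0];
            fds[nfds].events = POLLIN;
            nfds++;
        }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < CALLS; i++)
            poll(fds, nfds, 0);     /* timeout 0: just scan and return */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e3;
        printf("%4d fds: %6.2f us per poll() call\n", nfds, us / CALLS);
    }
    return 0;
}

On a typical Linux box the printed per-call cost grows roughly linearly
with the descriptor count; a descriptor set growing steadily over the run
would turn that into a linear-in-wallclock slowdown.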
