Roger:

I have also seen codes that read/write one byte at a time, which is not appropriate for a parallel filesystem. Try this: while the user's process is running, attach to it with strace and see what kinds of reads/writes are being issued.
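For example, something like this, assuming the user's process has PID 12345 (a placeholder) and strace is installed on the compute node; -f follows child processes/threads and -T prints the time spent in each call:

    strace -f -T -e trace=read,write -p 12345

If most lines look like read(3, "...", 1) = 1, the code really is doing byte-at-a-time I/O; large counts (64KB and up) point elsewhere. Running strace -c -f -p 12345 instead, and stopping it with Ctrl-C after a minute or so, prints a per-syscall summary rather than the raw stream.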
Becky

On Mon, Dec 16, 2013 at 1:57 PM, Becky Ligon <[email protected]> wrote:

> Roger:
>
> Are all of your filesystem servers ALSO metadata servers?
>
> Becky
>
>
> On Mon, Dec 16, 2013 at 1:18 PM, Kyle Schochenmaier <[email protected]> wrote:
>
>> There are some tuning params that you can look into here. By default
>> there is round-robin loading on the servers, done in chunks of
>> FlowBufferSize (iirc?) [see the config sketch after the quoted thread].
>> You can set this in your config file, but by default the size is quite
>> small (64k); I've pushed it up to 1-2MB and seen drastic improvements
>> in bandwidth for larger requests. If you're doing tons of small
>> requests, though, this obviously won't help.
>>
>> Can you attach your config file so we can see how things are set up?
>>
>> Kyle Schochenmaier
>>
>>
>> On Mon, Dec 16, 2013 at 11:57 AM, Moye,Roger V <[email protected]> wrote:
>>
>>> Over the past weekend one of my users reported that his compute jobs
>>> running on a server with local disks usually take about 5 hours.
>>> However, running the same jobs on our small Linux cluster using a PVFS
>>> filesystem took more than 24 hours.
>>>
>>> Here is the environment we are using:
>>>
>>> 1. RHEL 6.4 on PVFS servers and clients.
>>> 2. Computations are performed on any of 16 Linux clients, all running RHEL 6.4.
>>> 3. We are running OrangeFS 2.8.7.
>>> 4. We have 4 PVFS servers, each with an XFS filesystem on a ~35TB RAID 6. The total PVFS filesystem is 146TB.
>>> 5. All components are connected via a 10GigE network.
>>>
>>> I started looking for the source of the problem. For the user(s)
>>> showing this poor performance, I found that pvfs2-client is using
>>> about 65% of the CPU while the compute jobs themselves are using only
>>> 4% each. Thus the compute nodes are very lightly loaded and the
>>> compute jobs are hardly doing anything. The pvfs2-server process on
>>> each PVFS server is using about 140% CPU. No time is being spent in
>>> the wait state, so I assume the speed of the disks is not an issue.
>>> While the system was exhibiting poor performance I tried to read/write
>>> some 10GB files myself and found the performance to be normal for this
>>> system (around 450MB/s). I used ‘iperf’ to measure the network
>>> bandwidth between the affected nodes and the PVFS servers and found it
>>> normal at 9.38Gb/s. The directories that the users are reading/writing
>>> only have a few files each.
>>>
>>> iostat shows that the disk system is being constantly read by
>>> something, as shown by ‘iostat -d 2’ on the PVFS servers:
>>>
>>> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>>> sda               0.00         0.00         0.00          0          0
>>> sdb              19.00      4864.00         0.00       9728          0
>>> dm-0              0.00         0.00         0.00          0          0
>>> dm-1              0.00         0.00         0.00          0          0
>>>
>>> This is what iostat has looked like over the last 48 hours (since Saturday).
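A note on the iostat numbers quoted just above: iostat's extended statistics also report the average request size, which bears directly on the "tiny reads" theory later in the quoted message. A minimal check, run on one of the PVFS servers (sdb is the data disk shown above):

    iostat -dx sdb 2

On RHEL 6 the avgrq-sz column is reported in 512-byte sectors, so a value around 512 means roughly 256KB requests, while single-digit values mean the disk is being hit with very small I/Os.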
>>> I cannot find any documentation on how to get stats directly from
>>> pvfs2, so I tried this command:
>>>
>>> pvfs2-statfs -m /pvfs2-mnt
>>>
>>> I received these results:
>>>
>>> I/O server statistics:
>>> ---------------------------------------
>>>
>>> server: tcp://dqspvfs01:3334
>>> RAM bytes total  : 33619419136
>>> RAM bytes free   : 284790784
>>> uptime (seconds) : 14499577
>>> load averages    : 0 0 0
>>> handles available: 2305843009213589192
>>> handles total    : 2305843009213693950
>>> bytes available  : 31456490479616
>>> bytes total      : 40000112558080
>>> mode: serving both metadata and I/O data
>>>
>>> server: tcp://dqspvfs02:3334
>>> RAM bytes total  : 33619419136
>>> RAM bytes free   : 217452544
>>> uptime (seconds) : 14499840
>>> load averages    : 0 0 0
>>> handles available: 2305843009213589104
>>> handles total    : 2305843009213693950
>>> bytes available  : 31456971476992
>>> bytes total      : 40000112558080
>>> mode: serving both metadata and I/O data
>>>
>>> server: tcp://dqspvfs03:3334
>>> RAM bytes total  : 33619419136
>>> RAM bytes free   : 428965888
>>> uptime (seconds) : 5437269
>>> load averages    : 320 192 0
>>> handles available: 2305843009213588929
>>> handles total    : 2305843009213693950
>>> bytes available  : 31439132123136
>>> bytes total      : 40000112558080
>>> mode: serving both metadata and I/O data
>>>
>>> server: tcp://dqspvfs04:3334
>>> RAM bytes total  : 33619419136
>>> RAM bytes free   : 223281152
>>> uptime (seconds) : 10089825
>>> load averages    : 1664 3072 0
>>> handles available: 2305843009213588989
>>> handles total    : 2305843009213693950
>>> bytes available  : 31452933193728
>>> bytes total      : 40000112558080
>>> mode: serving both metadata and I/O data
>>>
>>> Notice that the ‘load averages’ are 0 for servers #1 and #2 but not for
>>> #3 and #4. Earlier this morning only #4 showed a non-zero load average;
>>> the other three were 0. What does this number mean?
>>>
>>> My two theories about the source of the problem are:
>>>
>>> 1. Someone is doing ‘a lot’ of tiny reads.
>>> 2. Or, based on the load averages, the PVFS filesystem is somehow not
>>>    balanced and all of the load is on a single server.
>>>
>>> How can I prove either of these? Or what other types of diagnostics
>>> can I do?
>>>
>>> Thank you!
>>> -Roger
>>>
>>> -----------------------------------------------
>>> Roger V. Moye
>>> Systems Analyst III
>>> XSEDE Campus Champion
>>> University of Texas - MD Anderson Cancer Center
>>> Division of Quantitative Sciences
>>> Pickens Academic Tower - FCT4.6109
>>> Houston, Texas
>>> (713) 792-2134
>>> -----------------------------------------------------------
>
> --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina

--
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
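On Kyle's FlowBufferSize suggestion in the quoted thread above, here is a minimal sketch of the kind of change he seems to mean, assuming the option is spelled FlowBufferSizeBytes in OrangeFS 2.8.x. Kyle was recalling the name from memory, so the exact spelling and the section it belongs in should be verified against your version's documentation before touching the live config; the filesystem name and the "..." below are placeholders:

    <FileSystem>
        Name pvfs2-fs
        ...
        # Assumed option name/placement: size of each flow buffer in bytes
        # (2MB here versus the much smaller default), per Kyle's 1-2MB suggestion.
        FlowBufferSizeBytes 2097152
    </FileSystem>

The servers would most likely need to be restarted to pick up the change, and posting the actual config file, as Kyle asks, is still the sensible first step.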
_______________________________________________ Pvfs2-users mailing list [email protected] http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
