Over the past weekend one of my users reported that his compute jobs, which
usually take about 5 hours on a server with local disks, exceeded 24 hours when
run on our small Linux cluster using a PVFS filesystem.
Here is the environment we are using:
1. RHEL 6.4 on PVFS servers and clients.
2. Computations are performed on any of 16 Linux clients, all running
RHEL 6.4.
3. We are running OrangeFS 2.8.7.
4. We have 4 PVFS servers, each with an XFS filesystem on a ~35TB RAID 6.
Total PVFS filesystem is 146TB.
5. All components are connected via a 10GigE network.
I started looking for the source of the problem. For the user(s) showing this
poor performance, I found that pvfs2-client is using about 65% of the CPU while
the compute jobs themselves are using only 4% each. Thus the compute nodes are
very lightly loaded and the compute jobs are hardly doing anything. The
pvfs2-server process on each PVFS server is using about 140% CPU. No time is
being spent in the I/O wait state (so I assume the speed of the disks is not
the issue). While the system was exhibiting poor performance I read and wrote
some 10GB files myself and found the throughput to be normal for this system
(around 450MB/s). I used 'iperf' to measure the network bandwidth between the
affected nodes and the PVFS servers and found it normal at 9.38Gb/s. The
directories the users are reading and writing contain only a few files each.
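For reference, the file I/O and network checks were along these lines (the test
file name is just a placeholder):

   dd if=/dev/zero of=/pvfs2-mnt/ddtest bs=1M count=10240   # write a ~10GB file
   dd if=/pvfs2-mnt/ddtest of=/dev/null bs=1M               # read it back
   iperf -s                         # on one of the PVFS servers
   iperf -c dqspvfs01 -t 30         # from one of the affected clients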
iostat shows that the disk system is constantly being read by something, as
'iostat -d 2' on the PVFS servers shows:
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 0.00 0.00 0.00 0 0
sdb 19.00 4864.00 0.00 9728 0
dm-0 0.00 0.00 0.00 0 0
dm-1 0.00 0.00 0.00 0 0
This is what iostat has looked like for the last 48 hours (since Saturday).
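If those constant reads are in fact many tiny requests, the extended iostat
output should show it: avgrq-sz is the average request size (in 512-byte
sectors) and await/%util show how busy the device is. For example, on each
server (sdb is the RAID device here):

   iostat -dxk 2 sdb

A small avgrq-sz combined with a high tps would suggest many small reads.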
I cannot find any documentation on how to get statistics directly from PVFS2,
so I tried this command:
pvfs2-statfs -m /pvfs2-mnt
I received these results:
I/O server statistics:
---------------------------------------
server: tcp://dqspvfs01:3334
RAM bytes total : 33619419136
RAM bytes free : 284790784
uptime (seconds) : 14499577
load averages : 0 0 0
handles available: 2305843009213589192
handles total : 2305843009213693950
bytes available : 31456490479616
bytes total : 40000112558080
mode: serving both metadata and I/O data
server: tcp://dqspvfs02:3334
RAM bytes total : 33619419136
RAM bytes free : 217452544
uptime (seconds) : 14499840
load averages : 0 0 0
handles available: 2305843009213589104
handles total : 2305843009213693950
bytes available : 31456971476992
bytes total : 40000112558080
mode: serving both metadata and I/O data
server: tcp://dqspvfs03:3334
RAM bytes total : 33619419136
RAM bytes free : 428965888
uptime (seconds) : 5437269
load averages : 320 192 0
handles available: 2305843009213588929
handles total : 2305843009213693950
bytes available : 31439132123136
bytes total : 40000112558080
mode: serving both metadata and I/O data
server: tcp://dqspvfs04:3334
RAM bytes total : 33619419136
RAM bytes free : 223281152
uptime (seconds) : 10089825
load averages : 1664 3072 0
handles available: 2305843009213588989
handles total : 2305843009213693950
bytes available : 31452933193728
bytes total : 40000112558080
mode: serving both metadata and I/O data
Notice that the 'load averages' are 0 for servers #1 and #2 but not #3 and #4.
Earlier this morning only #4 showed a non-zero load average. The other three
were 0. What does this number mean?
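Aside from pvfs2-statfs, the only other diagnostic tool I am aware of is
pvfs2-ping, which as far as I can tell only verifies the configuration and that
each server responds, rather than reporting any load figures:

   pvfs2-ping -m /pvfs2-mnt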
My two theories about the source of the problem are:
1. Someone is doing 'a lot' of tiny reads.
2. Or, based on the load averages, the PVFS filesystem is somehow not
balanced and all of the load is landing on a single server.
How can I prove either of these? Or what other types of diagnostics can I run?
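What I have in mind for theory #1, unless someone has a better suggestion, is
attaching strace to one of the running compute jobs and looking at the size and
frequency of its reads (the PID below is a placeholder):

   strace -c -f -p <compute job PID>                        # per-syscall counts and times
   strace -f -T -e trace=read,write -p <compute job PID>    # individual call sizes

For theory #2, if I understand the tool correctly, pvfs2-viewdist should show
how a given file is striped across the four servers (again, the path is just an
example):

   pvfs2-viewdist -f /pvfs2-mnt/some/user/file

but I would appreciate confirmation that these are the right approaches.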
Thank you!
-Roger
-----------------------------------------------
Roger V. Moye
Systems Analyst III
XSEDE Campus Champion
University of Texas - MD Anderson Cancer Center
Division of Quantitative Sciences
Pickens Academic Tower - FCT4.6109
Houston, Texas
(713) 792-2134
-----------------------------------------------------------