On May 13, 2008, at 4:22 PM, belcampo wrote:
Rob Ross wrote:
Hi Henk,
Please be sure to CC pvfs2-users on future emails.
Sorry, stupid of me; I didn't do that.
Easy to do; no worries.
Without any additional information, my guess is that every
application you're using in this workflow performs very small I/Os.
These operations are passed into the kernel, back out to pvfs2-client,
across the network, and received by the PVFS server, which
then performs I/O on your application's behalf. If operations are
particularly small, this can be a lot of overhead.
Top tells me that the server side is 99.6% idle and the client side is 95% idle.
How could I determine what is causing the abnormal delays?
Starting to play a DVD takes about 12 secs. After a few seconds it
starts stuttering.
Over NFS from 1 of the servers it takes about 1 sec to start and never
stutters.
Heh well that's a little different -- that's a read workload. The NFS
client is reading ahead.
Have a look at this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00074000000000000000
and this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00077000000000000000
Also this email and specifically the immutable option; you could set
this on your files after you are done ripping and encoding:
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-September/002688.html
You'd probably want to use the pvfs2-xattr utility to set the
attribute so you don't have to sudo it.
Other networked file systems can hide some of this latency by
caching data (either coherently or not) on the client. PVFS does
not do this, so each little operation goes across the wire.
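To put rough numbers on why uncached small operations hurt, here is a back-of-the-envelope sketch in Python. It assumes every I/O waits for one full round trip before the next is issued; the function name and the 0.5 ms round-trip figure are illustrative assumptions, not measurements from this system.

```python
def effective_throughput(io_size_bytes, round_trip_s, link_bytes_per_s):
    """Bytes per second when each I/O costs one round trip plus its transfer time."""
    time_per_io = round_trip_s + io_size_bytes / link_bytes_per_s
    return io_size_bytes / time_per_io


if __name__ == "__main__":
    LINK = 125_000_000   # 1 Gb/s expressed in bytes per second
    RTT = 0.0005         # ASSUMED 0.5 ms per request; measure your own

    for size in (2 * 1024, 64 * 1024, 4 * 1024 * 1024):
        mb_s = effective_throughput(size, RTT, LINK) / 1e6
        print(f"{size:>8} byte I/Os: about {mb_s:.1f} MB/s")
```

With a fixed per-request cost, throughput is dominated by the round trip until the I/O size gets large, which is the gap that client-side caching or read-ahead hides.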
Can this be investigated with some network tool and, if yes, how?
There's really no advantage to using a parallel file system for the
workload you have described,
But should the disadvantage be of this order of magnitude?
Apparently :). You could strace the app to see how big/small the IOs
are. Some apps have options for block sizes for IO that can be used to
improve performance. Also, there's no reason to bother with striping
files in this case, since you're accessing serially. You should set
the number of datafiles (objects holding data) to 1 on the
directory you're storing into:
setfattr -n "user.pvfs2.num_dfiles" -v "1" /mnt/pvfs2/directory
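As a follow-up to the strace suggestion, here is a minimal Python sketch that summarizes I/O sizes from a saved trace. It assumes the trace was captured with something like `strace -f -e trace=read,write -o trace.txt <app>` and that lines have the default form `read(3, "..."..., 4096) = 4096`; the function names are hypothetical, and lines with extra suffixes (for example timing from `-T`) are simply skipped.

```python
import re
import sys
from collections import Counter

# Matches e.g.:  read(3, "\177ELF"..., 832) = 832
# A leading PID (as written by strace -f) is tolerated.
IO_CALL = re.compile(r"\b(read|write|pread64|pwrite64)\((\d+),.*\)\s+=\s+(\d+)\s*$")


def summarize_io(lines):
    """Return {syscall_name: Counter({bytes_transferred: number_of_calls})}."""
    sizes = {}
    for line in lines:
        m = IO_CALL.search(line)
        if m is None:
            continue  # failed calls (= -1 ...), unfinished calls, other syscalls
        name, nbytes = m.group(1), int(m.group(3))
        sizes.setdefault(name, Counter())[nbytes] += 1
    return sizes


def report(sizes):
    """Format one summary line per syscall name."""
    out = []
    for name in sorted(sizes):
        counter = sizes[name]
        calls = sum(counter.values())
        total = sum(size * n for size, n in counter.items())
        out.append(f"{name}: {calls} calls, {total} bytes, avg {total / calls:.0f} bytes/call")
    return out


if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        for summary_line in report(summarize_io(f)):
            print(summary_line)
```

A small average bytes/call (a few KB or less) would point at the per-operation overhead described earlier in the thread.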
unless you're planning on having a lot of systems doing this
process in parallel and want a single place to store the output.
What sort of network do you have in this system? What sort of nodes
are you using for the PVFS servers?
All AMD 4000+ systems with 1Gb network cards and a 320GB disk in each
of them.
Copying from clients to these 3 servers is > 100MB/sec, pretty
close to what Gb ethernet can do.
In what context do you get that performance? How do tools like
pvfs2-cp compare in performance?
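For a comparable number from the mounted volume, a rough Python sketch like the following times a sequential read at several block sizes. The function name and the block sizes are assumptions chosen for illustration; on file systems that cache data on the client, repeated runs over the same file would mostly measure the cache.

```python
import sys
import time


def timed_read(path, block_size):
    """Read path sequentially in block_size chunks; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 keeps Python from merging requests, so each read() is one syscall
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    path = sys.argv[1]
    for block_size in (4 * 1024, 64 * 1024, 1024 * 1024):
        nbytes, seconds = timed_read(path, block_size)
        rate = nbytes / seconds / 1e6 if seconds > 0 else float("inf")
        print(f"block {block_size:>8}: {nbytes} bytes in {seconds:.2f} s ({rate:.1f} MB/s)")
```

Run it against a file under the PVFS mount point, then against the same file over NFS, to see how much of the difference comes from request size.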
Rob