Michael Will wrote:
I had a few hours to play on a cluster with 4 TB I/O nodes.
A straight dd read/write of a single large file, done locally on a
software RAID0 across the two 6-drive RAID5 volumes in each I/O node,
gave 345 MB/s read and 205 MB/s write throughput.
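(For reference, a local dd test of that kind usually looks something
like the sketch below; the mount point and sizes are placeholders, and
the test file should be larger than RAM so the read number is not just
the page cache.)

    # write a large test file to the raid0 volume, then read it back
    dd if=/dev/zero of=/mnt/raid0/ddtest bs=1M count=16384
    sync
    dd if=/mnt/raid0/ddtest of=/dev/null bs=1M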
PVFS2 with a single client and a single server over a single gigabit
ethernet link came in at 84 MB/s read and 77 MB/s write.
I then set it up with 8 I/O nodes and 8 clients; the resulting PVFS2
filesystem was 35 TB. However, during my benchmarking runs I got these
errors:
pvfs2: pvfs2_get_sb -- wait timed out; aborting attempt.
pvfs2_get_sb: mount request failed with -110
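(Error -110 is -ETIMEDOUT, i.e. the kernel module's mount request timed
out before it got an answer. The mount being attempted is of the usual
PVFS2 form, sketched below; the hostname, port, and fs name are
placeholders rather than the actual values from this cluster.)

    # typical PVFS2 kernel mount; ionode1, 3334 and pvfs2-fs are examples
    mount -t pvfs2 tcp://ionode1:3334/pvfs2-fs /mnt/pvfs2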
/var/log/messages:Feb 9 05:59:27 10.54.1.100 n100 pvfs2-server[8620]:
segfault at 0000000000000010 rip 0000003b56e6960d rsp 0000007fbffff160
error 6
/var/log/messages:Feb 9 05:59:38 10.54.1.117 .117 pvfs2_file_read:
error in vectored read from handle 1048571, FILE: largefile..117.1
/var/log/messages:Feb 9 05:59:38 10.54.1.117 .117 pvfs2_file_read:
error in vectored read from handle 1048571, FILE: largefile..117.1
/var/log/messages:Feb 9 05:59:38 10.54.1.111 .111 pvfs2_file_read:
error in vectored read from handle 1048570, FILE: largefile..111.1
/var/log/messages:Feb 9 05:59:38 10.54.1.107 .107 pvfs2_file_read:
error in vectored read from handle 1048579, FILE: largefile..107.1
/var/log/messages:Feb 9 05:59:38 10.54.1.111 .111 pvfs2_file_read:
error in vectored read from handle 1048570, FILE: largefile..111.1
/var/log/messages:Feb 9 05:59:38 10.54.1.107 .107 pvfs2_file_read:
error in vectored read from handle 1048579, FILE: largefile..107.1
/var/log/messages:Feb 9 05:59:38 10.54.1.119 .119 pvfs2_file_read:
error in vectored read from handle 1048574, FILE: largefile..119.1
/var/log/messages:Feb 9 05:59:38 10.54.1.106 .106 pvfs2_file_write:
error in vectored write to handle 1048581, FILE: largefile..106.1
/var/log/messages:Feb 9 05:59:38 10.54.1.104 .104 pvfs2_file_read:
error in vectored read from handle 1048580, FILE: largefile..104.1
/var/log/messages:Feb 9 05:59:38 10.54.1.104 .104 pvfs2_file_read:
error in vectored read from handle 1048580, FILE: largefile..104.1
/var/log/messages:Feb 9 05:59:38 10.54.1.103 .103 pvfs2_file_read:
error in vectored read from handle 1048573, FILE: largefile..103.1
/var/log/messages:Feb 9 05:59:38 10.54.1.103 .103 pvfs2_file_read:
error in vectored read from handle 1048573, FILE: largefile..103.1
/var/log/messages:Feb 9 05:59:38 10.54.1.109 .109 pvfs2_file_read:
error in vectored read from handle 1048572, FILE: largefile..109.1
/var/log/messages:Feb 9 05:59:38 10.54.1.109 .109 pvfs2_file_read:
error in vectored read from handle 1048572, FILE: largefile..109.1
/var/log/messages:Feb 9 05:59:38 10.54.1.106 .106 pvfs2_file_write:
error in vectored write to handle 1048581, FILE: largefile..106.1
/var/log/messages:Feb 9 06:08:18 10.54.1.118 .118 pvfs2:
pvfs2_fs_umount -- wait timed out; aborting attempt.
Unfortunately I did not get to play with this any more, since these
were customers' systems that needed to be cleaned up and shipped, so I
cannot do any additional troubleshooting or find out why pvfs2-server
died with a segfault on node n100. It could have to do with the naming
scheme on the cluster (node 100 has the hostname .100 and an alias
n100, n101 is .101, etc.).
The size of the filesystem should not have been an issue, right?
Michael
Hi Michael,
The size of the file system is fine, and the naming scheme should be OK
too. Unfortunately I don't think there is much of a way to tell what
happened to the server in this case. If you see this again in the
future, you may need to either get a stack trace from a server core
file or turn on verbose event logging in the server configuration to
see what happened.
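(A rough sketch of both approaches; the binary and config file paths,
and the exact EventLogging mask keywords, are assumptions that may
differ on your installation.)

    # 1) allow pvfs2-server to dump core, then pull a backtrace from it
    ulimit -c unlimited
    /usr/sbin/pvfs2-server /etc/pvfs2/fs.conf /etc/pvfs2/server.conf-n100  # example paths
    # ... after a crash ...
    gdb /usr/sbin/pvfs2-server core
    (gdb) bt

    # 2) raise the logging level in the fs config (Defaults section),
    #    e.g. replace "EventLogging none" with a more verbose mask:
    # EventLogging all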
-Phil
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users