It appears that once the error described below occurs on a client, it spreads
to the other PVFS2 file systems mounted there and persists until the client
software is reloaded.
I've got a client with two pvfs2 file systems mounted:
/mnt/pvfs2
/mnt/pvfs2-tmp
Both PVFS2 file system configurations contained the following when mounted:
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
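For a rough sense of what the client-side values above could add up to, here is a small sketch. It assumes (I have not confirmed this against the PVFS2 source) that a failed request gets one initial attempt plus up to ClientRetryLimit retries, with a fixed ClientRetryDelayMilliSecs pause before each retry. The function name retry_budget_secs is made up for this illustration, and the pvfs2.op-timeout-secs sysctl is not modeled at all.

```python
# Values copied from the configuration above.
CLIENT_JOB_BMI_TIMEOUT_SECS = 300
CLIENT_RETRY_LIMIT = 5
CLIENT_RETRY_DELAY_MS = 2000


def retry_budget_secs(per_attempt_secs: float) -> float:
    """Upper bound on the time spent before the client gives up.

    ASSUMPTION: one initial attempt plus CLIENT_RETRY_LIMIT retries, with a
    fixed CLIENT_RETRY_DELAY_MS pause before each retry.
    """
    attempts = CLIENT_RETRY_LIMIT + 1
    pauses = CLIENT_RETRY_LIMIT * CLIENT_RETRY_DELAY_MS / 1000.0
    return attempts * per_attempt_secs + pauses


# Fast failure (an attempt that fails immediately): only the pauses count.
fast_fail = retry_budget_secs(0)
# Slow failure: every attempt waits out the full BMI timeout.
slow_fail = retry_budget_secs(CLIENT_JOB_BMI_TIMEOUT_SECS)
print(f"fast-fail budget: {fast_fail:.0f} s, slow-fail budget: {slow_fail:.0f} s")
```

Under those assumptions the two cases differ by about half an hour, which is why it matters whether a dead server produces an immediate refusal or a timeout.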
I dynamically changed the client's timeout setting after the mounts:
[EMAIL PROTECTED] root]# /sbin/sysctl -w pvfs2.op-timeout-secs=5
A pvfs2 server node serving the /mnt/pvfs2 file system lost power. When I then
ran "df -h /mnt/pvfs2", the client received a "Connection timed out" error:
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2
Filesystem Size Used Avail Use% Mounted on
df: `/mnt/pvfs2': Connection timed out
An immediately following "df -h /mnt/pvfs2-tmp" also returned "Connection
timed out":
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
Unmounting the /mnt/pvfs2 file system worked fine:
[EMAIL PROTECTED] root]# umount /mnt/pvfs2
Another subsequent "df -h /mnt/pvfs2-tmp" still returned "Connection timed
out":
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
After unloading the userspace client and kernel module, restarting the pvfs2
software, and remounting the /mnt/pvfs2-tmp file system, a
"df -h /mnt/pvfs2-tmp" completed successfully:
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
Filesystem Size Used Avail Use% Mounted on
hostname:3334/pvfs2-fs
1.9T 381G 1.6T 20% /mnt/pvfs2-tmp
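To make the sequence above easier to repeat, the following is a minimal sketch of a probe that issues one statvfs() call per mount point (the same kind of call df makes) in a child process with a deadline. The name probe_mount and the 10 second default are my own choices, not part of PVFS2. A call that returns an error such as "Connection timed out" is reported as "error"; "timeout" only means the child did not return before the deadline. A child stuck in an uninterruptible wait may not be killable, in which case the probe itself could block.

```python
import subprocess
import sys

# Child-process body: a single statvfs() call on the path given as argv[1].
_STATVFS_SNIPPET = "import os, sys; os.statvfs(sys.argv[1])"


def probe_mount(path: str, timeout_secs: float = 10.0) -> str:
    """Return 'ok', 'error', or 'timeout' for one statvfs() on path."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _STATVFS_SNIPPET, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # The child was killed because it exceeded the deadline.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


if __name__ == "__main__":
    # Mount points from this report; pass others on the command line.
    for mount in sys.argv[1:] or ["/mnt/pvfs2", "/mnt/pvfs2-tmp"]:
        print(f"{mount}: {probe_mount(mount)}")
```

Running it before and after the server loses power, and again after the unmount, would show whether /mnt/pvfs2-tmp changes state at the same moment as /mnt/pvfs2.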
The pvfs2 client log contained:
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server tcp://hvcwydev0329:3334 failed: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:29] Statfs failed: Connection refused
[E 02/22 11:36] msgpair failed, will retry:: Operation cancelled (possibly due to timeout)
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
[E 02/22 11:42] msgpair failed, will retry:: Connection timed out
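The shift from "Connection refused" to "Connection timed out" is easier to see when the retry entries are tallied per minute. Below is a minimal sketch that does this; the regular expression is inferred only from the lines shown in this log, the name tally_retries is hypothetical, and it assumes each log entry sits on a single line (entries wrapped by a mail client would be cut short).

```python
import re
from collections import Counter

# Matches lines such as
# "[E 02/22 11:28] msgpair failed, will retry:: Connection refused".
RETRY_LINE = re.compile(
    r"^\[E (?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2})\] "
    r"msgpair failed, will retry:: (?P<reason>.+)$"
)


def tally_retries(log_text: str) -> Counter:
    """Count 'will retry' entries per (HH:MM, reason) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RETRY_LINE.match(line.strip())
        if match:
            counts[(match["time"], match["reason"])] += 1
    return counts


# A shortened excerpt of the log above, used only as sample input.
SAMPLE = """\
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
"""

for (minute, reason), n in sorted(tally_retries(SAMPLE).items()):
    print(f"{minute}  {n}x  {reason}")
```

Lines that are not retry entries, such as "*** Out of retries.", are skipped, so the tally reflects only the retries themselves.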