Hi David,
I tried to reproduce your results with the 2.6 kernel, and wasn't
able to. Are you using 2.4? Also, I didn't actually pull the plug
on one of the nodes, I just killed the server, but that should be
close enough to your test case unless you're routing stuff through
that node ;-).
-sam
On Feb 22, 2006, at 12:16 PM, David Metheny wrote:
It appears that the error described below, once encountered, spreads
to the other mounted file systems on a client and persists until the
client software is reloaded.
I've got a client with 2 pvfs2 file systems mounted:
/mnt/pvfs2
/mnt/pvfs2-tmp
Both PVFS2 file system configurations contained the following when
mounted:
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
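As a rough sanity check on these values, here is a minimal Python sketch that parses the options above and estimates how long the client spends waiting between retries. It assumes (this is not confirmed anywhere in this thread) that the client sleeps ClientRetryDelayMilliSecs between each of ClientRetryLimit retries; the function names are made up for illustration, and per-attempt timeouts are not counted.

```python
# The option values are copied from the configuration listed above.
CONFIG = """\
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
"""


def parse_options(text):
    """Parse 'Name value' lines into a dict of integer settings."""
    opts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            opts[parts[0]] = int(parts[1])
    return opts


def retry_delay_secs(opts):
    """Estimate the total time spent sleeping between retries, in seconds.

    ASSUMPTION: one sleep of ClientRetryDelayMilliSecs per retry, with
    ClientRetryLimit retries in total. Time spent waiting for each
    attempt to fail is ignored.
    """
    return opts["ClientRetryLimit"] * opts["ClientRetryDelayMilliSecs"] / 1000.0


if __name__ == "__main__":
    print(retry_delay_secs(parse_options(CONFIG)))
```

Under that assumption the retry delays alone add up to about 10 seconds per failed operation.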
I dynamically changed the client's timeout setting after the
mounts:
[EMAIL PROTECTED] root]# /sbin/sysctl -w pvfs2.op-timeout-secs=5
A pvfs2 server node serving the /mnt/pvfs2 file system lost power.
After I issued a "df -h /mnt/pvfs2", the client received a
"Connection timed out" error.
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2
Filesystem Size Used Avail Use% Mounted on
df: `/mnt/pvfs2': Connection timed out
An immediately subsequent "df -h /mnt/pvfs2-tmp" also returned
"Connection timed out":
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
Unmounting the /mnt/pvfs2 share works fine.
[EMAIL PROTECTED] root]# umount /mnt/pvfs2
Another subsequent "df -h /mnt/pvfs2-tmp" still returns
"Connection timed out":
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
After unloading the userspace component and kernel module, restarting
the pvfs2 software, and remounting the /mnt/pvfs2-tmp file system, a
"df -h /mnt/pvfs2-tmp" completed successfully:
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
Filesystem Size Used Avail Use% Mounted on
hostname:3334/pvfs2-fs
1.9T 381G 1.6T 20% /mnt/pvfs2-tmp
The pvfs2 client log contained:
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server tcp://hvcwydev0329:3334 failed: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:29] Statfs failed: Connection refused
[E 02/22 11:36] msgpair failed, will retry:: Operation cancelled (possibly due to timeout)
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
[E 02/22 11:42] msgpair failed, will retry:: Connection timed out
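To make the shift from "Connection refused" to "Connection timed out" easier to see in a longer log, here is a small Python sketch that tallies the "will retry" messages by error string. The regular expression is based only on the line format shown in this excerpt, and the function name is hypothetical; other pvfs2 versions may format their log lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   [E 02/22 11:28] msgpair failed, will retry:: Connection refused
# ASSUMPTION: the format is inferred from the excerpt above only.
RETRY_RE = re.compile(
    r"^\[E (\d\d/\d\d \d\d:\d\d)\] msgpair failed, will retry:: (.+)$"
)


def tally_retries(log_text):
    """Count 'msgpair failed, will retry' messages per error string."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RETRY_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


# A few lines copied from the client log above.
SAMPLE = """\
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:29] Statfs failed: Connection refused
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
"""

if __name__ == "__main__":
    for reason, count in tally_retries(SAMPLE).items():
        print(f"{count:3d}  {reason}")
```

Lines that do not match the retry pattern, such as "Out of retries" and "Statfs failed", are skipped rather than counted.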
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers