Hi David,

I tried to reproduce your results with the 2.6 kernel but wasn't able to. Are you using 2.4? Also, I didn't actually pull the plug on one of the nodes; I just killed the server. That should be close enough to your test case unless you're routing stuff through that node ;-).

-sam

On Feb 22, 2006, at 12:16 PM, David Metheny wrote:

It appears that the error described below, once encountered, also affects the other file systems mounted on the same client until the client software is reloaded.


I've got a client with 2 pvfs2 file systems mounted:

        /mnt/pvfs2
        /mnt/pvfs2-tmp

Both PVFS2 file system configurations contained the following when mounted:
        ServerJobBMITimeoutSecs 30
        ServerJobFlowTimeoutSecs 30
        ClientJobBMITimeoutSecs 300
        ClientJobFlowTimeoutSecs 300
        ClientRetryLimit 5
        ClientRetryDelayMilliSecs 2000
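
As a rough way to reason about how long a client operation could block with the settings above, here is a minimal Python sketch. It rests on assumptions that are not confirmed anywhere in this thread: that each attempt can take up to the client BMI timeout, that ClientRetryLimit counts retries after one initial attempt, and that ClientRetryDelayMilliSecs is waited between attempts. The function name and the count_initial_attempt flag are hypothetical and exist only for illustration.

```python
def worst_case_wait_secs(timeout_secs, retry_limit, retry_delay_ms,
                         count_initial_attempt=True):
    """Upper bound on blocking time under the assumed retry model.

    Assumption (not taken from the PVFS2 sources): every attempt may use
    the full timeout, and the delay is applied between attempts only.
    """
    # Total attempts: the initial try plus the retries, or retries only.
    attempts = retry_limit + 1 if count_initial_attempt else retry_limit
    # One delay between each pair of consecutive attempts.
    delays = max(attempts - 1, 0)
    return attempts * timeout_secs + delays * retry_delay_ms / 1000.0


# Values from the configuration quoted above.
bound = worst_case_wait_secs(timeout_secs=300, retry_limit=5,
                             retry_delay_ms=2000)
print(f"assumed worst case: {bound:.0f} s")
```

Under this model the bound is dominated by the 300-second client timeout rather than by the retry delay, which may be worth keeping in mind when comparing it with the 5-second op-timeout-secs value set through sysctl.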

I've dynamically changed the client's timeout setting after the mounts:
        [EMAIL PROTECTED] root]# /sbin/sysctl -w pvfs2.op-timeout-secs=5

A pvfs2 server node serving the /mnt/pvfs2 file system lost power. After issuing a "df -h /mnt/pvfs2", the client received a "connection timed out" error:

        [EMAIL PROTECTED] root]# df -h /mnt/pvfs2
        Filesystem            Size  Used Avail Use% Mounted on
        df: `/mnt/pvfs2': Connection timed out

An immediately subsequent "df -h /mnt/pvfs2-tmp" also returned "connection timed out":
        [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
        df: `/mnt/pvfs2-tmp': Connection timed out

Unmounting /mnt/pvfs2 works fine:
        [EMAIL PROTECTED] root]# umount /mnt/pvfs2

Another subsequent "df -h /mnt/pvfs2-tmp" still returns "connection timed out":
        [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
        df: `/mnt/pvfs2-tmp': Connection timed out

After unloading the userspace software and the kernel module, restarting the pvfs2 software, and remounting the /mnt/pvfs2-tmp file system, a "df -h /mnt/pvfs2-tmp" completed successfully:
        [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
        Filesystem            Size  Used Avail Use% Mounted on
        hostname:3334/pvfs2-fs
                              1.9T  381G  1.6T  20% /mnt/pvfs2-tmp


The pvfs2 client log contained:
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server tcp://hvcwydev0329:3334 failed: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:29] Statfs failed: Connection refused
[E 02/22 11:36] msgpair failed, will retry:: Operation cancelled (possibly due to timeout)
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
[E 02/22 11:42] msgpair failed, will retry:: Connection timed out
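
To summarize a client log like the one above, a small Python sketch can tally the reasons given on the "will retry" lines. The regular expression is an assumption based only on the lines shown here (with wrapped lines joined), not on any documented PVFS2 log format, and tally_retry_reasons is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt above; treat it as an assumption.
RETRY_RE = re.compile(
    r"^\[E \d\d/\d\d \d\d:\d\d\] msgpair failed, will retry:: (.+)$"
)


def tally_retry_reasons(lines):
    """Count how often each failure reason appears on 'will retry' lines."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# The log excerpt from this message, with wrapped lines joined.
sample_log = [
    "[E 02/22 11:28] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:28] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:28] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:29] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:29] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:29] msgpair failed, will retry:: Connection refused",
    "[E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server "
    "tcp://hvcwydev0329:3334 failed: Connection refused",
    "[E 02/22 11:29] *** Out of retries.",
    "[E 02/22 11:29] Statfs failed: Connection refused",
    "[E 02/22 11:36] msgpair failed, will retry:: Operation cancelled "
    "(possibly due to timeout)",
    "[E 02/22 11:39] msgpair failed, will retry:: Connection timed out",
    "[E 02/22 11:42] msgpair failed, will retry:: Connection timed out",
]

for reason, count in tally_retry_reasons(sample_log).items():
    print(f"{count:2d}  {reason}")
```

Grouping by reason makes the shift from "Connection refused" to the timeout-style failures easier to see at a glance.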

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

