Hi David,
Just to clarify your setup: are you running two separate sets of
metadata and I/O servers for the two mountpoints, so that /mnt/pvfs2
is served by one set of nodes and /mnt/pvfs2-tmp by a completely
separate set? And when a server in one filesystem dies, the other
filesystem becomes unresponsive as well? What does pvfs2-ping tell
you about the status of the servers for /mnt/pvfs2-tmp?
-sam
On Feb 22, 2006, at 12:16 PM, David Metheny wrote:
It appears the error described below spreads to the other mounted file
systems on a client once it is encountered, and persists until the
client software is reloaded.
I've got a client with 2 pvfs2 file systems mounted:
/mnt/pvfs2
/mnt/pvfs2-tmp
Both PVFS2 file system configurations contained the following when
mounted:
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
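For a rough sense of how long a client could wait before giving up with these settings, here is a back-of-the-envelope sketch in Python. It rests on assumptions that are not confirmed PVFS2 behavior: that each attempt may take up to the client job timeout, that ClientRetryLimit counts retries after the first attempt, and that the retry delay is applied between attempts. The function name is hypothetical.

```python
def worst_case_wait_secs(timeout_secs, retry_limit, retry_delay_ms):
    """Upper bound on total wait, under the assumptions stated above."""
    attempts = retry_limit + 1   # assumed: first try plus retry_limit retries
    delays = retry_limit         # assumed: one delay before each retry
    return attempts * timeout_secs + delays * retry_delay_ms / 1000.0


# Values taken from the configuration above:
# ClientJobBMITimeoutSecs 300, ClientRetryLimit 5, ClientRetryDelayMilliSecs 2000
print(worst_case_wait_secs(300, 5, 2000))
```

If the real retry semantics differ, only the two "assumed" lines need to change.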
I've dynamically changed the client's timeout settings after the
mounts:
[EMAIL PROTECTED] root]# /sbin/sysctl -w pvfs2.op-timeout-secs=5
A pvfs2 server node lost power on the /mnt/pvfs2 file system. After
issuing a "df -h /mnt/pvfs2", the client received a "connection
timed out" error.
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2
Filesystem Size Used Avail Use% Mounted on
df: `/mnt/pvfs2': Connection timed out
An immediate subsequent "df -h /mnt/pvfs2-tmp" also returned
"connection timed out".
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
Unmounting the /mnt/pvfs2 share works fine.
[EMAIL PROTECTED] root]# umount /mnt/pvfs2
Another subsequent "df -h /mnt/pvfs2-tmp" still returns "connection
timed out".
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
df: `/mnt/pvfs2-tmp': Connection timed out
After unloading the userspace and kernel module, restarting the pvfs2
software, and remounting the /mnt/pvfs2-tmp filesystem, a
"df -h /mnt/pvfs2-tmp" completed successfully.
[EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
Filesystem Size Used Avail Use% Mounted on
hostname:3334/pvfs2-fs
1.9T 381G 1.6T 20% /mnt/pvfs2-tmp
The pvfs2 client log contained:
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:28] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] msgpair failed, will retry:: Connection refused
[E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server tcp://hvcwydev0329:3334 failed: Connection refused
[E 02/22 11:29] *** Out of retries.
[E 02/22 11:29] Statfs failed: Connection refused
[E 02/22 11:36] msgpair failed, will retry:: Operation cancelled (possibly due to timeout)
[E 02/22 11:39] msgpair failed, will retry:: Connection timed out
[E 02/22 11:42] msgpair failed, will retry:: Connection timed out
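To see at a glance how the failure reason shifts from "Connection refused" to timeouts, a log like the one above can be tallied by reason. This is only a sketch in Python: the line format is inferred from the excerpt above rather than from any PVFS2 documentation, and the function name is hypothetical. Lines that do not start with a "[E MM/DD HH:MM]" stamp are treated as wrapped continuations of the previous entry.

```python
import re
from collections import Counter

# Format inferred from the excerpt: "[E 02/22 11:28] message text"
LINE_RE = re.compile(r"^\[[A-Z] \d{2}/\d{2} \d{2}:\d{2}\] (?P<msg>.*)$")


def summarize(log_text):
    """Count log entries by the text after the last ': ' in each message."""
    entries = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            entries.append(match.group("msg"))
        elif entries:
            # No timestamp: assume a mail-wrapped continuation line.
            entries[-1] += " " + line
    counts = Counter()
    for msg in entries:
        reason = msg.rsplit(": ", 1)[-1] if ": " in msg else msg
        counts[reason] += 1
    return counts


if __name__ == "__main__":
    import sys
    for reason, n in summarize(sys.stdin.read()).most_common():
        print(n, reason)
```

Saved to a file, the script reads a client log on standard input and prints one line per distinct reason.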
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers