> -----Original Message-----
> From: Sam Lang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 22, 2006 12:47 PM
> To: [EMAIL PROTECTED]
> Cc: [email protected]
> Subject: Re: [Pvfs2-developers] Problem with multiple pvfs2
> file systems mounted on a single client
>
>
> Hi David,
>
> Just to clarify your setup a bit, you are running two
> separate sets of metadata and io servers for the two
> different mountpoints, so that for /mnt/pvfs2 you have one
> set of nodes running the servers, and for /mnt/pvfs2-tmp you
> have a completely separate set of server nodes?

Yes. The two file systems have unique sets of nodes.

> And when one
> of the servers dies from one filesystem, the
> other filesystem is unresponsive as well?

Yes.

> What does pvfs2-ping
> tell you about the status of the servers for /mnt/pvfs2-tmp?

The pvfs2-ping reports the system is up and running:

[EMAIL PROTECTED] root]# pvfs2-ping -m /mnt/pvfs2-tmp

(1) Parsing tab file...

(2) Initializing system interface...

(3) Initializing each file system found in tab file: /etc/mtab...

   /mnt/pvfs2-tmp: Ok
   /mnt/pvfs2: Ok

(4) Searching for /mnt/pvfs2-tmp in pvfstab...

   PVFS2 servers: tcp://hvcwydev0380:3334
   Storage name: pvfs2-fs
   Local mount point: /mnt/pvfs2-tmp

   PVFS2 servers: tcp://hvcwydev0328:3334
   Storage name: pvfs2-fs
   Local mount point: /mnt/pvfs2

   meta servers:
   tcp://hvcwydev0380:3334
   tcp://hvcwydev0381:3334
   tcp://hvcwydev0382:3334
   tcp://hvcwydev0383:3334
   tcp://hvcwydev0384:3334
   tcp://hvcwydev0385:3334
   tcp://hvcwydev0386:3334
   tcp://hvcwydev0387:3334
   tcp://hvcwydev0388:3334
   tcp://hvcwydev0389:3334
   tcp://hvcwydev0390:3334
   tcp://hvcwydev0391:3334
   tcp://hvcwydev0392:3334
   tcp://hvcwydev0393:3334
   tcp://hvcwydev0394:3334

   data servers:
   tcp://hvcwydev0380:3334
   tcp://hvcwydev0381:3334
   tcp://hvcwydev0382:3334
   tcp://hvcwydev0383:3334
   tcp://hvcwydev0384:3334
   tcp://hvcwydev0385:3334
   tcp://hvcwydev0386:3334
   tcp://hvcwydev0387:3334
   tcp://hvcwydev0388:3334
   tcp://hvcwydev0389:3334
   tcp://hvcwydev0390:3334
   tcp://hvcwydev0391:3334
   tcp://hvcwydev0392:3334
   tcp://hvcwydev0393:3334
   tcp://hvcwydev0394:3334

(5) Verifying that all servers are responding...

   meta servers:
   tcp://hvcwydev0380:3334 Ok
   tcp://hvcwydev0381:3334 Ok
   tcp://hvcwydev0382:3334 Ok
   tcp://hvcwydev0383:3334 Ok
   tcp://hvcwydev0384:3334 Ok
   tcp://hvcwydev0385:3334 Ok
   tcp://hvcwydev0386:3334 Ok
   tcp://hvcwydev0387:3334 Ok
   tcp://hvcwydev0388:3334 Ok
   tcp://hvcwydev0389:3334 Ok
   tcp://hvcwydev0390:3334 Ok
   tcp://hvcwydev0391:3334 Ok
   tcp://hvcwydev0392:3334 Ok
   tcp://hvcwydev0393:3334 Ok
   tcp://hvcwydev0394:3334 Ok

   data servers:
   tcp://hvcwydev0380:3334 Ok
   tcp://hvcwydev0381:3334 Ok
   tcp://hvcwydev0382:3334 Ok
   tcp://hvcwydev0383:3334 Ok
   tcp://hvcwydev0384:3334 Ok
   tcp://hvcwydev0385:3334 Ok
   tcp://hvcwydev0386:3334 Ok
   tcp://hvcwydev0387:3334 Ok
   tcp://hvcwydev0388:3334 Ok
   tcp://hvcwydev0389:3334 Ok
   tcp://hvcwydev0390:3334 Ok
   tcp://hvcwydev0391:3334 Ok
   tcp://hvcwydev0392:3334 Ok
   tcp://hvcwydev0393:3334 Ok
   tcp://hvcwydev0394:3334 Ok

(6) Verifying that fsid 115831708 is acceptable to all servers...

   Ok; all servers understand fs_id 115831708

(7) Verifying that root handle is owned by one server...

   Root handle: 1048576
   Ok; root handle is owned by exactly one server.

=============================================================

The PVFS filesystem at /mnt/pvfs2-tmp appears to be correctly configured.

>
> -sam
>
> On Feb 22, 2006, at 12:16 PM, David Metheny wrote:
>
> > It appears the error described below will span across other mounted
> > file systems on a client when encountered, until the client software
> > is reloaded.
> >
> >
> > I've got a client with 2 pvfs2 file systems mounted:
> >
> > /mnt/pvfs2
> > /mnt/pvfs2-tmp
> >
> > Both PVFS2 file system configurations contained the following when
> > mounted:
> > ServerJobBMITimeoutSecs 30
> > ServerJobFlowTimeoutSecs 30
> > ClientJobBMITimeoutSecs 300
> > ClientJobFlowTimeoutSecs 300
> > ClientRetryLimit 5
> > ClientRetryDelayMilliSecs 2000
> >
> > I've dynamically changed the client's timeout settings after the
> > mounts:
> > [EMAIL PROTECTED] root]# /sbin/sysctl -w pvfs2.op-timeout-secs=5
> >
> > A pvfs2 server node lost power on the /mnt/pvfs2 file system. After
> > issuing a "df -h /mnt/pvfs2", the client received a "connection
> > timed-out" error.
> >
> > [EMAIL PROTECTED] root]# df -h /mnt/pvfs2
> > Filesystem Size Used Avail Use% Mounted on
> > df: `/mnt/pvfs2': Connection timed out
> >
> > An immediate subsequent "df -h /mnt/pvfs2-tmp" also returned
> > "connection timed out"
> > [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
> > df: `/mnt/pvfs2-tmp': Connection timed out
> >
> > An unmount of the /mnt/pvfs2 share works fine.
> > [EMAIL PROTECTED] root]# umount /mnt/pvfs2
> >
> > Another subsequent "df -h /mnt/pvfs2-tmp" still returns "connection
> > timed out"
> > [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
> > df: `/mnt/pvfs2-tmp': Connection timed out
> >
> > After unloading the userspace and kernel module, restarting pvfs2
> > software, and remounting the /mnt/pvfs2-tmp filesystem, a "df -h
> > /mnt/pvfs2-tmp" successfully completed
> > [EMAIL PROTECTED] root]# df -h /mnt/pvfs2-tmp
> > Filesystem Size Used Avail Use% Mounted on
> > hostname:3334/pvfs2-fs
> > 1.9T 381G 1.6T 20% /mnt/pvfs2-tmp
> >
> >
> > The pvfs2 client log contained:
> > [E 02/22 11:28] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:28] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:28] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:29] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:29] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:29] msgpair failed, will retry:: Connection refused
> > [E 02/22 11:29] *** msgpairarray_completion_fn: msgpair to server tcp://hvcwydev0329:3334 failed: Connection refused
> > [E 02/22 11:29] *** Out of retries.
> > [E 02/22 11:29] Statfs failed: Connection refused
> > [E 02/22 11:36] msgpair failed, will retry:: Operation cancelled (possibly due to timeout)
> > [E 02/22 11:39] msgpair failed, will retry:: Connection timed out
> > [E 02/22 11:42] msgpair failed, will retry:: Connection timed out
> >
> > _______________________________________________
> > Pvfs2-developers mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
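
For anyone trying to reproduce the report above, the manual "df -h" checks on the two mount points could be scripted roughly as follows. This is only a sketch, not part of PVFS2: the function names probe_mount and probe_all and the 10 second limit are made up for illustration, and the script only wraps the same "df -h" command used in the thread. It also assumes the probe process can be killed when the limit expires; a process stuck in uninterruptible I/O may not honor that.

```python
import subprocess


def probe_mount(path, timeout_secs=10.0, cmd=("df", "-h")):
    """Run `cmd path` once and classify the result.

    Returns "ok" if the command exits with status 0, "error" if it exits
    with a non-zero status (for example "Connection timed out" from df),
    and "timeout" if it does not finish within timeout_secs.
    The names and the default limit are illustrative assumptions.
    """
    try:
        result = subprocess.run(
            [*cmd, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


def probe_all(paths, timeout_secs=10.0, cmd=("df", "-h")):
    """Probe each mount point in turn and return {path: status}."""
    return {path: probe_mount(path, timeout_secs, cmd) for path in paths}
```

Calling probe_all(["/mnt/pvfs2", "/mnt/pvfs2-tmp"]) before and after a server failure would show whether the second mount point changes status along with the first, which is the behavior described in the original report.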
