Hi Andrea,

Hmm... nothing looks out of the ordinary in the config files. Since you mentioned that the VFS interface does not work, could you confirm whether the tools based on the PVFS2 system interface work? (i.e. pvfs2-fs-dump, pvfs2-ping, pvfs2-ls, etc. under src/apps/admin) It would be good to narrow down which component(s) are causing all these failures. Any other information from the logs (or from running all the components with extra verbose logging) could also help narrow down what the issue might be.

BTW, are you using pvfs2 1.4.0 or CVS head?

Thanks,
Murali
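[For reference, the system-interface checks suggested above might look roughly like this, run from a client node. The install prefix and mount point are taken from elsewhere in this thread; the tool names are from src/apps/admin, but exact flags may vary by PVFS2 version.]

```shell
# Probe the PVFS2 servers through the system interface, bypassing the
# kernel module / VFS path entirely.  Prefix and mount point below are
# assumptions based on this thread; adjust for your installation.
PVFS2_PREFIX=/home/Application/pvfs
MNT=/mnt/pvfs2

for tool in pvfs2-ping pvfs2-ls pvfs2-fs-dump; do
    bin="$PVFS2_PREFIX/bin/$tool"
    if [ -x "$bin" ]; then
        # pvfs2-ping and pvfs2-fs-dump take -m <mountpoint>;
        # pvfs2-ls takes the path directly.
        case "$tool" in
            pvfs2-ls) "$bin" "$MNT" ;;
            *)        "$bin" -m "$MNT" ;;
        esac
    else
        echo "$tool not installed under $PVFS2_PREFIX/bin"
    fi
done
```

If these succeed while the mounted /mnt/pvfs2 hangs, the problem is more likely in the kernel module / pvfs2-client path than in the servers themselves.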
> cat /home/Application/pvfs/conf/pvfs2-fs.conf
> <Defaults>
>     UnexpectedRequests 50
>     LogFile /tmp/pvfs2-server.log
>     EventLogging none
>     LogStamp usec
>     BMIModules bmi_tcp
>     FlowModules flowproto_multiqueue
>     PerfUpdateInterval 1000
>     ServerJobBMITimeoutSecs 30
>     ServerJobFlowTimeoutSecs 30
>     ClientJobBMITimeoutSecs 300
>     ClientJobFlowTimeoutSecs 300
>     ClientRetryLimit 5
>     ClientRetryDelayMilliSecs 2000
> </Defaults>
>
> <Aliases>
>     Alias dom1 tcp://dom1:3334
>     Alias dom2 tcp://dom2:3334
>     Alias dom3 tcp://dom3:3334
>     Alias dom4 tcp://dom4:3334
>     Alias om1 tcp://om1:3334
>     Alias om2 tcp://om2:3334
>     Alias om3 tcp://om3:3334
>     Alias om4 tcp://om4:3334
>     Alias om5 tcp://om5:3334
> </Aliases>
>
> <Filesystem>
>     Name pvfs2-fs
>     ID 1869706856
>     RootHandle 1048576
>     <MetaHandleRanges>
>         Range om1 4-429496732
>     </MetaHandleRanges>
>     <DataHandleRanges>
>         Range dom1 429496733-858993461
>         Range dom2 858993462-1288490190
>         Range dom3 1288490191-1717986919
>         Range dom4 1717986920-2147483648
>         Range om1 2147483649-2576980377
>         Range om2 2576980378-3006477106
>         Range om3 3006477107-3435973835
>         Range om4 3435973836-3865470564
>         Range om5 3865470565-4294967293
>     </DataHandleRanges>
>     <StorageHints>
>         TroveSyncMeta yes
>         TroveSyncData no
>         AttrCacheKeywords datafile_handles,metafile_dist
>         AttrCacheKeywords dir_ent, symlink_target
>         AttrCacheSize 4093
>         AttrCacheMaxNumElems 32768
>     </StorageHints>
> </Filesystem>
>
> om1 is the server/client hostname:
> cat /home/Application/pvfs/conf/pvfs2-server.conf-om1
> StorageSpace /pvfs2-storage-space
> HostID "tcp://om1:3334"
>
> om2 is a client hostname:
> cat /home/Application/pvfs/conf/pvfs2-server.conf-om2
> StorageSpace /pvfs2-storage-space
> HostID "tcp://om2:3334"
>
> Let me know if you need more information.
> Thanks
> Andrea
>
> ----- Original Message -----
> From: "Murali Vilayannur" <[EMAIL PROTECTED]>
> To: "Andrea Carotti" <[EMAIL PROTECTED]>
> Cc: <[email protected]>
> Sent: Monday, May 22, 2006 5:45 PM
> Subject: Re: [Pvfs2-users] pvfs2 stability
>
> > Hi Andrea,
> > It does look a bit strange to see these messages and yet have the FS
> > working.
> > Could you post your fs.conf and server.conf files?
> > thanks,
> > Murali
> >
> > On Mon, 22 May 2006, Andrea Carotti wrote:
> >
> >> Hi all,
> >> I'm new to this list and to the pvfs2 program. I'm using it on our
> >> home-made cluster (9 nodes) running an openMosix kernel 2.4.22-3 and
> >> Fedora Core 2.
> >> I've installed it with one node running as the meta server, PVFS2
> >> server, and data server, and all the other nodes as data servers.
> >> I've also compiled and installed the kernel module.
> >> This is my current configuration:
> >> 1) On all nodes I have an entry in /etc/fstab like this:
> >>    tcp://om1:3334/pvfs2-fs /mnt/pvfs2 pvfs2 default,noauto 0 0
> >> 2) I've added these lines to rc.local:
> >>    insmod /lib/modules/2.4.22-oM3src/kernel/fs/pvfs2/pvfs2.o
> >>    /home/Application/pvfs/sbin/pvfs2-client -p /home/Application/pvfs/sbin/pvfs2-client-core
> >>    mount -t pvfs2 tcp://om1:3334/pvfs2-fs /mnt/pvfs2
> >> 3) I've enabled the default startup service /etc/init.d/pvfs2-server
> >>    on all the nodes.
> >>
> >> I'm encountering some problems with its usage:
> >> if I start the server (/etc/init.d/pvfs2-server start) everything
> >> seems OK, but on the server /tmp/pvfs2-client.log shows these errors:
> >>
> >> [E 16:57:50.651742] msgpair failed, will retry:: Broken pipe
> >> [E 16:57:52.691656] msgpair failed, will retry:: Connection refused
> >> [E 16:57:54.731666] msgpair failed, will retry:: Connection refused
> >> [E 16:57:56.771657] msgpair failed, will retry:: Connection refused
> >> [E 16:57:58.811658] msgpair failed, will retry:: Connection refused
> >> [E 16:58:00.851658] msgpair failed, will retry:: Connection refused
> >> [E 16:58:00.851731] *** msgpairarray_completion_fn: msgpair to server tcp://om1:3334 failed: Connection refused
> >> [E 16:58:00.851750] *** Out of retries.
> >> [E 16:58:00.851769] getattr_object_getattr_failure : Connection refused
> >>
> >> However, it seems to work: I can write to /mnt/pvfs2, make
> >> directories, and so on with the normal commands (cp, mkdir, etc.).
> >>
> >> But during the day something goes wrong; in fact, the next day I can
> >> never see /mnt/pvfs2 without restarting the server, and looking at
> >> /var/log/messages I see:
> >>
> >> May 18 23:21:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 18 23:27:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 19 01:06:07 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 19 04:08:26 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 19 04:15:40 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 19 23:20:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 19 23:26:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 20 01:06:04 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 20 04:08:25 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 20 04:15:34 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 20 23:21:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 20 23:27:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 21 01:06:05 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 21 04:08:31 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 21 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 21 23:24:05 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 22 01:06:03 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >> May 22 04:08:33 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
> >> May 22 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
> >>
> >> The same errors occur at the same times each day.
> >> Sorry for the long message... hoping someone can help.
> >> Thanks
> >> Andrea
> >>
> >> _______________________________________________
> >> Pvfs2-users mailing list
> >> [email protected]
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
