Hi all, and thanks a lot for your hints.
However, the fact that the errors occurred at more or less the same times made me suspect a problem related to some cron activity.
Fedora comes with some cron jobs enabled by default, in particular in cron.daily:
00-logwatch
0anacron
logrotate
makewhatis.cron
prelink
rpm
slocate.cron
tetex.cron
tmpwatch
yum.cron
So now I've removed all these jobs, and this morning the server is alive and the related errors have disappeared. The Fortran errors, I suppose, come from some wrong settings in root's .tcshrc. If you can confirm that the cron jobs could have been the cause of my problems, I'll be happy to hear it.
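Rather than deleting the cron.daily scripts outright, moving them aside into a "disabled" directory keeps them restorable, so the offending job (slocate.cron and tmpwatch are likely suspects, since they scan or sweep whole filesystems at night) can be found by re-enabling one at a time. A minimal sketch; the /tmp/demo-cron.daily path is a stand-in for the real /etc/cron.daily so it can be tried without touching the system:

```shell
#!/bin/sh
# Sketch only: a demo directory under /tmp stands in for /etc/cron.daily.
CRON_DAILY="${CRON_DAILY:-/tmp/demo-cron.daily}"
DISABLED="${CRON_DAILY}.disabled"

# Set up a fake cron.daily for the demo (skip this step on a real system).
mkdir -p "$CRON_DAILY"
touch "$CRON_DAILY/slocate.cron" "$CRON_DAILY/prelink" "$CRON_DAILY/tmpwatch"

# Move the scripts aside instead of deleting them, so any single job
# can later be moved back to test whether it causes the nightly hangs.
mkdir -p "$DISABLED"
for job in "$CRON_DAILY"/*; do
    mv "$job" "$DISABLED/"
done

# The jobs are now parked in the .disabled directory.
ls "$DISABLED"
```

To test a single job again, move it back (e.g. `mv "$DISABLED/tmpwatch" "$CRON_DAILY/"`) and wait for the next cron.daily run.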
thanks
Andrea

----- Original Message ----- From: "Murali Vilayannur" <[EMAIL PROTECTED]>
To: "Andrea Carotti" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Monday, May 22, 2006 6:40 PM
Subject: Re: [Pvfs2-users] pvfs2 stability



Hi Andrea,
I have no clue what these messages are.. Perhaps they are related to some
Fortran program error messages or something..
Can you try CVS head and see if these errors persist?
Thanks,
Murali

On Mon, 22 May 2006, Andrea Carotti wrote:

Dear Mr Murali,
I'm using version 1.4.0.
Everything seems to work: pvfs2-fs-dump, pvfs2-ping, pvfs2-ls... and the
typical commands like mkdir and cp also work fine on all the nodes and on
the server.
One strange thing also happens at login:
[EMAIL PROTECTED]:~]su
Password:
open: No such file or directory
apparent state: unit 27 named -e
lately writing direct unformatted external IO
Segnale di annullamento   (Italian locale: "abort signal")
open: No such file or directory
apparent state: unit 27 named -f
lately writing direct unformatted external IO
Segnale di annullamento
open: No such file or directory
apparent state: unit 27 named -f
lately writing direct unformatted external IO
Segnale di annullamento
open: No such file or directory
apparent state: unit 27 named -f
lately writing direct unformatted external IO
Segnale di annullamento
open: No such file or directory
apparent state: unit 27 named -f
lately writing direct unformatted external IO
Segnale di annullamento
open: No such file or directory
apparent state: unit 27 named -e
lately writing direct unformatted external IO
Segnale di annullamento

thanks a lot
Andrea
----- Original Message -----
From: "Murali Vilayannur" <[EMAIL PROTECTED]>
To: "Andrea Carotti" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Monday, May 22, 2006 6:07 PM
Subject: Re: [Pvfs2-users] pvfs2 stability


> Hi Andrea,
> Hmm..Nothing looks out of the ordinary from the config files..
> Since you mentioned that the VFS interface does not work, could you
> confirm if the pvfs system interface based tools work or not?
> (i.e. pvfs2-fs-dump, pvfs2-ping, pvfs2-ls etc under src/apps/admin)
> It would be good to narrow down which component(s) is/are causing all
> these failures... Any other information from the logs (or running all the
> components with extra verbose logging) could also help narrow down what
> the issue might be.. BTW, are you using pvfs2 1.4.0 or CVS head?
> Thanks,
> Murali
>
>> cat /home/Application/pvfs/conf/pvfs2-fs.conf
>> <Defaults>
>>         UnexpectedRequests 50
>>         LogFile /tmp/pvfs2-server.log
>>         EventLogging none
>>         LogStamp usec
>>         BMIModules bmi_tcp
>>         FlowModules flowproto_multiqueue
>>         PerfUpdateInterval 1000
>>         ServerJobBMITimeoutSecs 30
>>         ServerJobFlowTimeoutSecs 30
>>         ClientJobBMITimeoutSecs 300
>>         ClientJobFlowTimeoutSecs 300
>>         ClientRetryLimit 5
>>         ClientRetryDelayMilliSecs 2000
>> </Defaults>
>>
>> <Aliases>
>>         Alias dom1 tcp://dom1:3334
>>         Alias dom2 tcp://dom2:3334
>>         Alias dom3 tcp://dom3:3334
>>         Alias dom4 tcp://dom4:3334
>>         Alias om1 tcp://om1:3334
>>         Alias om2 tcp://om2:3334
>>         Alias om3 tcp://om3:3334
>>         Alias om4 tcp://om4:3334
>>         Alias om5 tcp://om5:3334
>> </Aliases>
>>
>> <Filesystem>
>>         Name pvfs2-fs
>>         ID 1869706856
>>         RootHandle 1048576
>>         <MetaHandleRanges>
>>                 Range om1 4-429496732
>>         </MetaHandleRanges>
>>         <DataHandleRanges>
>>                 Range dom1 429496733-858993461
>>                 Range dom2 858993462-1288490190
>>                 Range dom3 1288490191-1717986919
>>                 Range dom4 1717986920-2147483648
>>                 Range om1 2147483649-2576980377
>>                 Range om2 2576980378-3006477106
>>                 Range om3 3006477107-3435973835
>>                 Range om4 3435973836-3865470564
>>                 Range om5 3865470565-4294967293
>>         </DataHandleRanges>
>>         <StorageHints>
>>                 TroveSyncMeta yes
>>                 TroveSyncData no
>>                 AttrCacheKeywords datafile_handles,metafile_dist
>>                 AttrCacheKeywords dir_ent, symlink_target
>>                 AttrCacheSize 4093
>>                 AttrCacheMaxNumElems 32768
>>         </StorageHints>
>> </Filesystem>
>>
>> om1 is the server/client hostname:
>> cat /home/Application/pvfs/conf/pvfs2-server.conf-om1
>> StorageSpace /pvfs2-storage-space
>> HostID "tcp://om1:3334"
>>
>> om2 is a client hostname:
>> cat /home/Application/pvfs/conf/pvfs2-server.conf-om2
>> StorageSpace /pvfs2-storage-space
>> HostID "tcp://om2:3334"
>>
>>
>> Let me know if you need more information.
>> Thanks
>> Andrea
>>
>> ----- Original Message -----
>> From: "Murali Vilayannur" <[EMAIL PROTECTED]>
>> To: "Andrea Carotti" <[EMAIL PROTECTED]>
>> Cc: <[email protected]>
>> Sent: Monday, May 22, 2006 5:45 PM
>> Subject: Re: [Pvfs2-users] pvfs2 stability
>>
>>
>> > Hi Andrea,
>> > It does look a bit strange to see these messages and yet have the FS
>> > working..
>> > Could you post your fs.conf and server.conf files?
>> > thanks,
>> > Murali
>> >
>> > On Mon, 22 May 2006, Andrea Carotti wrote:
>> >
>> >> Hi all,
>> >> I'm new to this list and to the pvfs2 program. I'm using it on our
>> >> home-made cluster (9 nodes) running an openMosix 2.4.22-3 kernel
>> >> and Fedora Core 2.
>> >> I've installed it with one node running as meta server, PVFS2 server,
>> >> and data server, and all the others as data servers.
>> >> I've also compiled and installed the module.
>> >> This is my actual configuration:
>> >> 1) on all nodes I have an entry in /etc/fstab like this:
>> >> tcp://om1:3334/pvfs2-fs /mnt/pvfs2 pvfs2 default,noauto 0 0
>> >> 2) I've added these lines to rc.local:
>> >> insmod /lib/modules/2.4.22-oM3src/kernel/fs/pvfs2/pvfs2.o
>> >> /home/Application/pvfs/sbin/pvfs2-client -p
>> >> /home/Application/pvfs/sbin/pvfs2-client-core
>> >> mount -t pvfs2 tcp://om1:3334/pvfs2-fs /mnt/pvfs2
>> >> 3) I've enabled the default startup service on all the nodes:
>> >> /etc/init.d/pvfs2-server
>> >>
>> >> I'm encountering some problems with its usage:
>> >> if I start the server (/etc/init.d/pvfs2-server start) everything
>> >> seems ok, but on the server /tmp/pvfs2-client.log shows these errors:
>> >>
>> >> [E 16:57:50.651742] msgpair failed, will retry:: Broken pipe
>> >> [E 16:57:52.691656] msgpair failed, will retry:: Connection refused
>> >> [E 16:57:54.731666] msgpair failed, will retry:: Connection refused
>> >> [E 16:57:56.771657] msgpair failed, will retry:: Connection refused
>> >> [E 16:57:58.811658] msgpair failed, will retry:: Connection refused
>> >> [E 16:58:00.851658] msgpair failed, will retry:: Connection refused
>> >> [E 16:58:00.851731] *** msgpairarray_completion_fn: msgpair to server
>> >> tcp://om1:3334 failed: Connection refused
>> >> [E 16:58:00.851750] *** Out of retries.
>> >> [E 16:58:00.851769] getattr_object_getattr_failure : Connection
>> >> refused
>> >>
>> >> However, it seems to work: I can write to /mnt/pvfs2, make dirs,
>> >> and so on with the normal commands (cp, mkdir, etc.).
>> >>
>> >> But during the day something goes wrong; in fact, the next day I can
>> >> never see /mnt/pvfs2 without restarting the server, and looking at
>> >> /var/log/messages I see:
>> >> May 18 23:21:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 18 23:27:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 19 01:06:07 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 19 04:08:26 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 19 04:15:40 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 19 23:20:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 19 23:26:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 20 01:06:04 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 20 04:08:25 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 20 04:15:34 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 20 23:21:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 20 23:27:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 21 01:06:05 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 21 04:08:31 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 21 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 21 23:24:05 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 22 01:06:03 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >> May 22 04:08:33 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> >> May 22 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> >>
>> >> The same errors at the same times.
>> >> Sorry for the long message... hoping someone can help.
>> >> Thanks
>> >> Andrea
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Pvfs2-users mailing list
>> >> [email protected]
>> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>> >>
>> >>
>>



