Jeffrey Altman <[EMAIL PROTECTED]> writes:

> The Volserver is trying to establish a connection with the Fileserver
> and it can't. As a result after five retries it exits with an assertion
> failure.
>
> What is the state of the File Server? (FileLog)

Thu Jul 24 01:00:26 2008 File server starting
Thu Jul 24 01:00:26 2008 afs_krb_get_lrealm failed, using itp.tugraz.at.
Thu Jul 24 02:25:27 2008 Set Debug On level = 1
Thu Jul 24 02:25:53 2008 [0] Set Debug On level = 5

After a few restarts the verbose mode is of course no longer active, and
the log file has been moved away. I have now reactivated more verbose
logging, but apart from the messages about increased logging shown above
I cannot remember anything unusual.

Dump again; unlike my latest attempts, there is now some reaction on
the command line:

[EMAIL PROTECTED]:~# "vos dump -id user.zuzi -file /tmp/backup -localauth -verbose"
Full Dump ...
Starting transaction on volume 536872626...

And that's all after 10 minutes - in my latest full archival backup this
dump file is about 300 MBytes.

The log files have not changed much; FileLog in particular stays the same:
<http://itp.tugraz.at/~ahi/openafs/>

> What version of OpenAFS are you using?

1.4.7.dfsg1-1 - a backport of Sam Hartman's Debian packages (unstable)
to Debian stable; the kernel is a custom Linux 2.6.25.11.

There are 3 DB servers and 3 file servers, all running this version. This
is the only machine acting as both a file and a DB server. The DB server
binds to a virtual ethernet socket provided by fake (ARP poisoning). This
has worked with very few problems for about a year now, but I had to
reboot because we had scheduled power outages last Friday and yesterday
(Wednesday).

Another file server (on a big UPS, so not rebooted, but also running 1.4.7)
is much more verbose after startup:

Sun Jul 20 04:00:11 2008 File server starting
Sun Jul 20 04:00:11 2008 afs_krb_get_lrealm failed, using itp.tugraz.at.
Sun Jul 20 04:00:49 2008 VL_RegisterAddrs rpc failed; will retry periodically (code=5377, err=0)
Sun Jul 20 04:00:49 2008 Set thread id 11 for FSYNC_sync
Sun Jul 20 04:00:49 2008 FSYNC_sync: bind failed with (98), removed bogus /var/lib/openafs/local/fssync.sock
Sun Jul 20 04:00:49 2008 Partition /vicepa: attaching volumes
Sun Jul 20 04:01:15 2008 Partition /vicepa: attached 362 volumes; 0 volumes not attached
Sun Jul 20 04:01:15 2008 Getting FileServer name...
Sun Jul 20 04:01:15 2008 FileServer host name is 'faepsv07'
Sun Jul 20 04:01:15 2008 Getting FileServer address...
Sun Jul 20 04:01:15 2008 FileServer faepsv07 has address 129.27.161.111 (0x6fa11b81 or 0x811ba16f in host byte order)
Sun Jul 20 04:01:15 2008 File Server started Sun Jul 20 04:01:15 2008
Sun Jul 20 04:01:15 2008 Set thread id 15 for 'FiveMinuteCheckLWP'
Sun Jul 20 04:01:15 2008 Set thread id 16 for 'HostCheckLWP'
Sun Jul 20 04:01:15 2008 Set thread id 17 for 'FsyncCheckLWP'
Sun Jul 20 20:04:29 2008 CB: ProbeUuid for 78.104.3.214:51209 failed -01
Sun Jul 20 20:08:56 2008 CB: ProbeUuid for 78.104.3.214:51227 failed -01
.....

I now really suspect those problems stem from the file server and db
server listening on different IP addresses on the same machine.

Thanks for caring!
Andreas

-- 
Andreas Hirczy <[EMAIL PROTECTED]>
http://itp.tugraz.at/~ahi/
Graz University of Technology
Institute of Theoretical and Computational Physics
Petersgasse 16, A-8010 Graz
phone:  +43/316/873- 8190
fax:    +43/316/873-10 8190
mobile: +43/664/859 23 57
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
