Hi,


I am running an OpenAFS 1.2.10 test site with two db servers under RedHat 8.0; the site has been up since November 2002.

Since this morning no administration has been possible: the Windows client reports "error, no quorum elected (0x00001500)" whenever I try to create a volume, create a replica, or do a release.

I searched the archives and checked the usual suspects: the time sync seems to be OK, all volumes are mounted, and the servers are in the hosts file and in DNS (no changes made). The FileLog says:

Sun Jan 11 16:51:35 2004 File server starting
Sun Jan 11 16:51:35 2004 afs_krb_get_lrealm failed, using ombbln.de.
Sun Jan 11 16:52:34 2004 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=0)
Sun Jan 11 16:52:34 2004 Set thread id 14 for FSYNC_sync
Sun Jan 11 16:52:34 2004 Partition /vicepe: attached 1 volumes; 0 volumes not attached
Sun Jan 11 16:52:52 2004 Partition /vicepa: attached 401 volumes; 0 volumes not attached
Sun Jan 11 16:53:00 2004 Partition /vicepb: attached 133 volumes; 0 volumes not attached
Sun Jan 11 16:53:04 2004 Partition /vicepc: attached 49 volumes; 0 volumes not attached
Sun Jan 11 16:53:07 2004 Partition /vicepd: attached 33 volumes; 0 volumes not attached
Sun Jan 11 16:53:07 2004 Set thread id 15 for 'FiveMinuteCheckLWP'
Sun Jan 11 16:53:07 2004 Set thread id 16 for 'HostCheckLWP'
Sun Jan 11 16:53:07 2004 Getting FileServer name...
Sun Jan 11 16:53:07 2004 FileServer host name is 'afs1'
Sun Jan 11 16:53:07 2004 Getting FileServer address...
Sun Jan 11 16:53:07 2004 FileServer afs1 has address 192.168.9.7 (0x709a8c0 or 0xc0a80907 in host byte order)
Sun Jan 11 16:53:07 2004 File Server started Sun Jan 11 16:53:07 2004
Sun Jan 11 16:58:07 2004 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=0)


The last message keeps repeating (code 5376 is 0x1500, the same "no quorum" code the Windows client reports).

I made a udebug on both servers and got:

[EMAIL PROTECTED] root]# udebug afs1 7003 -long
Host's addresses are: 192.168.9.7
Host's 192.168.9.7 time is Sun Jan 11 19:01:35 2004
Local time is Sun Jan 11 19:01:36 2004 (time differential 1 secs)
Last yes vote for 192.168.9.7 was 0 secs ago (not sync site);
Last vote started 0 secs ago (at Sun Jan 11 19:01:36 2004)
Local db version is 1073185535.95
I am not sync site
Lowest host 192.168.9.7 was set 0 secs ago
Sync host 0.0.0.0 was set 1073844095 secs ago
Sync site's db version is 1073185535.95
0 locked pages, 0 of them for write

Server (192.168.9.8): (db 0.0)
last vote rcvd 1 secs ago (at Sun Jan 11 19:01:35 2004),
last beacon sent 0 secs ago (at Sun Jan 11 19:01:36 2004), last vote was yes
dbcurrent=0, up=1 beaconSince=1



[EMAIL PROTECTED] root]# udebug afs2 7003 -long
Host's addresses are: 192.168.9.8
Host's 192.168.9.8 time is Sun Jan 11 19:02:42 2004
Local time is Sun Jan 11 19:02:45 2004 (time differential 3 secs)
Last yes vote for 192.168.9.7 was 8 secs ago (not sync site);
Last vote started 7 secs ago (at Sun Jan 11 19:02:38 2004)
Local db version is 1073185535.95
I am not sync site
Lowest host 192.168.9.7 was set 8 secs ago
Sync host 0.0.0.0 was set 1073844162 secs ago
Sync site's db version is 1073185535.95
0 locked pages, 0 of them for write

Server (192.168.9.7): (db 0.0)
last vote rcvd 6263 secs ago (at Sun Jan 11 17:18:22 2004),
last beacon sent 6263 secs ago (at Sun Jan 11 17:18:22 2004), last vote was no
dbcurrent=0, up=1 beaconSince=1


The problem seems to be that neither of the two servers is the sync site; instead, address 0.0.0.0 (which really is the lowest possible IP address) is being treated as the sync site.
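
One more observation on the "Sync host 0.0.0.0 was set 1073844095 secs ago" line from afs1: that number is almost exactly the current Unix time, which would suggest the sync host was never actually set rather than set to a real host long ago. Here is a rough sketch of the arithmetic in Python. It assumes the server clocks are on CET (UTC+1), which the udebug output does not state, and the helper name is just something I made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the servers run on CET (UTC+1); udebug does not print a zone.
CET = timezone(timedelta(hours=1))

def sync_host_set_at(local_time: datetime, secs_ago: int) -> datetime:
    """Return the UTC instant that lies `secs_ago` seconds before `local_time`."""
    return (local_time - timedelta(seconds=secs_ago)).astimezone(timezone.utc)

# Values copied from the udebug output for afs1.
local = datetime(2004, 1, 11, 19, 1, 36, tzinfo=CET)
set_at = sync_host_set_at(local, 1073844095)
print(set_at.isoformat())  # prints 1970-01-01T00:00:01+00:00
```

Under that timezone assumption the "set" time lands within a second of the Unix epoch, so 0.0.0.0 looks like an unset placeholder and not a host that won an election.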

Any help appreciated!

Michael Braun
