Hey folks. Odd problem. I have gone over many things, and I am stumped. I
am tempted to just destroy and rebuild my errant vlserver, but I'd like to
know what's going on and I know I am missing something :-)

I have three vlservers, one in each of my colos. Let's call them A
(10.33.10.43), B (10.36.10.7) and C (10.38.10.7).

A and B sync with each other.
B and C try to sync with each other but fail.
A and C do not sync at all.

They do know about each other, as udebug -long shows. However,
between A and C I see this in the long form:

Server (10.38.10.7): (db 0.0)
    last vote never rcvd
    last beacon never sent
    dbcurrent=0, up=0 beaconSince=0

Same is true for C to A.

The database version is completely off between A and C.
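
To make the peer state easier to compare across servers, here is a minimal Python sketch that pulls the per-peer blocks out of udebug -long output and lists the peers that have never sent a vote and are marked down. The sample text is copied from the udebug output in this post; the function names (parse_peers, silent_peers) are made up for illustration, and the regular expressions assume the output layout shown here rather than any documented format.

```python
import re

# Peer blocks copied from the udebug -long output quoted in this post.
UDEBUG_SAMPLE = """\
Server (10.36.10.7): (db 0.0)
    last vote rcvd 2 secs ago (at Sun Jan 12 23:04:14 2014),
    last beacon sent 2 secs ago (at Sun Jan 12 23:04:14 2014), last vote was no
    dbcurrent=0, up=1 beaconSince=1

Server (10.33.10.43): (db 0.0)
    last vote never rcvd
    last beacon never sent
    dbcurrent=0, up=0 beaconSince=0
"""

# These patterns assume the layout seen above; they are not an official grammar.
PEER_RE = re.compile(r"^Server \((?P<addr>[\d.]+)\): \(db (?P<db>[\d.]+)\)")
FLAGS_RE = re.compile(r"dbcurrent=(\d+), up=(\d+) beaconSince=(\d+)")


def parse_peers(text):
    """Return one dict per 'Server (...)' block found in the text."""
    peers = []
    current = None
    for line in text.splitlines():
        header = PEER_RE.match(line)
        if header:
            current = {"addr": header.group("addr"), "db": header.group("db"),
                       "vote_rcvd": None, "up": None}
            peers.append(current)
            continue
        if current is None:
            continue  # still in the header part of the udebug output
        stripped = line.strip()
        if stripped.startswith("last vote never rcvd"):
            current["vote_rcvd"] = False
        elif stripped.startswith("last vote rcvd"):
            current["vote_rcvd"] = True
        flags = FLAGS_RE.search(stripped)
        if flags:
            current["dbcurrent"], current["up"], current["beacon_since"] = (
                int(value) for value in flags.groups())
    return peers


def silent_peers(peers):
    """Addresses of peers that never voted and are marked down (up=0)."""
    return [p["addr"] for p in peers if p["vote_rcvd"] is False and p["up"] == 0]


if __name__ == "__main__":
    for addr in silent_peers(parse_peers(UDEBUG_SAMPLE)):
        print(f"{addr}: no vote ever received, marked down")
```

Running the same function over the udebug output from each of the three servers would show at a glance which pairs never exchange votes.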

This is the complete udebug from server C, which is the one that is acting
up:

Host's addresses are: 10.38.10.7
Host's 10.38.10.7 time is Sun Jan 12 23:04:13 2014
Local time is Sun Jan 12 23:04:16 2014 (time differential 3 secs)
Last yes vote for 10.38.10.7 was 2 secs ago (not sync site);
Last vote started 2 secs ago (at Sun Jan 12 23:04:14 2014)
Local db version is 1388991001.15777
I am not sync site
Lowest host 10.38.10.7 was set 2 secs ago
Sync host 0.0.0.0 was set 1389567853 secs ago
The last trans I handled was 0.46
Sync site's db version is 1388991001.15777
0 locked pages, 0 of them for write

Server (10.36.10.7): (db 0.0)
    last vote rcvd 2 secs ago (at Sun Jan 12 23:04:14 2014),
    last beacon sent 2 secs ago (at Sun Jan 12 23:04:14 2014), last vote was no
    dbcurrent=0, up=1 beaconSince=1

Server (10.33.10.43): (db 0.0)
    last vote never rcvd
    last beacon never sent
    dbcurrent=0, up=0 beaconSince=0

So it thinks it is not the sync site, but 0.0.0.0 is. In reality, server A
is the sync site.
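
Two of the numbers in that output can be checked with simple arithmetic. The sketch below is only an illustration and rests on two assumptions: that the host time udebug printed is in UTC, and that the first component of the db version is a Unix timestamp (as far as I understand the Ubik version format). Under those assumptions, "Sync host 0.0.0.0 was set 1389567853 secs ago" works out to a set time of exactly the Unix epoch, which would mean the sync host has simply never been set on C.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the udebug output of server C.
# Assumption: the printed host time is UTC.
HOST_TIME = datetime(2014, 1, 12, 23, 4, 13, tzinfo=timezone.utc)
SYNC_HOST_AGE_SECS = 1389567853   # "Sync host 0.0.0.0 was set ... secs ago"
DB_VERSION_EPOCH = 1388991001     # first half of "1388991001.15777"


def set_time(now, age_secs):
    """When something was set, given the current time and its age in seconds."""
    return now - timedelta(seconds=age_secs)


def epoch_to_utc(epoch_secs):
    """Interpret a number of seconds since 1970-01-01 as a UTC datetime."""
    return datetime.fromtimestamp(epoch_secs, tz=timezone.utc)


if __name__ == "__main__":
    print("sync host set at:", set_time(HOST_TIME, SYNC_HOST_AGE_SECS))
    print("db version epoch:", epoch_to_utc(DB_VERSION_EPOCH))
```

With these inputs the first value is 1970-01-01 00:00:00 UTC and the second is 2014-01-06 06:50:01 UTC, so the local database on C was last labelled about a week before this udebug was taken.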

I have run tcpdump to make absolutely certain there are packets running
between the two hosts on all AFS ports, and there are.

KeyFiles are fine, and checked with md5sum.
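
For anyone who wants to repeat the KeyFile comparison without md5sum, a rough Python equivalent is below. It assumes copies of each server's KeyFile have already been fetched to local paths; the helper names and the paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_match(paths):
    """True if every file in the list has the same MD5 digest."""
    return len({file_md5(p) for p in paths}) == 1
```

Usage would look like all_match(["keyfile-A", "keyfile-B", "keyfile-C"]), where the three names stand for local copies of the KeyFile from each server.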

I set the debug level to 25 on the affected vlserver (10.38.10.7, Server C)
and got this:

Sun Jan 12 23:19:24 2014 Using 10.38.10.7 as my primary address
Sun Jan 12 23:19:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:19:40 2014 Starting AFS vlserver 4 (/usr/lib/openafs/vlserver -rxbind -d 25)
Sun Jan 12 23:19:40 2014 no vote from 10.36.10.7
@(#) OpenAFS 1.6.1-1+ubuntu0.2-debian built  2013-07-24
Sun Jan 12 23:19:40 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:19:44 2014 recovery running in state 0
Sun Jan 12 23:19:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:19:55 2014 no vote from 10.36.10.7
Sun Jan 12 23:19:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:19:55 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:20:10 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:10 2014 no vote from 10.36.10.7
Sun Jan 12 23:20:10 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:10 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:20:25 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:25 2014 no vote from 10.36.10.7
Sun Jan 12 23:20:25 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:25 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:20:40 2014 ubik:server 10.33.10.43 still down
Sun Jan 12 23:20:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:40 2014 no vote from 10.36.10.7
Sun Jan 12 23:20:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:40 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:20:40 2014 Ubik: vote 'yes' for 10.38.10.7 (NOT in quorum)
Sun Jan 12 23:20:44 2014 recovery running in state 0
Sun Jan 12 23:20:44 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:48 2014 recovery running in state 0
Sun Jan 12 23:20:48 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:52 2014 recovery running in state 0
Sun Jan 12 23:20:52 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:55 2014 no vote from 10.36.10.7
Sun Jan 12 23:20:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:20:55 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:20:56 2014 recovery running in state 0
Sun Jan 12 23:20:56 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:00 2014 recovery running in state 0
Sun Jan 12 23:21:00 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:04 2014 recovery running in state 0
Sun Jan 12 23:21:04 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 GetVolumeByID 536870913 (2) 10.38.10.83 noauth
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 GetVolumeByID 536870913 (2) 10.38.10.83 noauth
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:07 2014 allbetter checking
Sun Jan 12 23:21:07 2014 allbetter: returning 1
Sun Jan 12 23:21:08 2014 recovery running in state 0
Sun Jan 12 23:21:08 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:10 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:10 2014 no vote from 10.36.10.7
Sun Jan 12 23:21:10 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:10 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:21:12 2014 recovery running in state 0
Sun Jan 12 23:21:25 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:25 2014 no vote from 10.36.10.7
Sun Jan 12 23:21:25 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:25 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:21:29 2014 allbetter checking
Sun Jan 12 23:21:29 2014 allbetter: returning 1
Sun Jan 12 23:21:29 2014 GetVolumeByName <snipped>
Sun Jan 12 23:21:29 2014 allbetter checking
Sun Jan 12 23:21:29 2014 allbetter: returning 1
Sun Jan 12 23:21:29 2014 allbetter checking
Sun Jan 12 23:21:29 2014 allbetter: returning 1
Sun Jan 12 23:21:29 2014 allbetter checking
Sun Jan 12 23:21:29 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 GetVolumeByName <snipped>
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 GetVolumeByName <snipped>
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 GetVolumeByName <snipped>
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:30 2014 allbetter checking
Sun Jan 12 23:21:30 2014 allbetter: returning 1
Sun Jan 12 23:21:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:40 2014 no vote from 10.36.10.7
Sun Jan 12 23:21:40 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:40 2014 Received beacon type 0 from host 10.38.10.7
Sun Jan 12 23:21:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:55 2014 no vote from 10.36.10.7
Sun Jan 12 23:21:55 2014 beacon: amSyncSite is 0
Sun Jan 12 23:21:55 2014 Received beacon type 0 from host 10.38.10.7
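
Since the log is very repetitive, a small tally per peer may be easier to read than the raw lines. The following Python sketch counts the "no vote from", "vote 'yes' for" and "still down" messages per address. The sample lines are copied from the log above; the function name summarize and the patterns are assumptions based only on the message wording seen in this log.

```python
import re
from collections import Counter

# A few lines copied from the vlserver log above.
LOG_SAMPLE = [
    "Sun Jan 12 23:20:40 2014 ubik:server 10.33.10.43 still down",
    "Sun Jan 12 23:20:40 2014 beacon: amSyncSite is 0",
    "Sun Jan 12 23:20:40 2014 no vote from 10.36.10.7",
    "Sun Jan 12 23:20:40 2014 Received beacon type 0 from host 10.38.10.7",
    "Sun Jan 12 23:20:40 2014 Ubik: vote 'yes' for 10.38.10.7 (NOT in quorum)",
    "Sun Jan 12 23:20:55 2014 no vote from 10.36.10.7",
]

# Patterns derived from the wording in this log, not from any documented format.
PATTERNS = {
    "no_vote": re.compile(r"no vote from ([\d.]+)"),
    "yes_vote": re.compile(r"vote 'yes' for ([\d.]+)"),
    "still_down": re.compile(r"server ([\d.]+) still down"),
}


def summarize(lines):
    """Count each message kind per peer address."""
    counts = {kind: Counter() for kind in PATTERNS}
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[kind][match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for kind, per_host in summarize(LOG_SAMPLE).items():
        for host, count in per_host.items():
            print(f"{kind:10s} {host:15s} {count}")
```

Fed the whole log, this should make the pattern obvious: B (10.36.10.7) answers every beacon but never votes yes, A (10.33.10.43) is considered down, and the only yes vote C records is its own.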

Any help is appreciated! I have already upgraded and rebooted all three
servers, lowest to highest, to no avail. I also attempted moving the data
files out of the way and starting the affected vlserver, but the same
symptoms remain.

-- 
Timothy Balcer / IT Services
Telmate / San Francisco, CA
Direct / (415) 300-4313
Customer Service / (800) 205-5510
