Hi,
I'm having intermittent problems with OSPF running on OpenBSD 4.2. I
have two firewalls in an OSPF area talking to a number of Juniper
routers. Both OpenBSD boxes are VMs. (This is a test setup, hence the
use of VMs.) Occasionally, and fairly unpredictably, I get the following
behaviour:
Original neighbours (as seen from firewall1):
[EMAIL PROTECTED]:/root>ospfctl show neigh
ID              Pri State        DeadTime Address         Iface     Uptime
10.0.0.159      1   2-WAY/OTHER  00:00:01 10.0.0.214      pcn2      -
10.0.0.249      128 FULL/DR      00:00:01 10.0.0.210      pcn2      00:03:04
10.0.0.248      128 FULL/BCKUP   00:00:01 10.0.0.209      pcn2      00:03:02
10.0.0.157      128 FULL/DR      00:00:01 10.0.0.42       pcn0      00:03:07
10.0.0.156      128 FULL/BCKUP   00:00:01 10.0.0.41       pcn0      00:03:07
10.0.0.159      1   2-WAY/OTHER  00:00:01 10.0.0.44       pcn0      -
This unexpectedly changes to the following (with no link state changes
or anything else physical to trigger it, as far as I can see):
[EMAIL PROTECTED]:/root>ospfctl show neigh
ID              Pri State        DeadTime Address         Iface     Uptime
10.0.0.159      1   EXSTA/OTHER  00:00:01 10.0.0.214      pcn2      -
10.0.0.249      128 FULL/OTHER   00:00:01 10.0.0.210      pcn2      00:00:32
10.0.0.248      128 FULL/OTHER   00:00:01 10.0.0.209      pcn2      00:00:32
10.0.0.157      128 FULL/OTHER   00:00:01 10.0.0.42       pcn0      00:00:32
10.0.0.156      128 FULL/OTHER   00:00:01 10.0.0.41       pcn0      00:00:32
10.0.0.159      1   EXSTA/OTHER  00:00:01 10.0.0.44       pcn0      -
After that, 10.0.0.159 (firewall2, the other OpenBSD box) stays stuck
in EXSTA state forever. Firewall2 itself remains oblivious to the
change, and its neighbours look like this:
[EMAIL PROTECTED]:/root>ospfctl show neigh
ID              Pri State        DeadTime Address         Iface     Uptime
10.0.0.158      1   2-WAY/OTHER  00:00:01 10.0.0.213      pcn2      -
10.0.0.249      128 FULL/DR      00:00:01 10.0.0.210      pcn2      01:42:57
10.0.0.248      128 FULL/BCKUP   00:00:01 10.0.0.209      pcn2      01:12:58
10.0.0.158      1   2-WAY/OTHER  00:00:01 10.0.0.43       pcn0      -
10.0.0.156      128 FULL/BCKUP   00:00:01 10.0.0.41       pcn0      00:43:01
10.0.0.157      128 FULL/DR      00:00:01 10.0.0.42       pcn0      00:43:01
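For anyone who wants to watch for this condition automatically, the neighbour listings above can be checked with a short script. This is only a minimal sketch in Python: the function names are made up for illustration and are not part of ospfctl. It assumes the seven-column layout shown in the listings, and the sample text is copied from the firewall1 listing taken after the change.

```python
import re

# States a neighbour can legitimately sit in on a broadcast network.
# Anything else (EXSTA in the listing above) should only be transient.
STABLE_STATES = {"FULL", "2-WAY"}

ROUTER_ID_RE = re.compile(r"\d+(\.\d+){3}")


def parse_neighbors(text):
    """Parse `ospfctl show neigh` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 7 columns and start with a dotted-quad router ID;
        # this skips the shell prompt and the header line.
        if len(fields) != 7 or not ROUTER_ID_RE.fullmatch(fields[0]):
            continue
        state, _, role = fields[2].partition("/")
        rows.append({
            "id": fields[0],
            "pri": int(fields[1]),
            "state": state,
            "role": role,
            "address": fields[4],
            "iface": fields[5],
            "uptime": fields[6],
        })
    return rows


def stuck_neighbors(rows):
    """Return the neighbours that are not in a stable state."""
    return [r for r in rows if r["state"] not in STABLE_STATES]


SAMPLE = """\
ID Pri State DeadTime Address Iface Uptime
10.0.0.159 1 EXSTA/OTHER 00:00:01 10.0.0.214 pcn2 -
10.0.0.249 128 FULL/OTHER 00:00:01 10.0.0.210 pcn2 00:00:32
10.0.0.248 128 FULL/OTHER 00:00:01 10.0.0.209 pcn2 00:00:32
10.0.0.157 128 FULL/OTHER 00:00:01 10.0.0.42 pcn0 00:00:32
10.0.0.156 128 FULL/OTHER 00:00:01 10.0.0.41 pcn0 00:00:32
10.0.0.159 1 EXSTA/OTHER 00:00:01 10.0.0.44 pcn0 -
"""

for nbr in stuck_neighbors(parse_neighbors(SAMPLE)):
    print(f"{nbr['id']} on {nbr['iface']}: {nbr['state']}")
```

Run against the sample, this flags both 10.0.0.159 entries (pcn2 and pcn0). A neighbour caught mid-exchange would be flagged too, so a real check would need to see the same entry in two consecutive samples before raising an alarm.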
This particular change happened at around 9:30. The messages on
firewall1 and firewall2 from around that time are:
Aug 8 05:25:29 firewall1 ospfd[7469]: recv_db_description: seq num mismatch, bad flags
Aug 8 06:25:25 firewall1 ospfd[7469]: recv_db_description: seq num mismatch, bad flags
Aug 8 06:25:29 firewall1 last message repeated 2 times
Aug 8 06:25:30 firewall1 ospfd[7469]: nbr_fsm: neighbor ID 10.0.0.156, event LOADING_DONE not expected in state EXCHG
Aug 8 06:30:30 firewall2 ospfd[20863]: nbr_adj_timer: failed to form adjacency with 10.0.0.158
Aug 8 07:55:27 firewall1 ospfd[7469]: nbr_fsm: neighbor ID 10.0.0.157, event LOADING_DONE not expected in state EXCHG
Aug 8 08:15:30 firewall2 ospfd[20863]: recv_db_description: seq num mismatch, bad flags
Aug 8 09:26:25 firewall1 ospfd[7469]: nbr_adj_timer: failed to form adjacency with 10.0.0.159
Aug 8 09:29:49 firewall1 ospfd[7469]: nbr_adj_timer: failed to form adjacency with 10.0.0.159
Aug 8 09:31:49 firewall1 last message repeated 5 times
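To make it easier to see which box logs which kind of message, the excerpt above can be tallied per host and per ospfd function. The following is a minimal Python sketch, not an existing tool: the helper names are hypothetical, and it assumes the syslog layout shown above, including entries that were wrapped onto a second line and syslog's "last message repeated N times" lines.

```python
import re
from collections import Counter

# "Aug 8 05:25:29 firewall1 <rest of message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(.*)$")
REPEAT_RE = re.compile(r"last message repeated (\d+) times")
# "ospfd[7469]: recv_db_description: ..." -> capture the function name
OSPFD_RE = re.compile(r"ospfd\[\d+\]:\s+(\w+):")


def join_wrapped(lines):
    """Re-join entries that were wrapped onto a continuation line."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if LINE_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def count_events(lines):
    """Count ospfd messages per (host, function), expanding repeats."""
    counts = Counter()
    last_by_host = {}  # syslog repeats refer to the previous line per host
    for entry in join_wrapped(lines):
        m = LINE_RE.match(entry)
        if not m:
            continue
        host, msg = m.group(2), m.group(3)
        rep = REPEAT_RE.search(msg)
        if rep:
            if host in last_by_host:
                counts[last_by_host[host]] += int(rep.group(1))
            continue
        fn = OSPFD_RE.match(msg)
        if fn:
            key = (host, fn.group(1))
            last_by_host[host] = key
            counts[key] += 1
    return counts


SAMPLE_LOG = """\
Aug 8 05:25:29 firewall1 ospfd[7469]: recv_db_description: seq num
mismatch, bad flags
Aug 8 06:25:25 firewall1 ospfd[7469]: recv_db_description: seq num
mismatch, bad flags
Aug 8 06:25:29 firewall1 last message repeated 2 times
Aug 8 06:25:30 firewall1 ospfd[7469]: nbr_fsm: neighbor ID 10.0.0.156,
event LOADING_DONE not expected in state EXCHG
Aug 8 06:30:30 firewall2 ospfd[20863]: nbr_adj_timer: failed to form
adjacency with 10.0.0.158
Aug 8 07:55:27 firewall1 ospfd[7469]: nbr_fsm: neighbor ID 10.0.0.157,
event LOADING_DONE not expected in state EXCHG
Aug 8 08:15:30 firewall2 ospfd[20863]: recv_db_description: seq num
mismatch, bad flags
Aug 8 09:26:25 firewall1 ospfd[7469]: nbr_adj_timer: failed to form
adjacency with 10.0.0.159
Aug 8 09:29:49 firewall1 ospfd[7469]: nbr_adj_timer: failed to form
adjacency with 10.0.0.159
Aug 8 09:31:49 firewall1 last message repeated 5 times
"""

for (host, function), n in sorted(count_events(SAMPLE_LOG.splitlines()).items()):
    print(f"{host:10} {function:22} {n}")
```

Counting the repeats makes it visible that most of the nbr_adj_timer failures are logged by firewall1 about 10.0.0.159, while firewall2 logs only one about 10.0.0.158.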
Finally, my OSPF config:
=======================
Firewall1:
router-id 10.0.0.158
hello-interval 1
metric 10
retransmit-interval 5
router-dead-time 2
# areas
area 0.0.0.1 {
    interface pcn0 {
    }
    interface lo1 {
        passive
    }
    interface pcn2 {
    }
}
And similarly for firewall2:
router-id 10.0.0.159
hello-interval 1
metric 100
retransmit-interval 5
router-dead-time 2
# areas
area 0.0.0.1 {
    interface pcn0 {
    }
    interface lo1 {
        passive
    }
    interface pcn2 {
    }
}
=======================
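One thing that can be read directly off the configs above is how little slack the timers leave: with hello-interval 1 and router-dead-time 2, only two hellos are expected per dead interval, compared with four for ospfd's defaults of 10 and 40. Whether that has anything to do with the behaviour described here is not established. The Python sketch below only illustrates the arithmetic; the function names and the hello arrival times are invented for the example, not measured.

```python
def hellos_per_dead_interval(hello_interval, router_dead_time):
    """How many hellos are expected within one dead interval."""
    return router_dead_time / hello_interval


def longest_gap(arrival_times):
    """Longest gap in seconds between consecutive hello arrivals."""
    return max(b - a for a, b in zip(arrival_times, arrival_times[1:]))


def neighbor_survives(arrival_times, router_dead_time):
    """True if no gap between hellos reaches the dead interval."""
    return longest_gap(arrival_times) < router_dead_time


# Timers from the configs above versus the ospfd defaults.
print(hellos_per_dead_interval(1, 2))
print(hellos_per_dead_interval(10, 40))

# Hypothetical arrival times (seconds): hellos sent every second, with
# one stall of 2.5 s, for example a guest that was not scheduled.
arrivals = [0.0, 1.0, 2.0, 4.5, 5.5]
print(neighbor_survives(arrivals, router_dead_time=2))
print(neighbor_survives(arrivals, router_dead_time=40))
```

With the made-up arrival times, a single 2.5-second stall is enough to expire a 2-second dead timer, whereas a 40-second dead timer would not notice it.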
I've had some real OpenBSD boxes running OSPF for a while now and have
never seen anything like this, so my suspicion is that it is either
something to do with running in a VM or some kind of problem with the
interaction between Juniper's OSPF implementation and OpenBSD's.
Alternatively, I could have something wrong in my config.
Has anyone else come across anything like this before, or have any
ideas what could be causing it? Any advice?
Thanks in advance!
Cliff.