Hi,

Not sure what the etiquette for this list is, so apologies if this is not 
appropriate as it's not a confirmed bug...

I have a whole bunch of subnets which are static routed to an HSRP address, 
provided by a pair of Cisco routers, on a linknet VLAN. Actually, there are two 
VLANs, vlan209 and vlan409. In the case of v6, the HSRP IP is fe80::1, so I 
have routes to fe80::1%vlan209 and fe80::1%vlan409.
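
As a rough sketch of what routes of this shape look like with route(8): the 
2001:db8:: prefixes below are placeholder documentation addresses and not the 
real subnets; only the two scoped gateways match the setup described.

```sh
# Hypothetical prefixes (2001:db8::/32 is reserved for documentation).
# The %vlanNNN suffix scopes the link-local gateway to one interface.
route add -inet6 2001:db8:209::/48 fe80::1%vlan209
route add -inet6 2001:db8:409::/48 fe80::1%vlan409
```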

This has worked fine for many weeks. On Wednesday evening I upgraded to 7.1.

On Friday morning, I woke up to nearly 2,000 alerts, because some v6 
reachability had started flapping during the night.

It turns out that fe80::1%vlan409 had randomly become unreachable.

Every few minutes, it would become reachable again for 8 echo replies, then 
go unreachable again.

This is strange, because we use this same HSRP config / fe80::1 addresses for 
all of our VLANs and have done for years, without issue.

Throughout this, the other OpenBSD host (still on 7.0), can access that address 
with no problem.

Oddly, this host can still access fe80::1%vlan209 no problem.

What seems to happen is, a stale ND entry appears and 8 pings succeed...
the-gw1# ndp -a |grep vlan409 | grep fe80
fe80::1%vlan409                      00:05:73:a0:00:01 vlan409 23h57m56s S R
..

Then this happens:
the-gw1# ndp -a |grep vlan409 | grep fe80
ndp: ioctl(SIOCGNBRINFO_IN6): Invalid argument
ndp: failed to get neighbor information
ndp: ioctl(SIOCGNBRINFO_IN6): Invalid argument
ndp: failed to get neighbor information
ndp: ioctl(SIOCGNBRINFO_IN6): Invalid argument
ndp: failed to get neighbor information
ndp: ioctl(SIOCGNBRINFO_IN6): Invalid argument
ndp: failed to get neighbor information
fe80::1%vlan409                      (incomplete)      vlan409 1s        I  2
Check again, and the entry has disappeared.

A few mins later, the process repeats - 8 pings suddenly succeed and it 
disappears again.
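
A minimal sketch of how the timing of that cycle could be logged, assuming 
`ndp -an` keeps the six-column layout shown in the output above (neighbour, 
link-layer address, interface, expiry, state, flags). The names ndp_state and 
watch_neighbour are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: read ndp-style output on stdin and print the
# state column (5th field) for the given neighbour, or "absent" if
# no line for that neighbour is present.
ndp_state() {
    awk -v addr="$1" '
        $1 == addr { print $5; found = 1 }
        END        { if (!found) print "absent" }
    '
}

# Hypothetical helper: poll once a second and print a timestamped
# state. ndp error messages on stderr are discarded.
watch_neighbour() {
    while :; do
        state=$(ndp -an 2>/dev/null | ndp_state "$1")
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$state"
        sleep 1
    done
}
```

Running `watch_neighbour 'fe80::1%vlan409'` would then show when the entry 
moves between S, I and absent, which could be lined up against a tcpdump.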

As I say though, fe80::1%vlan209 continues to work fine, as does 
fe80::1%vlan409 from the other host.

fe80::1%vlan209                      00:05:73:a0:00:01 vlan209 10s       R R

Interestingly, on the Cisco which is the HSRP master, I did see a neighbour 
entry for fe80::1 on vlan409 with the MAC address of the-gw1, which implies 
that the-gw1 is somehow responding to ND requests for that IP... but I am not 
able to find those replies in a tcpdump.

As a workaround, I've added another HSRP address, fe80::2, on the Ciscos and 
changed the static routes on this box to use that. After a few hours, that's 
still reachable ok.

It might be total coincidence that this is after a 7.0 -> 7.1 upgrade, but I 
thought I'd report it and see if anyone else is seeing any similar issues.

Thanks,

Ian
