On 3 May 2016, at 10:03, Nick Hilliard wrote:

> The interface names look odd: freebsd doesn't use eth*.
> Did you set up aliases for these

Yes, I did. Setting up aliases was much easier than modifying my
test scripts on the tester (I use the same scripts for Linux and FreeBSD).

I rename them in rc.local:
  ifconfig em0 name eth0
  ifconfig em1 name eth1
  ifconfig em2 name eth2
  ifconfig em3 name eth3
  ifconfig em4 name eth4

> and if so, what underlying virtual hardware are you
> using?

I see the issue on KVM (QEMU 1.5.3 on CentOS 7) and I see the same
issue on Parallels (on my MacBook). I haven’t tested it on any
other platform (yet).
I don’t test on physical hardware - but if this is the best guess,
I can set up something and run it against real hardware.

> Freebsd is known to have some race conditions relating to
> virtual interface deletion, but I'm not aware of any relating to
> interface creation.

There is no interface creation/deletion involved here. This is
in the middle of “normal” interface use.

However, there is a “no shutdown” just a bit earlier. I’ve experimented
with adding delays after bringing the interfaces up, and it currently
seems that a 30 sec wait after bringing the interface up solves the
issue in “most” or all cases.
(20 sec was not enough; I have tried approx 40 times with a 30s delay
so far.)
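
The delay experiment described above could be automated along these
lines. This is only a minimal sketch: run_once is a hypothetical
callable (not part of Quagga or of the test scripts mentioned here)
that would bring the interfaces up, wait the given number of seconds,
run the ISIS test and return True if the route reached the kernel
table.

```python
from typing import Callable, Dict, Iterable


def failure_counts(run_once: Callable[[float], bool],
                   delays: Iterable[float],
                   runs: int) -> Dict[float, int]:
    """Run the test `runs` times for each delay and count the failures.

    run_once(delay) is assumed (hypothetical helper) to return True when
    the test passed with that delay after "no shutdown", else False.
    """
    results: Dict[float, int] = {}
    for delay in delays:
        failures = 0
        for _ in range(runs):
            if not run_once(delay):
                failures += 1
        results[delay] = failures
    return results
```

Called as failure_counts(run_once, [20, 30], 40), it returns one
failure count per delay, which makes it easier to compare delays over
many runs than watching individual test results.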

Kind of scares me to think that on FreeBSD, I can bring up an
interface (“no shutdown”), form a neighbor with ISIS or OSPF
and send the route to the kernel _BEFORE_ the kernel is ready
to accept it - and the kernel will drop it in this case.

So thanks… I have a workaround and a basic idea of what the issue is,
but I think (if possible) a proper fix would be good. No idea
how this can be fixed… - or if it should be declared a kernel bug.
(Currently tested FreeBSD 10.2 and 10.3 - both have the same issue.)

It’s also worthwhile to mention that I see _A_LOT_ of these
RTM_MISS messages in many of my tests (never really looked for
them before) and most of them seem to pass anyway. Maybe some
of them drop non-essential messages or some of these RTM_MISS
are bogus?
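
To get a feel for how often these messages show up, and whether they
follow a long silence as in the failed case quoted below, something
like the following could be run over a zebra log. It is a rough sketch
that assumes the log line format shown in this thread; the function
names are made up for illustration, and the RTA_* bit names are the
ones from FreeBSD's <net/route.h>.

```python
import re
from datetime import datetime

# Matches lines such as "2016/04/13 01:28:23 ZEBRA: Kernel: DONE".
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ZEBRA: (.*)$")
ADDRS_RE = re.compile(r"rtm_addrs (0x[0-9a-fA-F]+)")

# RTA_* address-presence bits as defined in FreeBSD's <net/route.h>.
RTA_BITS = [(0x1, "DST"), (0x2, "GATEWAY"), (0x4, "NETMASK"),
            (0x8, "GENMASK"), (0x10, "IFP"), (0x20, "IFA"),
            (0x40, "AUTHOR"), (0x80, "BRD")]


def decode_addrs(mask):
    """Return the names of the RTA_* bits set in an rtm_addrs value."""
    return [name for bit, name in RTA_BITS if mask & bit]


def find_rtm_miss(lines):
    """Return one dict per RTM_MISS entry found in the log lines.

    "gap" is the number of seconds since the previous log line (None if
    there was none); "addrs" is the decoded rtm_addrs, if it was logged.
    """
    events = []
    prev_ts = None
    current = None  # the RTM_MISS entry still waiting for its rtm_addrs
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # wrapped continuation line without a timestamp
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        msg = m.group(2)
        if "Type: RTM_MISS" in msg:
            gap = (ts - prev_ts).total_seconds() if prev_ts is not None else None
            current = {"time": ts, "gap": gap, "addrs": None}
            events.append(current)
        elif "Type: RTM_" in msg:
            current = None  # a different routing message starts here
        elif current is not None:
            a = ADDRS_RE.search(msg)
            if a:
                current["addrs"] = decode_addrs(int(a.group(1), 16))
                current = None
        prev_ts = ts
    return events
```

Feeding it the lines of a zebra log (without the mail quote prefixes)
gives a list that can be counted or filtered by gap, which might help
to tell the harmless RTM_MISS entries apart from the ones that
coincide with a missing route.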

Thanks for the help. Getting on the right track here…

- Martin


> Martin Winter wrote:
>> I haven’t seen anyone answering this, so please excuse me for bringing
>> this up again…
>>
>> Any help from someone familiar with FreeBSD is highly appreciated.
>>
>> This issue breaks Quagga in many test cases (actually making it
>> unpredictable)
>> Whenever this “RTM_MISS” happens, the kernel routing table misses the
>> updates
>> and ends up in an inconsistent shape.
>> A really bad and serious issue.
>>
>> - Martin
>>
>> On 14 Apr 2016, at 16:06, Martin Winter wrote:
>>
>>> Some issue which is a mystery where to even begin troubleshooting:
>>>
>>> On FreeBSD 10.2 (Not testing other FreeBSD), I have at least one ISIS
>>> IPv4 test
>>> sometimes failing by not updating the kernel table.
>>>
>>> ISIS log looks fine in all cases.
>>>
>>> When I look at Zebra log, I see the following in the failed case:
>>>
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@> enable
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# config ter
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config)# interface eth2
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config-if)# no shutdown
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config-if)# end
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@> enable
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# config ter
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config)# interface eth3
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config-if)# no shutdown
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@(config-if)# end
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: Len: 168 Type: RTM_MISS
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: DONE
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: message seq 0
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: pid 0, rtm_addrs 0x1
>>> 2016/04/13 01:28:23 ZEBRA: Unprocessed RTM_type: 7
>>> 2016/04/13 01:28:26 ZEBRA: vty[??]@> enable
>>> 2016/04/13 01:28:26 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:26 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:29:02 ZEBRA: vty[??]@> enable
>>>
>>> While a good case looks like this:
>>>
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@> enable
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# config ter
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config)# interface eth2
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config-if)# no shutdown
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config-if)# end
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@> enable
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# config ter
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config)# interface eth3
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config-if)# no shutdown
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@(config-if)# end
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# end
>>> 2016/04/13 01:28:03 ZEBRA: vty[??]@# exit
>>> 2016/04/13 01:28:08 ZEBRA: zebra message comes from socket [13]
>>> 2016/04/13 01:28:08 ZEBRA: zebra message received
>>> [ZEBRA_IPV4_ROUTE_ADD] 20 in VRF 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_link: 172.16.0.0/28 vrf 0: rn
>>> 0x8020ffbf0, rib 0x8020ffb00
>>> 2016/04/13 01:28:08 ZEBRA: rib_link: 172.16.0.0/28 vrf 0: adding dest
>>> to table
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: called rib_addnode
>>> (0x8020ffbf0, 0x8020ffb00) on new RIB entry
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: dumping RIB entry
>>> 0x8020ffb00 for 172.16.0.0/28 vrf 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: refcnt == 0, uptime
>>> == 1460536088, type == 8, table == 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: metric == 11,
>>> distance == 115, flags == 0, status == 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: nexthop_num == 1,
>>> nexthop_active_num == 0, nexthop_fib_num == 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: NH 192.168.1.1 with
>>> flags
>>> 2016/04/13 01:28:08 ZEBRA: rib_add_ipv4_multipath: dump complete
>>> 2016/04/13 01:28:08 ZEBRA: kernel_rtm_ipv4: 172.16.0.0/28:
>>> successfully did NH 192.168.1.1
>>> 2016/04/13 01:28:08 ZEBRA: Kernel: Len: 200 Type: RTM_ADD
>>> 2016/04/13 01:28:08 ZEBRA: Kernel: UP GATEWAY DONE PROTO1
>>> 2016/04/13 01:28:08 ZEBRA: Kernel: message seq 0
>>> 2016/04/13 01:28:08 ZEBRA: Kernel: pid 75135, rtm_addrs 0x7
>>> 2016/04/13 01:28:08 ZEBRA: rtm_read: got rtm of type 1 (RTM_ADD)
>>> 2016/04/13 01:28:08 ZEBRA: rtm_read: RTM_ADD 172.16.0.0/28: done Ok
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: rn 0x8020ffbf0, rib
>>> 0x8020ffb00: NOT removed, selected
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: dumping RIB entry
>>> 0x8020ffb00 for 172.16.0.0/28 vrf 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: refcnt == 0, uptime ==
>>> 1460536088, type == 8, table == 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: metric == 11, distance
>>> == 115, flags == 16, status == 4
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: nexthop_num == 1,
>>> nexthop_active_num == 1, nexthop_fib_num == 0
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: NH 192.168.1.1 with
>>> flags ACTIVE FIB
>>> 2016/04/13 01:28:08 ZEBRA: rib_lookup_and_dump: dump complete
>>> […]
>>>
>>> Notice the following lines:
>>>
>>> 2016/04/13 01:28:06 ZEBRA: vty[??]@# exit
>>>     [17 sec of no messages here…]
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: Len: 168 Type: RTM_MISS
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: DONE
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: message seq 0
>>> 2016/04/13 01:28:23 ZEBRA: Kernel: pid 0, rtm_addrs 0x1
>>> 2016/04/13 01:28:23 ZEBRA: Unprocessed RTM_type: 7
>>>
>>> It seems somehow zebra drops or misses several messages from isis
>>> (including the route which it is supposed to install)
>>>
>>> Outcome is random. I can run it multiple times and sometimes it may
>>> work, other times it fails.
>>>
>>> Anyone having any guess what to look for or how to troubleshoot this?
>>> My FreeBSD knowhow is close to zero…  (Linux has no issues at all)
>>>
>>> Best guess so far is that this is an issue with my KVM Hypervisor (the
>>> tests run in VMs) - might be hypervisor overloaded?
>>> But surprised why this would only affect FreeBSD and not Linux.
>>>
>>> - Martin
>>>
>>>
>>>
>>> _______________________________________________
>>> Quagga-dev mailing list
>>> [email protected]
>>> https://lists.quagga.net/mailman/listinfo/quagga-dev

