On Wed, Oct 3, 2018 at 12:20 PM Dave Taht <[email protected]> wrote:
>
> OK. Normally throughout the rest of my original network of 3+ years
> back, I routed everything.
>
> The new install is ethernet bridged to the wifi. Without specifying
> the wired parameter for the interface, *eventually* - not on the
> initial route exchange - the txcost goes up to 256, which changes the
> ref_metric.
>
> As elsewhere on the lan, things are routed, this sucks the default
> route out to my old (routed) 1.7.1 instance.
>
> in /etc/config/babeld
>
> interface br-lan
>     option wired true
>
> fixes it for the new bridged boxes.
>
> add neighbour 2390e30 address fe80::f6f2:6dff:feb6:a01c if eno1 reach
> ffff ureach 0000 rxcost 96 txcost 96 rtt 2.419 rttcost 0 cost 96
>
> add route 23922f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> f6:f2:6d:ff:fe:b6:a0:1d metric 96 refmetric 0 via
> fe80::f6f2:6dff:feb6:a01c if eno1
> add route 23930d0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::46d9:e7ff:fe93:822e if eno1
> add route 2393140 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::230:18ff:fec9:de9c if eno1
>
> The sad thing is that I'd encountered this problem before, and normal users
> who
> will probably default bridge their default gw ethernet and wifi
> together won't see it either. Just me, with my ancient 6+ year old
> network.
>
> Now, I'm still puzzled as to why the ref_metric got wau bigger than
> 256 on 1.8.3. I'm a bit puzzled about the various treatments of
> refmetric in the code as to a short > INFINITY and so on and what I'm
> currently running is 1.7.1, bird, and a version with 1.8.3 with
> refmetric universally promoted to an unsigned int. And it's lunchtime.
> 2 days later.
>
> And I don't know why the 1.7.1 couch gateway is falling off entirely,
> here present,
>
> d@dancer:~/git/babeld-refmetric$ echo dump | nc ::1 33123 | grep
> 'prefix 0.0.0.0'
> add route 23922f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> f6:f2:6d:ff:fe:b6:a0:1d metric 96 refmetric 0 via
> fe80::f6f2:6dff:feb6:a01c if eno1
> add route 23930d0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::46d9:e7ff:fe93:822e if eno1
> add route 2393140 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::230:18ff:fec9:de9c if eno1
> add route 239d960 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> a2:21:b7:ff:fe:ac:e4:55 metric 416 refmetric 320 via
> fe80::e091:f5ff:febe:a353 if eno1
> d@dancer:~/git/babeld-refmetric$
>
> here not:
>
> d@dancer:~/git/babeld-refmetric$ echo dump | nc ::1 33123 | grep
> 'prefix 0.0.0.0'
> add route 23922f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> f6:f2:6d:ff:fe:b6:a0:1d metric 96 refmetric 0 via
> fe80::f6f2:6dff:feb6:a01c if eno1
> add route 23930d0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::46d9:e7ff:fe93:822e if eno1
> add route 2393140 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
> fe80::230:18ff:fec9:de9c if eno1
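
The two dumps above differ by exactly one route, which is easy to miss by eye. A rough sketch of how they could be compared mechanically follows. It is not part of babeld; the function names are made up for illustration, and the regular expression assumes the "add route" field order is exactly what the dumps in this mail show.

```python
import re

# Field order is assumed from the "add route" lines quoted in this mail.
ROUTE_RE = re.compile(
    r"add route (?P<handle>\S+) prefix (?P<prefix>\S+) from (?P<src>\S+) "
    r"installed (?P<installed>yes|no) id (?P<router_id>\S+) "
    r"metric (?P<metric>\d+) refmetric (?P<refmetric>\d+) "
    r"via (?P<via>\S+) if (?P<ifname>\S+)"
)


def parse_routes(dump_text):
    """Map (router_id, via) -> (metric, refmetric, installed)."""
    # The mail client wrapped the long lines, so collapse whitespace first.
    flat = " ".join(dump_text.split())
    routes = {}
    for m in ROUTE_RE.finditer(flat):
        key = (m["router_id"], m["via"])
        routes[key] = (int(m["metric"]), int(m["refmetric"]),
                       m["installed"] == "yes")
    return routes


def lost_routes(before, after):
    """Routes present in the first dump but missing from the second."""
    remaining = parse_routes(after)
    return {k: v for k, v in parse_routes(before).items()
            if k not in remaining}


# First dump from this mail, still wrapped as it was pasted.
BEFORE = """
add route 23922f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
f6:f2:6d:ff:fe:b6:a0:1d metric 96 refmetric 0 via
fe80::f6f2:6dff:feb6:a01c if eno1
add route 23930d0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
fe80::46d9:e7ff:fe93:822e if eno1
add route 2393140 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
f6:f2:6d:ff:fe:b6:a0:1d metric 192 refmetric 96 via
fe80::230:18ff:fec9:de9c if eno1
add route 239d960 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
a2:21:b7:ff:fe:ac:e4:55 metric 416 refmetric 320 via
fe80::e091:f5ff:febe:a353 if eno1
"""

# The second dump is the first one minus its last route (three text lines).
AFTER = "\n".join(BEFORE.strip().splitlines()[:9])

if __name__ == "__main__":
    for (router_id, via), (metric, refmetric, _) in lost_routes(BEFORE, AFTER).items():
        print(f"lost: id {router_id} via {via} metric {metric} refmetric {refmetric}")
```

Run against the two dumps, the only entry reported as lost is the one with id a2:21:b7:ff:fe:ac:e4:55, the route that is present in the first dump and gone in the second.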

OK, I think I figured this one out.

I think Juliusz put in an optimization to only consider 3 routes
(somewhere in the code?). I had 3+ default routes available, since I
have 3+ babel speakers, and the routers were re-announcing the local
default route with a better metric than the backup default route.

This meant that when I took the labgw down, it retracted its routes,
then the other stable speakers heard that and retracted their routes,
and then they had to wait for a reannouncement of the default route
from the couch gw in order to find another default route to use, so
I'd have more than a few seconds of connectivity interrupted.

(Similarly, with the metric inflated by not having txcost right, I had
the opposite problem when I took the couch box down.)

Assuming this theory is correct... ??

In the case of defaultish routes I think retaining all of them is
probably a good idea. Still... other routes tend to be pretty
specific, so multiple speakers would encounter this problem less. I
don't mind chewing up more compute to keep more routes in memory
either, I guess, in the general case.

Or perhaps there's some sort of fuzzy characteristic like "Hey, I got
a lot of routes with good metrics from this speaker", and "a couple
lousy ones, let's still keep those around because they probably point
somewhere usefully different".

>
> On Wed, Oct 3, 2018 at 7:06 AM Dave Taht <[email protected]> wrote:
> >
> > So I reverted those two boxes (couch and labgw) to 1.7.1 (they are arm
> > and mips respectively). No matter in which order I restart the
> > daemons, the refmetric is set properly to 0, and the metric falls to
> > 256 after a couple hellos, making the lab always be the correct
> > default gw.
> >
> > so I am assuming there is a change (bug?) in a calculation on the
> > refmetric since 1.7.1. The only puzzling thing
> > about this set of events is that *same exact babel configuration for
> > 1.8.3 and 1.7.1* I get a metric 96 refmetric 0
> > if I only have one major speaker (the labgw) and the weird refmetric
> > inflation I noted on the prior email,
> >
> > ... and a consistent metric 256 refmetric 0... with two 1.7.1 speakers
> > (watching with an 1.8.3 speaker, going to
> > kill that next)
> >
> > echo dump | nc ::1 33123 | grep 'prefix 0.0.0.0'
> > add route 93d990 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 256 refmetric 0 via
> > fe80::f6f2:6dff:feb6:a01c if eno1
> > add route 937f60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 352 refmetric 256 via
> > fe80::230:18ff:fec9:de9c if eno1
> > add route 938e60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 352 refmetric 256 via
> > fe80::46d9:e7ff:fe93:822e if eno1
> > add route 941a70 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > a2:21:b7:ff:fe:ac:e4:55 metric 320 refmetric 224 via
> > fe80::e091:f5ff:febe
> >
> > I kill the couche babel
> >
> > d@dancer:~/git$ echo dump | nc ::1 33123 | grep 'prefix 0.0.0.0'
> > add route 93d1f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 256 refmetric 0 via
> > fe80::f6f2:6dff:feb6:a01c if eno1
> > add route 937f60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 352 refmetric 256 via
> > fe80::230:18ff:fec9:de9c if eno1
> > add route 938e60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 352 refmetric 256 via
> > fe80::46d9:e7ff:fe93:822e if eno1
> >
> > Restart the labgw - this result totally makes sense to me, for a couple
> > hellos,
> >
> > d@dancer:~/git$ echo dump | nc ::1 33123 | grep 'prefix 0.0.0.0'
> > add route 93d1f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 511 refmetric 0 via
> > fe80::f6f2:6dff:feb6:a01c if eno1
> > add route 937f60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 703 refmetric 607 via
> > fe80::230:18ff:fec9:de9c if eno1
> > add route 938e60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 607 refmetric 511 via
> > fe80::46d9:e7ff:fe93:822e if eno1
> >
> > Which then evolves back to this:
> >
> > d@dancer:~/git$ echo dump | nc ::1 33123 | grep 'prefix 0.0.0.0'
> > add route 93d1f0 prefix 0.0.0.0/0 from 0.0.0.0/0 installed yes id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 256 refmetric 0 via
> > fe80::f6f2:6dff:feb6:a01c if eno1
> > add route 937f60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 369 refmetric 273 via
> > fe80::230:18ff:fec9:de9c if eno1
> > add route 938e60 prefix 0.0.0.0/0 from 0.0.0.0/0 installed no id
> > f6:f2:6d:ff:fe:b6:a0:1d metric 352 refmetric 256 via
> > fe80::46d9:e7ff:fe93:822e if eno1

--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
