Re: [Babel-users] Route-dete :wq
On Fri, Mar 9, 2018 at 3:29 PM, Christof Schulze wrote:
>> I start running into trouble with 1000+ routes using 1Mbit mcast.
>> Sooner if I seriously
>> slam the network with flent or something else that abuses mcast like mdns.
>> YMMV.
>
> So what is the culprit here? What would it take to add an order of
> magnitude?

Most of the meshy networks run wifi mcast at 11Mbit or higher. Ath9k
devices support this, many others don't.

Experiment with the new unicast code, as that will transfer routes at
the underlying rate of the medium (up to 300Mbit on wireless-n).

Don't run odhcpd or network manager either. They tend to get in a
tussle with babeld in the kernel.

Simulate first, deploy second. Don't prematurely optimize. :)

I have a backlog of other optimizations for babel lying about, ranging
from trivial stuff like at least logging when you are low on compute
or overbuffered:

https://github.com/dtaht/rabeld/commit/b74b4a6f9b532717ee93346963efd894e94615b3

to something that tries to be more aware of compute bounds, to another
thing that pushes some work down into the kernel via bpf

>
>> As for aggregation and filtering: Most of my APs have at minimum,
>> ethernet, and two channels - usually four, including the meshy links.
>> The meshy links are ptp, so I've generally "wasted" an entire /22 ipv4
>> network to talk to the APs. ipv6 /62s.
>>
>> The lab component of my network, for example, has two main links to
>> the production net, and the gateways only announce the
>> subnet it is on (172.22.0.0/16). This cuts the churn seen outside the
>> network when I do crazy things like reboot the whole thing.
>>
>> The biggest problem I've run into, is that meshy links, are, meshy -
>> and I've lost track of the number of times where
>> I had a well defined /16 network in the lab suddenly leak all the
>> meshy /32 bits over the worst possible link - because I plugged
>> something in that was adhoc (and poorly) connected to the outside
>> network that I shouldn't have.
>> >> Lede creates one /48 ULA by default per AP, and then more /60s. I've >> had a tendency to try to share one /48, but more recently I was trying >> to go native ipv6 and disabled the ULA generation entirely. >> >> I don't bridge anything except sometimes on the last Aps on a link >> (which don't announce babel on that bridge). Bridging can do weird >> things to daemons that want also to be measuring the individual links. >> >> So in your network design I'd try to identify your backbone links and >> try upfront to rationally partition the network numbering scheme, >> and still, periodically try to optimize it. It makes no sense to >> export all the churn the last hop of a meshy, yet leaf network can >> have to the whole network. I'd simulate what you plan, and then slam >> it with traffic from every point with a tool like flent, and deploy >> cake or htb+fq_codel on the ISP up/downlinks. > > This being a Freifunk-Network, there is not going to be much planning of the > structure beyond mere basics. > Until now I just hoped we could get away with having 10K+ routes in one > network which would translate into 3k+ clients when considering many nodes > and 3 IPv6 addresses for each client which seems to be a reasonable amount > when taking into account a clat-address per client and IPv6 privacy > extensions. As toke noted, just distribute the /64. Source specific routing is cool, too, you should try distributing real ipv6 ranges from two or more gateways. > There are approaches to reduce the amount of routes per client including > using nat66 on each node. You certainly are making it sound like there > should be put some thought into reducing the amount of routes. This will be > the next step after we have more than just a few nodes / clients inside the > same network. > > BTW: the Freifunk networks use an autoupdater. This might just solve your > tree-climbing-problem in the long run... 
Across 6 versions of the OS in 6 years of deployment, and 5 different
generations of hardware, my automation problem is hopeless.

Of all the gear I've had to date the nanostation 5s, wndr3800s, and
picostations have been the best. I'm having good results thus far with
the ubnt UAP-AC-M-USes with the candeletech firmware - aside from not
having enough flash for a web interface.

>
> Cheers
> Christof
>
>>
>> I'm working these days, on making netem better emulate wifi's
>> behaviors. I'm not satisfied with it yet.
>>
>
>>> Note that babeld currently sends updates as a single burst when the update
>>> interval expires (the same is true of Toke's implementation of Babel, as
>>> far as I'm aware). For very large networks, it would be good to split
>>> updates into one-packet pieces that are sent throughout the update
>>> interval. I'd be glad to accept a patch that does that.

I'd rather like to keep the burst but measure how long it takes to
transmit.

> * making babel trigger updates on newly appeared routes
>
>
>>> I've gone
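
The quoted suggestion of splitting a burst into one-packet pieces spread over the update interval comes down to simple arithmetic. A minimal sketch in C, assuming even spacing; the function name is hypothetical and nothing like it is claimed to exist in babeld:

```c
#include <assert.h>

/* Hypothetical helper, not babeld code: offset in milliseconds from the
   start of the update interval at which packet i (0-based) of n would be
   sent if the n packets were spread evenly across the interval. */
static unsigned
sketch_send_offset_ms(unsigned i, unsigned n, unsigned interval_ms)
{
    assert(n > 0 && i < n);
    /* Widen before multiplying so large intervals cannot overflow. */
    return (unsigned)((unsigned long long)i * interval_ms / n);
}
```

With four packets and a 4000 ms interval this gives offsets of 0, 1000, 2000 and 3000 ms. Keeping the burst and timing it instead, as preferred above, would not need any of this.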
[Babel-users] [PATCH] Always specify linux route metric to defaults
I updated both an aarch64 and an x86_64 box to git head today. No issues.

...

Attached is part one of trying to get atomic updates to work, which is
a change to how metrics are passed into the kernel. I don't know if
this change is worth it by itself, nor do I understand why -1 was used
in the first place.

The original code would export -1 into the linux kernel routing table
for unreachable routes, in addition to setting RTN_UNREACHABLE.

RTN_UNREACHABLE is all you need (at least, on modern kernels), and
changing the metric to -1 makes it impossible to do atomic updates.

Also it has a theoretical flaw in that other similar routes with
smaller metrics, inserted by other protocols, can override the now
unreachable route. Using the defaults leads to more consistent
behavior.

This uses the defaults of 0 and 1024 for ipv4 and ipv6 route metrics
respectively.

From cea7e11152b0a7cebd3572c63ccc9454e6840277 Mon Sep 17 00:00:00 2001
From: Dave Taht <d...@taht.net>
Date: Thu, 8 Mar 2018 13:00:01 -0800
Subject: [PATCH] Always specify linux route metric to defaults

The original code would export -1 into the linux kernel routing table
for unreachable routes, in addition to setting RTN_UNREACHABLE.

RTN_UNREACHABLE is all you need (at least, on modern kernels), and
changing the metric to -1 makes it impossible to do atomic updates.

Also it has a theoretical flaw in that other similar routes with
smaller metrics, inserted by other protocols, can override the now
unreachable route. Using the defaults leads to more consistent
behavior.

This uses the defaults of 0 and 1024 for ipv4 and ipv6 route metrics
respectively.
---
 kernel_netlink.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel_netlink.c b/kernel_netlink.c
index 94f01b0..e521ed7 100644
--- a/kernel_netlink.c
+++ b/kernel_netlink.c
@@ -100,6 +100,9 @@ int num_old_if = 0;

 static int dgram_socket = -1;

+static const int ipv4_metric = 0;
+static const int ipv6_metric = 1024;
+
 #ifndef ARPHRD_ETHER
 #warning ARPHRD_ETHER not defined, we might not support exotic link layers
 #define ARPHRD_ETHER 1
@@ -1055,9 +1058,9 @@ kernel_route(int operation, int table,
     rta = RTA_NEXT(rta, len);
     rta->rta_len = RTA_LENGTH(sizeof(int));
     rta->rta_type = RTA_PRIORITY;
+    *(int*)RTA_DATA(rta) = ipv4 ? ipv4_metric : ipv6_metric;

     if(metric < KERNEL_INFINITY) {
-        *(int*)RTA_DATA(rta) = metric;
         rta = RTA_NEXT(rta, len);
         rta->rta_len = RTA_LENGTH(sizeof(int));
         rta->rta_type = RTA_OIF;
@@ -1074,8 +1077,6 @@ kernel_route(int operation, int table,
             rta->rta_type = RTA_GATEWAY;
             memcpy(RTA_DATA(rta), gate, sizeof(struct in6_addr));
         }
-    } else {
-        *(int*)RTA_DATA(rta) = -1;
     }

     buf.nh.nlmsg_len = (char*)rta + rta->rta_len - buf.raw;
--
2.7.4
___
Babel-users mailing list
Babel-users@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
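
Why a changing metric gets in the way of atomic updates can be sketched in a few lines of C. This assumes, as the patch description implies but does not spell out, that the kernel treats the priority as part of what identifies a route entry, so a replace request carrying a different priority addresses a different entry. The struct and function names are hypothetical and are not babeld or kernel code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the fields that identify a kernel route
   entry in this sketch: prefix, prefix length, table and priority. */
struct sketch_route_key {
    unsigned char prefix[16];
    unsigned char plen;
    int table;
    int priority;
};

/* Mirrors the patch: the priority depends only on the address family,
   never on whether the route is reachable. */
static int
sketch_priority(int ipv4)
{
    return ipv4 ? 0 : 1024;
}

/* Returns 1 when replacing a by b would address the same entry. */
static int
sketch_same_entry(const struct sketch_route_key *a,
                  const struct sketch_route_key *b)
{
    assert(a != NULL && b != NULL);
    return a->plen == b->plen &&
        a->table == b->table &&
        a->priority == b->priority &&
        memcmp(a->prefix, b->prefix, 16) == 0;
}
```

Under that assumption, a reachable route at priority 1024 and its unreachable twin at priority -1 compare as different entries, while two routes that both use sketch_priority() compare as the same one.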
Re: [Babel-users] Route-dete :wq
On Wed, Mar 7, 2018 at 2:41 PM, Juliusz Chroboczek <j...@irif.fr> wrote:
> Christof, I'm very much interested in your experiments, which are likely
> to improve the quality of the Babel implementations.
>
>> We have update-interval set to 5 minutes to reduce the load on the network
>> because we are hoping to run this topology on 500+ APs with 1000+ Clients.

I wrote the rtod tool to experiment with creating large topologies.
I've not got around to publishing the veth topos, but the tool is
here:

https://github.com/dtaht/rtod

One thing that fell out of that was mainline babeld had a very
suboptimal routine for merging routes (also its use of memcmp was
inefficient) - and juliusz put out a call for a qsorted implementation
a while back.

Elsewhere in babeld, atomic updates remain elusive. I made a bit of
progress on that last year (unreachable routes have to keep the same
metric to be an atomic change) but got beat by another bug that
juliusz/martin(?) fixed later.

These are both problems that I've long meant to get to, but the
prospect of redeploying with the pending unicast feature and source
specific routing, as well as reflashing a few dozen devices in
treetops with modern code, has thus far stopped me. I kind of hope to
use bird on a goodly portion of the next deployment, and hopefully
this summer I can find someone that likes climbing trees in california
more than I do.

I have high hopes for the unicast stuff to lighten the routing load by
potentially orders of magnitude.

Somewhere in there (after breaking every routing daemon and protocol
in multiple ways with rtod, and making several improvements to
"rabeld", and scheming to replace all! all of it, with my own
massively sorted, threaded, NEON coprocessor using version of
everything rewritten from scratch and running out of time long before
it reached plausible promise)...

I read "No Free Lunch Theorems for Search", and despaired. Every
daemon and protocol will break on some number of routes. Period.
See: https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization

Short of a revolution in graph theory I see no way to get anywhere on
that, and yet tend to think dynamically making the update interval
larger to account for various bounds (cpu,bandwidth) is part of a way
out, along with having a harder r/t scheduler for the bellman-ford
calc (needn't be threaded) would help to make sure it fits within
bounds, repeating important packets more often, etc.

(please note that most of what I'm saying as per NFL, etc is common to
all routing protocols and daemons)

> The protocol is very flexible, but the reference implementation is not
> designed to work with such large update intervals. The amount of data in
> an update is pretty small, so I would recommend reducing the update
> interval -- it should be fine with thousands of routes in your network.

Try to remember that other traffic eats capacity, and even in a
fq_codel'd environment you only can get a percentage of bandwidth
(with low delays). I've seen devices with essentially infinite queues
for mcast, also - over 16 seconds long!

I start running into trouble with 1000+ routes using 1Mbit mcast.
Sooner if I seriously slam the network with flent or something else
that abuses mcast like mdns. YMMV.

> The right way to reduce the amount of routing traffic in Babel is not to
> increase the update interval (which can at best yield a linear reduction),
> but to use aggregation and filtering (which can yield an exponential
> decrease in a well designed network). Dave Taht has been successful with
> this approach, perhaps he'll want to chime in.

I've also experimented with dynamically changing the broadcast update
interval, as long term stable routes are, well, long term stable. Even
a linear reduction seemed worthwhile at the time I was fiddling with
this.
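
The 1Mbit multicast limit mentioned above can be put in rough numbers. A back-of-envelope sketch in C, assuming a hypothetical average of 32 bytes on the wire per route (a figure picked for illustration, not measured) and ignoring link-layer overhead, contention and retransmission:

```c
#include <assert.h>

/* Hypothetical estimate, not babeld code: milliseconds of airtime needed
   to send one full route dump of `routes` entries, each taking
   `bytes_per_route` bytes, at a link rate of `rate_bps` bits per second.
   Payload only; real airtime on wifi will be higher. */
static unsigned long long
sketch_dump_ms(unsigned long long routes,
               unsigned long long bytes_per_route,
               unsigned long long rate_bps)
{
    assert(rate_bps > 0);
    return routes * bytes_per_route * 8ULL * 1000ULL / rate_bps;
}
```

With those assumptions, 1000 routes take about 256 ms of airtime at 1Mbit and about 23 ms at 11Mbit, before any competing traffic or queueing is counted.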
Another way to preserve some percentage of sanity would be to always
update default routes often but with a long declared broadcast
interval and less used routes on a longer one.

As for aggregation and filtering: Most of my APs have at minimum,
ethernet, and two channels - usually four, including the meshy links.
The meshy links are ptp, so I've generally "wasted" an entire /22 ipv4
network to talk to the APs. ipv6 /62s.

The lab component of my network, for example, has two main links to
the production net, and the gateways only announce the subnet it is on
(172.22.0.0/16). This cuts the churn seen outside the network when I
do crazy things like reboot the whole thing.

The biggest problem I've run into, is that meshy links, are, meshy -
and I've lost track of the number of times where I had a well defined
/16 network in the lab suddenly leak all the meshy /32 bits over the
worst possible link - because I plugged something in that was adhoc
(and poorly) connected to the outside network that I shouldn't have.

Lede creates one /48 ULA by defau
Re: [Babel-users] Heads up: merged rfc6126bis into master
While not exactly a flag day from a protocol perspective, the commit removing keep-unfeasible will break existing startup scripts and conf files that have it enabled. https://github.com/jech/babeld/commit/0111f5c1d69ce643a6a76a811eb8f89fb4deb936 -- Dave Täht CEO, TekLibre, LLC http://www.teklibre.com Tel: 1-669-226-2619
Re: [Babel-users] [babel] routing tables of death
I too may be able to visit Paris in late March.

Also, a start at an alternative to shortest path metrics, with so many
problems that I'd run off this page writing them up, but I was happy
to read it this morning.

http://ieeexplore.ieee.org/abstract/document/8025937/
[Babel-users] routing tables of death
I confess curiosity, now that we have quagga, bird, and mainline babeld versions, and nifty new stuff like unicast hellos, as to how well they interoperate at this point, as well as perform, under a stress test like: https://github.com/dtaht/rtod -- Dave Täht CEO, TekLibre, LLC http://www.teklibre.com Tel: 1-669-226-2619
[Babel-users] [PATCH 5/6] gitignore tags TAGS emacs and vi temp files and bad patch attempts
From: Dave Taht <d...@taht.net>

Quiet git more.
---
 .gitignore | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 635e60b..d298e6e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,13 @@
 babeld
 babeld.html
 version.h
-cscope.out
+cscope.*
 gmon.out
 core
+TAGS
+tags
+babeld-whole*
+*.rej
+*.orig
+*~
+\#*#
--
2.7.4
[Babel-users] [PATCH 6/6] babeld-whole: Enable single unit compilation of all of babeld
From: Dave Taht <d...@taht.net> This adds compile guards and Makefile support to building all of babeld in a single shot, as "babeld-whole". This has compelling advantages: - It lets you A/B two versions with different compilation options such as debugging on or off, or selectively enable work in progress. Example: make EXTRA_DEFINES="-DNO_DEBUG -DHAVE_NEON" babeld-whole - (theoretically) gives the compiler more chances to find optimizations - Compiles faster This patch also enables verbose assembly language output of the resulting code with babeld-whole.s. --- Makefile| 15 ++- babeld.h| 4 configuration.h | 5 - disambiguation.h| 5 + generate-version.sh | 2 ++ interface.h | 4 kernel.h| 4 local.h | 4 message.h | 4 neighbour.h | 5 + net.h | 5 + resend.h| 6 +- route.h | 5 - rule.h | 6 +- source.h| 5 + util.h | 4 xroute.h| 5 + 17 files changed, 83 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 9454b1a..e8795fb 100644 --- a/Makefile +++ b/Makefile @@ -68,6 +68,19 @@ xroute.o: xroute.c babeld.h kernel.h neighbour.h message.h route.h util.h \ version.h: ./generate-version.sh > version.h +# Whole program compilation with maximum optimization + +babeld-whole.c: $(SRCS) $(INCLUDES) + cat $(SRCS) > babeld-whole.c + +babeld-whole.s: babeld-whole.c + $(CC) $(CFLAGS) -O3 $(LDFLAGS) -fwhole-program -fverbose-asm \ +babeld-whole.c -S -o babeld-whole.s + +babeld-whole: babeld-whole.c + $(CC) $(CFLAGS) $(LDFLAGS) -O3 -fwhole-program babeld-whole.c \ + -o babeld-whole $(LDLIBS) + .SUFFIXES: .man .html .man.html: @@ -99,7 +112,7 @@ uninstall: -rm -f $(TARGET)$(MANDIR)/man8/babeld.8 clean: - -rm -f babeld babeld.html version.h *.o *~ core + -rm -f babeld babeld.html version.h *.o *~ core babeld-whole* reallyclean: clean -rm -f TAGS tags gmon.out cscope.out diff --git a/babeld.h b/babeld.h index e46dd79..2b1b64d 100644 --- a/babeld.h +++ b/babeld.h @@ -1,3 +1,5 @@ +#ifndef _BABEL_BABELD +#define _BABEL_BABELD /* Copyright (c) 2007, 2008 by Juliusz Chroboczek @@ 
-110,3 +112,5 @@ void schedule_neighbours_check(int msecs, int override); void schedule_interfaces_check(int msecs, int override); int resize_receive_buffer(int size); int reopen_logfile(void); + +#endif diff --git a/configuration.h b/configuration.h index de2b1ba..7a4a349 100644 --- a/configuration.h +++ b/configuration.h @@ -1,3 +1,5 @@ +#ifndef _BABEL_CONFIGURATION +#define _BABEL_CONFIGURATION /* Copyright (c) 2007, 2008 by Juliusz Chroboczek @@ -19,7 +21,6 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ - /* Values returned by parse_config_from_string. */ #define CONFIG_ACTION_DONE 0 @@ -77,3 +78,5 @@ int install_filter(const unsigned char *prefix, unsigned short plen, const unsigned char *src_prefix, unsigned short src_plen, struct filter_result *result); int finalise_config(void); + +#endif diff --git a/disambiguation.h b/disambiguation.h index 13fc89a..aa67a68 100644 --- a/disambiguation.h +++ b/disambiguation.h @@ -1,3 +1,6 @@ +#ifndef _BABEL_DISAMBIGUATION +#define _BABEL_DISAMBIGUATION + /* Copyright (c) 2014 by Matthieu Boutier and Juliusz Chroboczek. 
@@ -25,3 +28,5 @@ int kuninstall_route(const struct babel_route *route); int kswitch_routes(const struct babel_route *old, const struct babel_route *new); int kchange_route_metric(const struct babel_route *route, unsigned refmetric, unsigned cost, unsigned add); + +#endif diff --git a/generate-version.sh b/generate-version.sh index 22dba7d..7cd00f4 100755 --- a/generate-version.sh +++ b/generate-version.sh @@ -10,4 +10,6 @@ else version="unknown" fi +echo "#ifndef BABELD_VERSION" echo "#define BABELD_VERSION \"$version\"" +echo "#endif" diff --git a/interface.h b/interface.h index 7294196..a2eb390 100644 --- a/interface.h +++ b/interface.h @@ -1,3 +1,6 @@ +#ifndef _BABEL_INTERFACE +#define _BABEL_INTERFACE + /* Copyright (c) 2007, 2008 by Juliusz Chroboczek @@ -143,3 +146,4 @@ void set_timeout(struct timeval *timeout, int msecs); int interface_up(struct interface *ifp, int up); int interface_ll_address(struct interface *ifp, const unsigned char *address); void check_interfaces(void); +#endif diff --git a/kernel.h b/kernel.h index b6286fc..e2a7565 100644 --- a/kernel.h +++ b/kernel.h @@ -1,3 +1,6 @@ +#ifndef _BABEL_KERNEL +#define _BABEL_KERNEL + /* Copyright (c) 2
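
The reason this patch adds guards to every header is that `cat $(SRCS)` puts all the `#include` lines of all source files into one translation unit, so each header is pulled in many times. A minimal single-file sketch of the effect, using a hypothetical header body and guard name (SKETCH_POINT_H) rather than anything from babeld:

```c
#include <assert.h>

/* First copy of a hypothetical header, as it would appear once the
   preprocessor has expanded the first #include in the combined file. */
#ifndef SKETCH_POINT_H
#define SKETCH_POINT_H
struct sketch_point { int x; int y; };
#endif

/* Second copy, as pulled in again by a later source file. The guard
   skips it; without the guard the struct would be defined twice and
   the build would fail. */
#ifndef SKETCH_POINT_H
#define SKETCH_POINT_H
struct sketch_point { int x; int y; };
#endif

static int
sketch_sum(struct sketch_point p)
{
    assert(p.x >= 0 && p.y >= 0);
    return p.x + p.y;
}
```

One side note on naming: as far as I know, identifiers that start with an underscore followed by an uppercase letter, such as _BABEL_KERNEL, are reserved for the implementation in C, which is why the sketch uses a guard name without the leading underscore.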
[Babel-users] [PATCH 4/6] Tests: Add subdir and test for structure packing
From: Dave Taht <d...@taht.net> An initial run of this test on x86_64 shows suboptimal packing for the filter structure (can be 96), and kernel_route (can be 64). Other things, such as kernel_rule, buffered_update and xroute could be padded to more natural 8 byte boundaries for this arch. Other structure optimizations are feasible. Having this tool readily available to measure changes across different architectures is helpful. ./show_babel_packing Compiled with gcc 5.4.0 for the x86_64 architecture in little endian mode 24 = sizeof (filter_result) 112 = sizeof (filter) 44 = sizeof (buffered_update) 56 = sizeof (interface_conf) 272 = sizeof (interface) 68 = sizeof (kernel_route) 28 = sizeof (kernel_rule) 64 = sizeof (kernel_filter) 24 = sizeof (local_socket) 120 = sizeof (neighbour) 88 = sizeof (resend) 88 = sizeof (babel_route) 56 = sizeof (source) 44 = sizeof (xroute) --- tests/.gitignore | 11 tests/Makefile | 22 tests/arch_detect.h| 53 + tests/show_babel_packing.c | 66 ++ 4 files changed, 152 insertions(+) create mode 100644 tests/.gitignore create mode 100644 tests/Makefile create mode 100644 tests/arch_detect.h create mode 100644 tests/show_babel_packing.c diff --git a/tests/.gitignore b/tests/.gitignore new file mode 100644 index 000..dd71803 --- /dev/null +++ b/tests/.gitignore @@ -0,0 +1,11 @@ +show_babel_packing +*.o +cscope.* +gmon.out +core +TAGS +tags +*.rej +*.orig +*~ +\#*# diff --git a/tests/Makefile b/tests/Makefile new file mode 100644 index 000..e6fe035 --- /dev/null +++ b/tests/Makefile @@ -0,0 +1,22 @@ +PREFIX = /usr/local +MANDIR = $(PREFIX)/share/man + +PROGS = show_babel_packing + +CDEBUGFLAGS = -Os -g -Wall + +DEFINES = $(PLATFORM_DEFINES) + +CFLAGS = $(CDEBUGFLAGS) $(DEFINES) $(EXTRA_DEFINES) + +INCLUDES = babeld.h net.h kernel.c util.h interface.h source.h neighbour.h \ + route.h xroute.h message.h resend.h configuration.h local.h \ + disambiguation.h rule.h version.h + +INCLUDES1 := $(INCLUDES:%=../%) + +show_babel_packing: 
show_babel_packing.c $(INCLUDES1) + $(CC) $(CFLAGS) $(LDFLAGS) -I.. $@.c -o show_babel_packing $(OBJS) $(LDLIBS) + +clean: + -rm -f $(PROGS) show_babel_packing.o diff --git a/tests/arch_detect.h b/tests/arch_detect.h new file mode 100644 index 000..103c2ef --- /dev/null +++ b/tests/arch_detect.h @@ -0,0 +1,53 @@ +/** + * arch_detect.h + * + * Toke Høiland-Jørgensen + * 2017-03-08 + */ + +#ifndef ARCH_DETECT_H +#define ARCH_DETECT_H + +#ifdef __GNUC__ +#define GCC_VERSION (__GNUC__ * 1 \ + + __GNUC_MINOR__ * 100 \ + + __GNUC_PATCHLEVEL__) +#endif + +#define P(a)typedef struct a a ##_t; \ + printf("%4ld = sizeof (%s)\n", sizeof(a ## _t), #a) + +#if !(defined(__arm__) || defined(__aarch64__) || defined(__x86_64__) \ + || defined(__i386__) || defined(__mips__)) + const char arch[] = "unknown"; +#else +#if defined(__arm__) || defined(__aarch64__) +#if defined(__ARM_ARCH_ISA_A64__) || defined(__aarch64__) + const char arch[] = "aarch64"; +#else +#ifdef __ARM_ARCH_7S__ + const char arch[] = "armv7s"; +#else +#ifdef __ARM_ARCH_7A__ + const char arch[] = "armv7"; +#endif +#endif +#endif /* end of arm detection. 
Fixme - need to find neon regs */ +#else +#if defined(__x86_64__) + const char arch[] = "x86_64"; +#endif + +#if defined(__i386__) +const char arch[] = "x86"; +#endif + +#if defined(__mips__) +const char arch[] = "MIPS"; +#endif + +#endif + +#endif + +#endif diff --git a/tests/show_babel_packing.c b/tests/show_babel_packing.c new file mode 100644 index 000..a670abd --- /dev/null +++ b/tests/show_babel_packing.c @@ -0,0 +1,66 @@ +/** + * test_packing.c + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "babeld.h" +#include "util.h" +#include "net.h" +#include "kernel.h" +#include "interface.h" +#include "source.h" +#include "neighbour.h" +#include "route.h" +#include "xroute.h" +#include "message.h" +#include "resend.h" +#include "configuration.h" +#include "local.h" +#include "rule.h" +#include "version.h" + +#include "arch_detect.h" + +int main() +{ +#ifdef __GNUC__ +printf("Compiled with gcc %d.%d.%d for the %s " +"architecture in %s mode\n", +__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__, +arch, __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? +"little endian" : "big endian"); +#
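
The kind of saving the packing test is meant to expose can be shown with two toy structures. A minimal sketch in C with hypothetical field names, not taken from babeld's structures; exact sizes depend on the ABI, so nothing here hard-codes them:

```c
#include <assert.h>
#include <stddef.h>

/* Small members on either side of a wide one: the compiler usually has
   to insert padding both before and after the double. */
struct sketch_loose {
    char flag;
    double metric;
    char kind;
};

/* Same members, widest first: the two chars share the tail padding. */
struct sketch_tight {
    double metric;
    char flag;
    char kind;
};

/* Bytes saved per structure by the reordering on this platform. */
static size_t
sketch_saved_bytes(void)
{
    assert(sizeof(struct sketch_tight) <= sizeof(struct sketch_loose));
    return sizeof(struct sketch_loose) - sizeof(struct sketch_tight);
}
```

On x86_64 Linux with gcc I would expect 24 bytes for the first layout and 16 for the second, but the point of a tool like show_babel_packing is to measure this per architecture rather than assume it.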
[Babel-users] [PATCH 3/6] Quiet the compiler on two uninitialized variable warnings
From: Dave Taht <d...@taht.net>

When compiled on arm and gcc 4.9.2, the compiler picks out two sets of
variables passed by reference that "might be used uninitialized".

They aren't, but certainly in the getnet case it was not immediately
obvious to either me or the compiler.

Quiet the warnings by explicitly initializing these variables.
---
 configuration.c | 4 ++--
 route.c         | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/configuration.c b/configuration.c
index c4b4d55..b455181 100644
--- a/configuration.c
+++ b/configuration.c
@@ -273,8 +273,8 @@ getnet(int c, unsigned char **p_r, unsigned char *plen_r, int *af_r,
     char *t;
     unsigned char *ip;
     unsigned char addr[16];
-    unsigned char plen;
-    int af, rc;
+    unsigned char plen = 0;
+    int af = 0, rc = 0;

     c = getword(c, &t, gnc, closure);
     if(c < -1)
diff --git a/route.c b/route.c
index a4cd302..5cd2260 100644
--- a/route.c
+++ b/route.c
@@ -212,7 +212,7 @@ resize_route_table(int new_slots)
 static struct babel_route *
 insert_route(struct babel_route *route)
 {
-    int i, n;
+    int i, n = 0;

     assert(!route->installed);

--
2.7.4
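
The warning pattern described in this patch is easy to reproduce in miniature. A sketch in C with hypothetical function names (this is not the getnet code): a callee that writes its out-parameters only on success, and a caller that initializes them anyway so that neither the compiler nor a reader has to prove the error path is safe.

```c
#include <assert.h>

/* Hypothetical parser: fills *af_r and *plen_r only when it succeeds,
   which is the shape that tends to trigger "may be used uninitialized"
   in the caller. */
static int
sketch_parse(int ok, int *af_r, unsigned char *plen_r)
{
    assert(af_r != 0 && plen_r != 0);
    if(!ok)
        return -1;
    *af_r = 2;
    *plen_r = 24;
    return 0;
}

static int
sketch_caller(int ok)
{
    /* Explicit initializers, as in the patch: harmless when the parse
       succeeds, and well defined if a later edit reads them on failure. */
    int af = 0;
    unsigned char plen = 0;
    int rc = sketch_parse(ok, &af, &plen);
    if(rc < 0)
        return -1;
    return af + plen;
}
```

The initializers cost a couple of stores and change no behavior on the success path, which is presumably why the patch takes this route instead of restructuring the callers.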
[Babel-users] [PATCH 1/6] Improve Makefile
From: Dave Taht <d...@taht.net> - Add INCLUDES variable for headers - Add support for tags and TAGS - create "reallyclean" to get rid of tags, TAGS, gmon.out, cscope.out - remove TAGs and gmon.out from "clean" - Add full and correct dependencies on internal headers --- Makefile | 58 +- 1 file changed, 53 insertions(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 9b98eb8..9454b1a 100644 --- a/Makefile +++ b/Makefile @@ -17,14 +17,53 @@ OBJS = babeld.o net.o kernel.o util.o interface.o source.o neighbour.o \ route.o xroute.o message.o resend.o configuration.o local.o \ disambiguation.o rule.o +INCLUDES = babeld.h net.h kernel.c util.h interface.h source.h neighbour.h \ + route.h xroute.h message.h resend.h configuration.h local.h \ + disambiguation.h rule.h version.h + +KFILES = kernel_netlink.c kernel_socket.c + babeld: $(OBJS) $(CC) $(CFLAGS) $(LDFLAGS) -o babeld $(OBJS) $(LDLIBS) -babeld.o: babeld.c version.h +babeld.o: babeld.c $(INCLUDES) + +local.o: local.c local.h version.h babeld.h interface.h source.h neighbour.h \ +kernel.h xroute.h route.h util.h configuration.h + +kernel.o: kernel_netlink.c kernel_socket.c kernel.h babeld.h + +configuration.o: configuration.c babeld.h util.h route.h kernel.h \ +configuration.h rule.h + +disambiguation.o: disambiguation.c babeld.h util.h route.h kernel.h \ + disambiguation.h interface.h rule.h + +interface.o: interface.c babeld.h util.h route.h kernel.h local.h \ +interface.h neighbour.h message.h configuration.h xroute.h + +message.o: message.c babeld.h util.h net.h interface.h source.h neighbour.h \ + route.h kernel.h xroute.h resend.h configuration.h + +neighbour.o: neighbour.c babeld.h util.h interface.h source.h route.h \ +neighbour.h message.h resend.h local.h -local.o: local.c version.h +net.o: net.c net.h babeld.h util.h -kernel.o: kernel_netlink.c kernel_socket.c +resend.o: resend.c babeld.h util.h neighbour.h message.h interface.h \ + resend.h configuration.h + +route.o: route.c babeld.h util.h kernel.h 
interface.h source.h neighbour.h \ +route.h xroute.h message.h configuration.h local.h disambiguation.h + +rule.o: rule.c babeld.h util.h kernel.h configuration.h rule.h + +source.o: source.c babeld.h util.h interface.h route.h source.h + +util.o: util.c babeld.h util.h + +xroute.o: xroute.c babeld.h kernel.h neighbour.h message.h route.h util.h \ + xroute.h configuration.h interface.h local.h version.h: ./generate-version.sh > version.h @@ -36,10 +75,16 @@ version.h: babeld.html: babeld.man -.PHONY: all install install.minimal uninstall clean +.PHONY: all install install.minimal uninstall clean reallyclean all: babeld babeld.man +TAGS: $(SRCS) $(INCLUDES) $(KFILES) + etags $(SRCS) $(INCLUDES) $(KFILES) + +tags: $(SRCS) $(INCLUDES) $(KFILES) + ctags $(SRCS) $(INCLUDES) $(KFILES) + install.minimal: babeld -rm -f $(TARGET)$(PREFIX)/bin/babeld mkdir -p $(TARGET)$(PREFIX)/bin @@ -54,4 +99,7 @@ uninstall: -rm -f $(TARGET)$(MANDIR)/man8/babeld.8 clean: - -rm -f babeld babeld.html version.h *.o *~ core TAGS gmon.out + -rm -f babeld babeld.html version.h *.o *~ core + +reallyclean: clean + -rm -f TAGS tags gmon.out cscope.out -- 2.7.4 ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
[Babel-users] Misc cleanups to the babeld build system
This patch series makes it easier to work on babeld's source code in a
variety of ways.

[PATCH 1/6] Improve Makefile
[PATCH 2/6] Make v4prefix a shared constant between util.c and
[PATCH 3/6] Quiet the compiler on two uninitialized variable warnings
[PATCH 4/6] Tests: Add subdir and test for structure packing
[PATCH 5/6] gitignore tags TAGS emacs and vi temp files and bad patch
[PATCH 6/6] babeld-whole: Enable single unit compilation of all of
[Babel-users] [PATCH 2/6] Make v4prefix a shared constant between util.c and message.c
From: Dave Taht <d...@taht.net>

Share the data better.
---
 message.c | 3 +--
 util.c    | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/message.c b/message.c
index fdc1999..f8c4ad2 100644
--- a/message.c
+++ b/message.c
@@ -54,8 +54,7 @@ unsigned char *unicast_buffer = NULL;
 struct neighbour *unicast_neighbour = NULL;
 struct timeval unicast_flush_timeout = {0, 0};

-static const unsigned char v4prefix[16] =
-{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF, 0, 0, 0, 0 };
+extern const unsigned char v4prefix[16];

 #define MAX_CHANNEL_HOPS 20

diff --git a/util.c b/util.c
index 1c15dbf..0de2245 100644
--- a/util.c
+++ b/util.c
@@ -246,7 +246,7 @@ normalize_prefix(unsigned char *restrict ret,
     return ret;
 }

-static const unsigned char v4prefix[16] =
+const unsigned char v4prefix[16] =
     {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF, 0, 0, 0, 0 };

 static const unsigned char llprefix[16] =
--
2.7.4
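
For context, the 16 bytes being shared are the ::ffff:0:0/96 prefix used for IPv4-mapped addresses. A minimal sketch in C of how such a constant is typically used; the names are hypothetical, and the array is static here only to keep the example in one file, where the patch instead defines it once in util.c and declares it extern in message.c:

```c
#include <assert.h>
#include <string.h>

/* Same byte values as v4prefix in the patch. */
static const unsigned char sketch_v4prefix[16] =
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xFF, 0, 0, 0, 0};

/* Returns 1 when the 16-byte address starts with the 96-bit
   IPv4-mapped prefix, 0 otherwise. */
static int
sketch_is_v4mapped(const unsigned char *address)
{
    assert(address != NULL);
    return memcmp(address, sketch_v4prefix, 12) == 0;
}
```

Only the first 12 bytes are compared, since the last 4 carry the IPv4 address itself.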
[Babel-users] [PATCHes] Misc cleanups to the babeld build system
I am not particularly huge on posting and reviewing patches in-line on
mailing lists, but I can repost this pull request here, if desired.

https://github.com/jech/babeld/pull/9

This patch series consists of a bunch of miscellaneous improvements to
the babeld build system: improving the makefile to fully check
dependencies, adding direct support for TAGS, tags and cscope,
smashing some tabs that crept in elsewhere, fixing a missed dependency
in disambiguation.h, enabling whole program compilation and assembly
language output, and finally, adding a tests directory where a test
for structure packing now exists as a further optimization aid.

It is the first of 5 major sets cleanly building on deeper
improvements from the now-retired wreckage in the rabeld repository.
Next up are patch sets for better error logging, then better error
handling. After that - ghu willing - atomic updates, and some major
performance improvements.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
[Babel-users] risc-v
I've been fiddling with risc-v stuff here and there of late.

http://bellard.org/riscvemu/

Amazingly babel "just builds" and with a little fiddling merely
crashes on the lack of ipv6 support in the supplied fedora25 kernel. I
guess I should go build a better kernel.

root@nemesis:/tmp/babeld# file babeld
babeld: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), dynamically linked, interpreter /lib/ld.so.1, for GNU/Linux 2.6.32, BuildID[sha1]=b771fb64167fe8dd6d2a424cfe6eff1d96916c0b, not stripped

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
[Babel-users] babeldStyle
Is there a specific emacs or vi "c-style" setting I should use while hacking on babeld? (.el file would be helpful) I see: functions typically start with the { on the first line; 4 character indent; 8 character indents are usually spaces, but there are tabs. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] Debugging unreachable routes - IPv6 as next hop?
I have been chasing a similar set of bugs for months now. Routes would be unreachable for no reason I could see, updates to the kernel would fail[1]. How big is the total route table? Does it stay unreachable? Can you try reverting to babeld-1.7.1 for lede? ... I finally got heads down on it last week and I have a slew of debugging patches that I need to clean up for 1.8... and then I need to do a build for lede - but I haven't got around to it yet. Nor have I tried 1.7.x - I was trying to debug something elsewhere. In particular I found that network manager was stomping on babel in my network. My test case is unfortunately not as simple as yours. But I was originally seeing some sort of interaction with odhcpd also. [1] Lastly, a major bug in the wifi ATF fairness code for ath9k, which could scribble on memory just about anywhere, was stomped last week, and some fixes for odhcpd and ubus landed also.
Re: [Babel-users] [LEDE-DEV] Babeld now has procd support on OpenWRT/LEDE
On Fri, Jan 13, 2017 at 8:11 AM, L. D. Pinney <ldpin...@yahoo.com> wrote: > Go back to playing the guitar and smoking dopethat's what you do best. I look forward very much to doing more of that... after lede ships. In the interim, I have two fairly large testbeds setup to test lede - the ATF patch, the dnsmasq-dnssec code, the bcp38 code, the dhcpv6-pd code, the new bridging code, and babel integration are all high on my list, along with sch_cake and the sqm-scripts. I have 5 lede platforms under test - the archer c7v2, wndr3800, linksys 1200ac, uap-lite, and picostation, with two more that I plan to add after testing interop (edgerouter and an apu2). There are also 3 different brands of cablemodem in the loop. Test clients include a dozen hackerboards (kernels going back to 3.10), OSX, and Linux. I'm perpetually testing edge cases and things that seem like they should work, but don't. While things are vague and confusing (like, babel-1.8's behavior being confusing in some instances) - and a piece of a puzzle lands (like not restarting it as much, thus causing less route retractions and floods) - I do tend to dump partial state about the rathole I may have been going down. I've been trying to come up with tests for understanding the true effects of the multicast-unicast bridging code, for wifi primarily, and simplifying. At the moment I'm more trying to test 3 layers of lede dhcp-pd (bridged and routed scenarios) across gear that needs to interop... ... instead of dealing with how all that feeds into babel source specific routing - which is a combination of interactions between netifd, babeld, and multicast/bridging. > > STOP CROSS POSTING YOU FSCKin' Clown Boy If these projects want to write "no cross posting ever" into the rules, I will gladly comply. I hope you get your caps-lock key fixed. I'm going to just filter out all further postings from you into my trash folder. 
> > On Saturday, January 14, 2017 12:06 AM, Dave Taht <dave.t...@gmail.com> > wrote: > > > On Fri, Jan 13, 2017 at 4:08 AM, L. D. Pinney <ldpin...@yahoo.com> wrote: >> DAVE : >> >> WILL YOU PLEASE STOP YOUR FSCKin' CROSS POSTING ??? > > I did not start the cross-post in this case. > >> This is UNRELATED to the OpenWrt / LEDE DEV mailing list...as the change >> has >> been merged. > > Interop with routing protocols... and networking in general... does > exist outside of the openwrt universe. In my case I am deeply > concerned about what happens against older, deployed versions of > Linuxes (and other OSes) with the new multicast-unicast bridge > conversion code in lede. Babel tends to use it, and I am also testing > (in lede!), as per the below, the dhcpv6-pd code, interop-ing with > several devices. > >> F O > > I'm really sorry that you hate cross posting so much. It must be > terrible to have to elide additional responses or deal with bounce > messages on every 20th email from me. And it must be wonderful to be > living in a world where all you have is openwrt/lede devices on the > network and modern kernels everywhere. > >> >> On Friday, January 13, 2017 5:20 AM, Dave Taht <dave.t...@gmail.com> >> wrote: >> >> >> On Thu, Jan 12, 2017 at 1:01 PM, Baptiste Jonglez >> <bapti...@bitsofnetworks.org> wrote: >>> Hi, >>> >>> Here is yet another OpenWRT-related change for babeld: I just merged >>> procd >>> support for babeld [2], after more than two years of lingering [1]. >>> >>> The only user-visible changes should be: >>> >>> - babeld now logs to the system log (visible with "logread") instead of a >>> file in /var/log. This is nice for embedded devices, where you don't >>> want to write too much to the filesystem. 
It is still possible to >>> explicitly configure babeld to use a log file; >>> >>> - babeld is now restarted automatically whenever it crashes; >>> >>> - the usual procd niceties: calling "/etc/init.d/babeld reload" will >>> restart babeld only if the configuration has changed. >>> >>> >>> Please test babeld 1.8.0-2 and report any resulting breakage. I would >>> like this change (and the other compatibility change) to make it into the >>> upcoming LEDE release, which is due to happen quite soon. >> >> Groovy. >> >> lede can dynamically insert/delete routes into tables from netifd >> babeld can pull routes from "protos" but not tables. >> >> I spoke with hedecker (? can't remember his email) about somehow >> having a field to export routes into kernel protos in the lede network >> file, he indicated he'd look at it in
Re: [Babel-users] Babeld now has procd support on OpenWRT/LEDE
On Thu, Jan 12, 2017 at 1:01 PM, Baptiste Jonglez wrote: > Hi, > > Here is yet another OpenWRT-related change for babeld: I just merged procd > support for babeld [2], after more than two years of lingering [1]. > > The only user-visible changes should be: > > - babeld now logs to the system log (visible with "logread") instead of a > file in /var/log. This is nice for embedded devices, where you don't > want to write too much to the filesystem. It is still possible to > explicitly configure babeld to use a log file; > > - babeld is now restarted automatically whenever it crashes; > > - the usual procd niceties: calling "/etc/init.d/babeld reload" will > restart babeld only if the configuration has changed. > > > Please test babeld 1.8.0-2 and report any resulting breakage. I would > like this change (and the other compatibility change) to make it into the > upcoming LEDE release, which is due to happen quite soon. Groovy. lede can dynamically insert/delete routes into tables from netifd; babeld can pull routes from "protos" but not tables. I spoke with hedecker (? can't remember his email) about somehow having a field to export routes into kernel protos in the lede network file, he indicated he'd look at it in a few weeks. (I wanted to get away from ever having to revise the conf file dynamically, but it looks like not this release.) Not having to restart babeld as per the above is a nice improvement though and I'll get on testing it this weekend. At the moment I'm going through some mild hell with dhcpv6-pd on comcast and adding "sonic" fiber (with a HE ipv6 tunnel). Will hopefully have 4 source specific gateways to play with here. In other news the "rabeld" backport of the gentler route switch change loses kernel routes on the vyatta (3.10 based) OS in the edgerouter. :(. That said, I haven't tested mainline babeld there yet. It seems to work on debian. 
For those fiddling with edgerouter's default 1.9.x OS, backports of cake, iproute, and rabeld are presently here: https://build.lochnair.net/ > Baptiste > > [1] https://github.com/openwrt-routing/packages/pull/55 > [2] https://github.com/openwrt-routing/packages/pull/250 -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] unicast attempt breaks timestamping
On Mon, Jan 9, 2017 at 6:03 AM, Baptiste Jonglez <bapti...@bitsofnetworks.org> wrote: > Hi again, > > On Fri, Jan 06, 2017 at 11:02:05AM -0800, Dave Taht wrote: >> Based on the patch juliusz supplied me to enable unicast IHU, and >> >> default enable-timestamp true >> >> this stops sending timestamps (which apparently relies on hellos and >> IHUs being bundled together) > > Can you provide the patch in question? Otherwise, it's hard to test. https://github.com/dtaht/rabeld/commit/38f1fd2338506a06fbc89d618ab9640d97fce565 > Is there any reason why you couldn't send hellos alongside the IHUs? As > far as I remember, the method we use to compute the RTT really needs to > have a timestamped Hello alongside the IHU. > > Baptiste -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
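As background to the point about needing a timestamped Hello alongside the IHU, here is a minimal sketch of how an RTT can be derived from such a pair. It follows the general idea of the Babel RTT extension as commonly described; it is not babeld's code, the function and parameter names are hypothetical, and the 32-bit wrapping microsecond timestamps are an assumption.

```python
"""Sketch: RTT from a timestamped Hello bundled with an IHU (illustrative names)."""

MODULUS = 2 ** 32  # assumption: timestamps are 32-bit microsecond counters that wrap


def elapsed(later: int, earlier: int) -> int:
    """Difference between two wrapping timestamps taken on the same clock."""
    return (later - earlier) % MODULUS


def rtt_us(origin: int, receive: int, hello_tx: int, local_rx: int) -> int:
    """Estimate the round-trip time in microseconds.

    origin   -- our clock: when we sent the Hello the neighbour echoes (carried in the IHU)
    receive  -- neighbour's clock: when it received that Hello (carried in the IHU)
    hello_tx -- neighbour's clock: when it sent this packet (carried in its Hello)
    local_rx -- our clock: when this packet arrived
    """
    total = elapsed(local_rx, origin)   # whole exchange, measured on our clock
    held = elapsed(hello_tx, receive)   # time the neighbour sat on it, on its clock
    return max(total - held, 0)         # the two clocks never need to be synchronised
```

In this sketch `hello_tx` only exists if a Hello travels in the same packet as the IHU; without it the neighbour's holding time cannot be subtracted, which would be consistent with timestamps disappearing once IHUs are sent on their own.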
[Babel-users] automating route injection
What I've typically done to export a single IP is add it to the babel configuration file, and I also use covering routes a lot, essentially the same method.

    redistribute local ip fd99::66/128 eq 128 allow
    redistribute ip 172.26.130.0/23 eq 23 allow
    redistribute local deny

While this works, as more and more of my stuff ends up getting dynamic addresses (ipv6, tunnels), something simpler that pulled from a kernel table dynamically, rather than requiring reconfiguration, seemed desirable. Doing things like this

    config filter
        option 'type' 'redistribute'
        option 'ip' '::/0'
        option 'le' '61'
        option 'action' 'allow'

breaks when you end up getting a bunch of /64s from elsewhere instead of a coherent range...

... the soon to be stable lede/openwrt has a new facility to automatically insert routes into different tables (you add ipv4table or ipv6table to the interface setup). So I thought that I'd simplify matters (hah!) by using that ipvXtable facility and change my babel configuration to just pull from a specific kernel table, but thus far, with a variety of attempts, singleton routes like:

    ip -6 addr add fd99::66/128 dev lo
    ip -6 route add fd99::66/128 dev lo table 8
    root@chip-6:~# ip -6 route show table 8
    unreachable fd99::66 dev lo metric 1024 error -101

never manage to get exported to the universe (filtered out due to being locally unreachable?), or, if exported, remain unreachable... I've tried various combinations of

    import-table 8
    install table 8
    redistribute table 8
    redistribute local table 8

* Side note

The recent babel 1.8 commit into lede does not include support for the install or table keywords in filters. I can fix that; there is sadly no support for inserting stuff into a proto rather than table in lede.

-- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
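To illustrate why the `le 61` filter above stops matching once the delegated prefixes arrive as individual /64s, here is a small sketch of prefix-plus-length matching. It models only the `ip`, `eq`, `le` and `ge` selectors as I understand them from the babeld documentation; `rule_matches` is a hypothetical helper, not part of babeld, and the 2001:db8:: prefixes are documentation examples rather than addresses from the message.

```python
import ipaddress


def rule_matches(route, rule_prefix, eq=None, le=None, ge=None):
    """Return True if `route` falls inside `rule_prefix` and passes the length tests.

    eq/le/ge compare the route's prefix length, mirroring the filter keywords.
    """
    net = ipaddress.ip_network(route)
    base = ipaddress.ip_network(rule_prefix)
    # An IPv4 route can never match an IPv6 rule, and vice versa.
    if net.version != base.version or not net.subnet_of(base):
        return False
    if eq is not None and net.prefixlen != eq:
        return False
    if le is not None and net.prefixlen > le:
        return False
    if ge is not None and net.prefixlen < ge:
        return False
    return True


# A coherent /61 passes "ip ::/0 le 61" ...
print(rule_matches("2001:db8:0:8::/61", "::/0", le=61))
# ... but the same space handed out as separate /64s does not.
print(rule_matches("2001:db8:0:9::/64", "::/0", le=61))
```

Under these assumptions the single-address rule `ip fd99::66/128 eq 128` keeps working, because both the prefix and the length are pinned, whereas any length-bounded covering rule depends on how upstream chooses to carve up the delegation.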
[Babel-users] unicast attempt breaks timestamping
Based on the patch juliusz supplied me to enable unicast IHU, and

    default enable-timestamp true

this stops sending timestamps (which apparently relies on hellos and IHUs being bundled together).

(aside from that unicast IHU is working fine, am still deploying stuff around the testbed, not evaluating anything yet)

I was polling rtts (as measured by babel) in some prior "latency under load" tests.

* side notes

the "chip" does not support ipv6-subtrees, nor adhoc. It is certainly handy to babel the usb0 interface, but working around systemd (on ubuntu) can be a pain.

    add neighbour 1cc4970 address fe80::14c7:67ff:fe31:ceac if enp0s16u1 reach rxcost 96 txcost 96 rtt 0.743 rttcost 0 cost 96
    add neighbour 1cc4f40 address fe80::16cc:20ff:fee5:64c2 if enp3s0 reach rxcost 96 txcost 256 rtt 1.067 rttcost 0 cost 256

saturating the usb interface only bumps up the measured rtt by < 5ms

    add interface enp0s16u1 up true ipv6 fe80::3c21:acff:fe2d:82c1 ipv4 172.26.66.10
    add neighbour 1cc4970 address fe80::14c7:67ff:fe31:ceac if enp0s16u1 reach rxcost 96 txcost 96 rtt 4.080 rttcost 0 cost 96

-- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] some thoughts towards babel-1.9
On Mon, Jan 2, 2017 at 3:35 PM, Juliusz Chroboczek wrote: >> For the wired link case, I am surprised babel considers it "interfering"! > > It doesn't. > > https://github.com/jech/babeld/blob/master/interface.c#L210 If that is the only error in two highly speculative documents then I'm winning. :) It is probable that I wrote that down while dealing with something in bridge mode, and before I'd realized that we had to set the channel manually. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] some thoughts towards babel-1.9
On Mon, Jan 2, 2017 at 6:28 AM, Benjamin Henrion <zoo...@gmail.com> wrote: > On Mon, Dec 12, 2016 at 6:25 PM, Dave Taht <dave.t...@gmail.com> wrote: >> I've long been testing a few out of tree patches for babel and long >> have had the intent to try a few more once the first phase of the >> make-wifi-fast work was completed - which it mostly is, so far as lede >> is concerned ( https://lwn.net/Articles/705884/ ) - and babel-1.8 >> stablized. >> >> I wrote up some of my thinking then in: >> >> https://github.com/dtaht/rabeld/blob/master/rabel.md > > You are right about the 2 remarks on diversity, the code needs to be > adapted to handle 802.11AC, and how it minimize channel interference, > especially nowadays with using multiple channels at once. Merely handling channel detection at all again would be good. Code for this, using the correct API, exists in olsrv2, but my brain crashes when looking at netlink. Then there's merely HT20 vs HT40... and THEN ac really wonks up the ideas. > For the wired link case, I am surprised babel considers it "interfering"! I can't remember how I drew this conclusion, whether it was from packet captures or from the code, but it appeared to be the case at the time. > On this topic, I wanted to BAN hoping on the same channel, as this is > a really bad feature of wifi mesh. Not sure if I understand. There are plenty of cases where using the same channel again makes sense, for example where a directional 5ghz radio has a ton more bandwidth than a weaker 2.4ghz omni to a given point. One test case we did not explore yet with the make-wifi-fast code was where there is a 5 ghz channel in AP mode with an adhoc backchannel on the same radio. My hope is we vastly improved the behavior in that scenario... but to deploy it we needed > At some point, I should find some time to install a proper outdoor > testbed somewhere to try that kind of configuration. 
The yurtlab (and 110 acre campus) is uninhabitably cold in the winter (at least to a californian - no snow, though). I'd hoped to do a new deployment (30+ radios) by this past september, but we weren't done with make-wifi-fast yet. With the time available before spring (say, may), it seems possible to work on other bothersome problems (ipv6 address assignment, source specific routing, name services, monitoring, security, and other rabel-ish issues). (If anyone out here wants to bundle up and climb a few roofs with me, getting a partial deployment along the 2 main backbones would be nice, long before then.) > I have tried to simulate that with hwmod kernel module, but did not go very > far. > > -- > Benjamin Henrion > FFII Brussels - +32-484-566109 - +32-2-3500762 > "In July 2005, after several failed attempts to legalise software > patents in Europe, the patent establishment changed its strategy. > Instead of explicitly seeking to sanction the patentability of > software, they are now seeking to create a central European patent > court, which would establish and enforce patentability rules in their > favor, without any possibility of correction by competing courts or > democratically elected legislators." -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] [babel] some thoughts towards babel-1.9
I added a todo list for my "rabel" branch of babel covering some of the stuff I'd like to try. https://raw.githubusercontent.com/dtaht/rabeld/master/todo.org
[Babel-users] essentially atomic updates
In my seemingly once yearly attempt at working on babel (admittedly the year is young!), here's my latest attempt at gently switching routes. https://github.com/dtaht/rabeld/commit/6ca4b0fa60dbbd25eb5d7e792d8f8058941d4cdb -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] dropping CS6
On Thu, Dec 15, 2016 at 1:10 AM, Henning Rogge <hro...@gmail.com> wrote: > On Mon, Dec 12, 2016 at 6:03 PM, Dave Taht <dave.t...@gmail.com> wrote: >> diffserv CS6 (I am the original author of the patch) has turned out to >> be generally a lose on wifi, putting things into the VO queue where >> they cannot be aggregated. I'd like to see babel go back to best >> effort. > > Do you have any numbers how well aggregation does work for multicast > transmissions compared to unicast? It doesn't, so far as I know. > Henning Rogge -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
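For readers unfamiliar with why CS6 ends up in the VO queue, here is a rough sketch of the arithmetic. The helper names are hypothetical, and the DSCP-to-access-category table assumes the common default of taking the top three DSCP bits as the 802.1d user priority; drivers and kernels may be configured differently.

```python
import socket

CS6 = 48          # DSCP class selector 6, the marking discussed above
BEST_EFFORT = 0   # DSCP 0, what "going back to best effort" would mean

# Standard 802.11 user-priority to WMM access-category grouping.
_UP_TO_AC = {0: "BE", 3: "BE", 1: "BK", 2: "BK", 4: "VI", 5: "VI", 6: "VO", 7: "VO"}


def dscp_to_tclass(dscp: int) -> int:
    """DSCP occupies the upper six bits of the traffic class / TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2


def wmm_access_category(dscp: int) -> str:
    """Assumed default mapping: user priority = top three bits of the DSCP."""
    return _UP_TO_AC[dscp >> 3]


def set_traffic_class(sock: socket.socket, dscp: int) -> None:
    """Mark an IPv6 socket's outgoing packets (Linux; IPV6_TCLASS must be available)."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_TCLASS, dscp_to_tclass(dscp))


print(hex(dscp_to_tclass(CS6)), wmm_access_category(CS6))
print(hex(dscp_to_tclass(BEST_EFFORT)), wmm_access_category(BEST_EFFORT))
```

Under that assumed mapping, switching the marking from `CS6` to `BEST_EFFORT` moves the traffic from the VO category to BE, which is the change being asked for in the quoted message.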
Re: [Babel-users] babel on an otherwise transparent bridge
On Wed, Sep 7, 2016 at 5:00 PM, Juliusz Chroboczek wrote:
>> like it to do is have a dedicated ipv6 ULA ip address for management
>> purposes (not using a vlan here), and announce that to the network, but
>> never offer itself as a routing opportunity to anything
>
> [...]
>
>> out br-lan something deny?
>> in?
>> inflate the metric?
>
> out br-lan ula/128 allow
> out deny

It's nice to know that both of us can struggle with babel syntax. The first line there doesn't parse. Should it?

    out ip fd42:a3d6:5621::1/128 allow
    out deny

Parses. The route appears elsewhere, it doesn't show up as a router, but the box is unpingable via ipv6.

-- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
[Babel-users] babel on an otherwise transparent bridge
I have a box setup to be a transparent wifi/ethernet bridge. What I'd like it to do is have a dedicated ipv6 ULA ip address for management purposes (not using a vlan here), and announce that to the network, but never offer itself as a routing opportunity to anything on any side of the bridge, except for stuff trying to get to that specific IP. It also needs to have a route table internally to get around the network (snmp). If I just blithely leave babel at the defaults, new devices on the bridge tend to get routes from it, first, leading to a 30-60 second period where those devices can (and do) choose that 2 hop route rather than the "1 hop" one. out br-lan something deny? in? inflate the metric? -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
[Babel-users] alternate source specific encoding?
is that in a branch somewhere yet? I am doing some builds in the yurtlab for the latest and greatest make-wifi-fast code, and can try this, if available. -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] Some minor incompatible changes to the configuration language
On Sun, Jul 31, 2016 at 6:19 PM, Juliusz Chroboczek wrote: > I've just pushed some changes that could, in some edge cases, break your > configuration files. Since I'd like to release 1.8 before the end of the > summer, I'll be grateful if you could test. > > First of all, the keyword "wired" is now deprecated (undocumented but > supported for backwards compatibility), you should now say > > interface eth0 type wired > > or > > interface wlan0 type wireless > > Possible types are "auto", "wired", "wireless" and "tunnel", where > "tunnel" enables RTT-based cost estimation. > > Split-horizon is *disabled* on interfaces of type "auto", so if you want > split-horizon, you should manually set the interface type. I'm hesitating > to enable link-quality estimation on auto interfaces -- it would avoid > some wrong configurations, but it carries a significant cost. > > The second change is that setting max-rtt-penalty no longer enables > timestamps. This particular bit of DWIM is no longer necessary now that we > have the tunnel interface type. I would like a way to enable timestamping universally. I like seeing/collecting the measured RTTs on the monitoring tools, and one day hope to improve the RTT metric system to work on normal (well, fq_codeled) paths. The cost of getting and sending the timestamp is trivial. > I'll be grateful for testing. I'll also be grateful for any suggestions > that make the interface configuration more automatic. > > -- Juliusz -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] [Cerowrt-devel] Cross-compiling to armhf [was: beaglebone green wireless boards...]
On Thu, Jun 23, 2016 at 4:20 PM, Jonathan Morton wrote: > >> On 24 Jun, 2016, at 01:57, Juliusz Chroboczek >> wrote: >> >>> the long slow EABI changeover that was obsoleted almost overnight by the >>> armhf work the raspbian folk did, and so on. >> >> I am pretty positive that armhf predates raspbian. Let's please give >> credit where credit is due. > > Ironically, it was I who demonstrated to the Raspbian folks the benefits of > an armhf build for the R-Pi 1, back in the early days of that platform. It > seems like an awfully long time ago now. :-) Yes, it was a team of hackers going against the prevailing wisdom of endless backward compatibility, and succeeding due to technical excellence and demonstrable performance improvements - that's why that story sticks with me. I admire anybody that can do that. :) -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] Cross-compiling to armhf [was: beaglebone green wireless boards...]
On Thu, Jun 23, 2016 at 3:57 PM, Juliusz Chroboczek wrote: >> the long slow EABI changeover that was obsoleted almost overnight by the >> armhf work the raspbian folk did, and so on. > > I am pretty positive that armhf predates raspbian. Let's please give > credit where credit is due. I note that I *really like arm*, going back a very long ways. http://the-edge.blogspot.com/2002/06/axioms-one-of-my-axioms-about.html I remember telling the CTO of palm they were doomed back then... they had started trying to differentiate models by *color* at that point. Sure, the abi and compiler "were out there" - but getting 20,000 packages converted over and widely into a popular distro and platform, to me, was the tipping point for wider adoption of the hard float abi, as something others could build on. I just spent a few minutes googling for that story, but couldn't find it (what I remember was 3 guys, 3 months, hammering at getting 20,000 packages to all "just work"). What we had before was a mess of different ABIs, and a whole bunch of slightly incompatible arm cpu versions - all different enough to fragment the arm ecosystem. There was no way you could trust one binary on a different box. Back around this time (2006-2010?) it was also unclear that arm would accelerate so far past the herd, either, and there were a ton of other factors, of course, that led to where it's now being considered for supercomputers and looks set to start unseating intel in many places. And despite really liking arm, I look forward to entirely new arches like the risc-v and mill eating its lunch one day. Things like trustzone, the mali gpu, and other portions of onchip IP commonly shipped with the chips suck rocks, still. Very few applications are taking good advantage of the neon vfp code, the onboard caches are way behind intel's, and so on... speaking of trustzone - yea! there's a way to use it now. https://github.com/OP-TEE > > -- Juliusz -- Dave Täht Let's go make home routers and wifi faster! 
With better software! http://blog.cerowrt.org
Re: [Babel-users] Cross-compiling to armhf [was: beaglebone green wireless boards...]
On Thu, Jun 23, 2016 at 3:10 PM, Juliusz Chroboczek wrote: >> (does a working cross compiler exist for the aarch64 in the c2?) > > apt-get install gcc-aarch64-linux-gnu d@osx: apt-get install gcc-aarch64-linux-gnu Not found. ... One of the bigger mistakes I have made in the last 3 years was adopting a macbook air as my main laptop - primarily because the keyboard was tolerable and backlit, it was light on my back, and everything worked, all the time. Running a vm for any length of time drains the battery, and the mental semantic confusion I get from switching keyboard and mouse interfaces between linux vm and osx, not to mention the added overhead of porting over the tools I use (notably aquamacs), has led to an enormous decline in my day to day development activity and a corresponding rise in using email and other management tools. For years I'd advocated to others that if they are going to develop on linux, for any platform, then they should eat, sleep, and breathe linux to do so, and I've hurt my day to day productivity by trying otherwise, only counterbalanced by the fact that I can try for longer (like a 10 hr airline flight). It turns out I use absolutely no native osx apps that don't run on linux; although things like garageband had some initial appeal, ardour4 proved better. So the only defenses I have for that laptop are the lightness, keyboard, and battery life. It also serves as a constant reminder of how limited other OSes are and the uphill battle on what needs to happen for getting universal fixes on everything. I have two other linux laptops, both broken. On one, the ethernet is fried, on the other, the X11 gui environment got so messed up that I can no longer log in - so both have ended up in the testbed for use as fq_codel development targets rather than directly in front of me. I have a chromebook, but my attempt to get a real linux on it ended in disaster. 
> Dave, I know you're a grumpy old man, but the Debian folks have done some > remarkable work on cross-compilation, on multiarch, chroots and emulation. Yes they have! It is quite amazing how arm got its act together, including and especially all the integration work linaro did. I have a long story on all the work I did on arm architecture long before armhf became popular, and the mess that that was, all the way back to 1998 and handhelds.org, the disaster that was the ep9302 FPU, the long slow EABI changeover that was obsoleted almost overnight by the armhf work the raspbian folk did, and so on. I do plan to try and reform on this upcoming trip - bringing an air, and reinstalling that busted laptop from scratch - but even then the trackpad never worked worth a darn. If I don't manage to reform, I'll also have an odroid c2 and beaglebone with me that both support native compilation. > (I wonder why they still insist that we use the morass of complexity > called Debian-installer. It is so much easier to run debootstrap, generate > a root filesystem, tweak the root filesystem until you're happy, and then > copy it over to the target and be done with it.) > > -- Juliusz -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org
Re: [Babel-users] Cross-compiling to armhf [was: beaglebone green wireless boards...]
On Wed, Jun 22, 2016 at 4:31 AM, Juliusz Chroboczek wrote: >> The preinstalled OS has sufficient compiler and onboard flash space to >> build a current babeld from git, and I'm happy to report IPV6_SUBTREES >> is compiled in by default. > > Dave, > > It's not the first time that I notice with wonder that you're compiling on > the devel boards. Are you aware that cross-compiling babeld to armhf is > so easy it's not even funny? > > sudo apt-get install gcc-arm-linux-gnueabihf > make CC=arm-linux-gnueabihf-gcc I ended up writing a long rant about this that I will blog one day... but my short answer to both your suggestions that I cross compile or install a docker: "You kids, get off my lawn!" :) I have a tendency to need to compile things vastly more complex than babel, often more bleeding edge than what is supplied in a repo, and *knowing* that an apt-get build-dep something; then checking it out from git head, will actually work with minimal effort, is a joy. The latest generation of hackerboards are actually "real computers", because they have a working, on-board compiler and full debian (and android) support. They would be even more real if the Xwindow drivers worked worth a damn and I could hook up a keyboard, or if a variety of more obscure languages (:cough: "go", "rust") actually worked, also. I would love to one day soon be back in a world where I only had to compile stuff for one architecture, and could spend more time writing code rather than dealing with ABI differences. I am impressed with the java port... Although I don't care for java much, it would be nice to carry these new protocols into android somehow. > Shncpd is a little bit trickier, since it depends on libbsd. 
I think I'll > remove the dependency before relase, but in the meantime you may either > build yourself an armhf libbsd, or install libbsd0:armhf on your system > (which requires setting up a multiarch environment), or set up > a cross-compilation chroot, or simply copy libbsd.so from the target system. Compiling natively, I don't have to think about that. (does a working cross compiler exist for the aarch64 in the c2?) -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] rib out of sync
I stepped back to 1.6.3, and the kernel and babel RIBs stayed correct no matter how many times I brought the usb0 interface up or down while keeping the wifi alive. By itself this does not prove much (because, as best I recall, source-specific routing is not in 1.6).

In terms of bisecting between babel 1.6.3 and git head, what should I try next? 1.7.1?
[Babel-users] beaglebone green wireless boards now available
I just got two of 'em and getting usbnet up was a snap. I got 'em because they have dual 2.4GHz 802.11n antennas and I figured the wifi would be faster than the getchip stuff. (There is no adhoc support. Another reason for looking at this board is to look at the structure of the drivers for make-wifi-fast.)

I have long liked the beaglebones as being a well built product, with some special features, like the onboard PRUs, that nothing else can match. The cpu is getting a bit long in the tooth though, and these wireless ones (no ethernet!) are so new that cases don't exist for them yet.

https://www.amazon.com/Seeedstudio-BeagleBone-Green-Wireless-Bluetooth/dp/B01GKE8F10/ref=sr_1_1?ie=UTF8=1466552623=8-1=beaglebone+green+wireless

They boot up (pretty fast) with Debian jessie, kernel 4.4.9-ti-r25, on the onboard 4GB eMMC flash chip. I was unaware until this moment that Debian jessie appears to be shipping babeld 1.5.1.

The preinstalled OS has a sufficient compiler and enough onboard flash space to build a current babeld from git, and I'm happy to report IPV6_SUBTREES is compiled in by default.

As for whether or not I'll end up going through the same hell I'm going through elsewhere, too soon to tell.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] Babel-users Digest, Vol 103, Issue 12
As for "interesting items on the agenda", there is a wide variety of things that appeal to *someone*, if you browse the agenda and working groups available. If you plan to attend all week, the Sunday newcomers' orientation is quite helpful.

In my case, for example, I am very interested in the "ACM, IRTF & ISOC Applied Networking Research Workshop 2016" on Saturday. Sunday is "Emerging Work in IEEE 802" (one of my big interests in layer 3 protocols is getting more link layers to work right). Then there is dtn, 6lo, 6tisch, homenet, tsvarea, nmlrg, lpwan, detnet, bier, and a couple of others, and that's just Monday. If I could clone myself 3 times, I could fit all that in...

The rest of the week is lighter (for me), but then there's dnssd, homenet, babel, bgp, iccrg, etc, etc... and despite all that I mostly find myself in a hallway, talking there.

...

This year, I decided that the smartest thing I can do is bring a giant thermos of coffee, because they have a tendency to run out shortly before I need it.
Re: [Babel-users] Babel-users Digest, Vol 103, Issue 12
On Tue, Jun 21, 2016 at 12:47 PM, Jehan Trembackwrote: > I might be interested in participating in IETF 96. Are there more details > about how much it costs etc.? How much of the event will be Babel-related? About 1/1th. https://datatracker.ietf.org/meeting/96/agenda.html Cost for non-students is 700 dollars, there are also one day passes, and student passes. https://www.ietf.org/meeting/register.html > -Jehan > > On Sun, Jun 19, 2016 at 5:00 AM, > wrote: >> >> Send Babel-users mailing list submissions to >> babel-users@lists.alioth.debian.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> >> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users >> or, via email, send a message with subject or body 'help' to >> babel-users-requ...@lists.alioth.debian.org >> >> You can reach the person managing the list at >> babel-users-ow...@lists.alioth.debian.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of Babel-users digest..." >> >> Today's Topics: >> >>1. Re: [babel] WG Action: Formed Babel routing protocol (babel) >> (Juliusz Chroboczek) >> >> >> -- Forwarded message -- >> From: Juliusz Chroboczek >> To: ba...@ietf.org >> Cc: babel-users@lists.alioth.debian.org >> Date: Sun, 19 Jun 2016 13:17:59 +0200 >> Subject: Re: [Babel-users] [babel] WG Action: Formed Babel routing >> protocol (babel) >> The IESG says: >> >> > A new IETF WG has been formed in the Routing Area. >> >> > Charter: https://datatracker.ietf.org/doc/charter-ietf-babel/ >> >> Hourra! >> >> Many thanks to everyone who helped with making this happen. (Too many >> people to fit in the margin of this mail, so I'll just single out the >> fine-tuning work that Alia did in the final phases.) 
>> >> Pro memoria, IETF 96 is from 17 to 22 July in Berlin, Germany (from >> Warsaw, just follow the A2 highway westwards for 550km, or take the train >> at central station; from Paris, hop into the first night train at Gare de >> l'Est -- no need to suffer through airport security). The inscription >> fees are high, but there are very reasonable student rates, and Berlin is >> full of cheap hotels, many of them close to the Hilton. (And if you've >> never been to Berlin -- it's worth visiting.) >> >> I'll send a reminder to both lists just before the meeting, with pointers >> to the online participation URLs. >> >> Hope to see you all there, >> >> -- Juliusz >> >> >> >> ___ >> Babel-users mailing list >> Babel-users@lists.alioth.debian.org >> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users > > > > ___ > Babel-users mailing list > Babel-users@lists.alioth.debian.org > http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
[Babel-users] rib out of sync
I have definitively proven that on at least one of the arm boxes we are using (the getchip) there is either a new babel bug in git, and/or an arm-related issue with forming the netlink message, or a kernel bug, or excessive gamma radiation in the atmosphere.

I am going to step back to 1.6.1 and see what happens there, and also drop the arm boxes out of the equation and hammer on x86, and reduce or increase the number of routes babel is carrying.

I managed to duplicate all the weird behaviors I've seen over the past several months by buckling down and simplifying things and just looking at ipv6, while repeatedly failing and restoring an interface (ifconfig down/up), and rebooting a lot... In particular, seeing the asymmetric routing scenario take place, repeatably, gave much insight.

...

A feature request for babel one day is: to have a "Verify that the kernel rib matches babel's rib" option.

...

Dumping the babel routes (I do have a ton of routes, about 60 installed in a "working" kernel route table), and selecting 2 for the sake of example:

fd99::13/128 from ::/0 metric 352 (357) refmetric 256 id 02:0d:b9:ff:fe:41:6c:2c seqno 60577 chan (255) age 16 via usb0 neigh fe80::f0ba:5eff:feca:7f4e (installed)
fd99::13/128 from ::/0 metric 494 (525) refmetric 96 id 02:0d:b9:ff:fe:41:6c:2c seqno 60577 chan (36) age 7 via wlan0 neigh fe80::100d:7fff:fe64:c990 (feasible)
^C
root@chipper:~# ip -6 route | grep usb
fe80::/64 dev usb0 proto kernel metric 256

Nothing else. That was last night. Same box, this morning...

root@chipper:~# ip -6 route | grep usb
fd99::4 via fe80::f0ba:5eff:feca:7f4e dev usb0 proto babel metric 1024
fd9f:237b:c8a6::1d1 via fe80::f0ba:5eff:feca:7f4e dev usb0 proto babel metric 1024
fd9f:237b:c8a6:0:cca0:3c4a:60df:a69 via fe80::f0ba:5eff:feca:7f4e dev usb0 proto babel metric 1024
fd9f:237b:c8a6::/48 via fe80::f0ba:5eff:feca:7f4e dev usb0 proto babel metric 1024
fe80::/64 dev usb0 proto kernel metric 256

(out of about 30 more that babel's rib claims are on this interface)

In terms of other issues, all in one log...

A) As for why all these routes are stored unreachable in this sample, dunno.

B) refmetric 0?

::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 65535 (65535) refmetric 65535 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 chan (6,36) age 4 via wlan0 neigh fe80::7ec7:9ff:fede:2bb5 (installed)
::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 640 (645) refmetric 256 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 chan (36) age 4 via wlan0 neigh fe80::100d:7fff:fe64:c990 (feasible)
::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 96 (101) refmetric 0 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 age 6 via usb0 neigh fe80::f0ba:5eff:feca:7f4e (feasible)
::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 65535 (65535) refmetric 65535 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 chan (6,36) age 5 via wlan0 neigh fe80::7ec7:9ff:fede:2bb5 (installed)
::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 640 (645) refmetric 256 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 chan (36) age 5 via wlan0 neigh fe80::100d:7fff:fe64:c990 (feasible)
::/0 from 2001:558:6045:bd:3977:7182:7772:8b2b/128 metric 96 (101) refmetric 0 id 16:cc:20:ff:fe:e5:64:c2 seqno 8764 age 7 via usb0 neigh fe80::f0ba:5eff:feca:7f4e (feasible)

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
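The "verify that the kernel rib matches babel's rib" idea can be prototyped outside the daemon. What follows is only a rough sketch in Python, not anything babeld does today: it assumes the two text formats shown in this message (a babeld route dump line ending in "(installed)", and `ip -6 route` lines carrying "proto babel"), and the function names are made up for illustration. It ignores the source prefix and unreachable routes entirely.

```python
import re

# Assumed babeld dump line format, copied from the examples in this thread:
#   <prefix> from <src> metric ... via <ifname> neigh <addr> (<state>)
BABEL_RE = re.compile(
    r"^(?P<prefix>\S+) from (?P<src>\S+) metric .* via (?P<dev>\S+) "
    r"neigh (?P<nh>\S+) \((?P<state>[^)]*)\)\s*$"
)
# Assumed `ip -6 route` line format for a route installed with proto babel.
KERNEL_RE = re.compile(
    r"^(?P<prefix>\S+) via (?P<nh>\S+) dev (?P<dev>\S+) proto babel\b"
)


def normalize(prefix):
    """`ip` prints host routes without /128; make both sides comparable."""
    return prefix if "/" in prefix else prefix + "/128"


def babel_installed(dump_lines):
    """Set of (prefix, nexthop, dev) that the dump marks as installed."""
    routes = set()
    for line in dump_lines:
        m = BABEL_RE.match(line.strip())
        if m and m.group("state") == "installed":
            routes.add((normalize(m.group("prefix")), m.group("nh"), m.group("dev")))
    return routes


def kernel_babel_routes(ip_route_lines):
    """Set of (prefix, nexthop, dev) the kernel holds with proto babel."""
    routes = set()
    for line in ip_route_lines:
        m = KERNEL_RE.match(line.strip())
        if m:
            routes.add((normalize(m.group("prefix")), m.group("nh"), m.group("dev")))
    return routes


def rib_diff(dump_lines, ip_route_lines):
    """Return (missing_from_kernel, unknown_to_babel) as sorted lists."""
    babel = babel_installed(dump_lines)
    kernel = kernel_babel_routes(ip_route_lines)
    return sorted(babel - kernel), sorted(kernel - babel)
```

Fed the fd99::13 dump lines and last night's `ip -6 route | grep usb` output from above, the first list would contain the usb0 route that babel thinks is installed and the kernel does not have.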
[Babel-users] unicast IHU patch works... but
Tested against stock daemons running 1.7.1, 1.6, and all patches to 1.8pre to date, on an admittedly too complex network, over ethernet, usbnet, and wifi in ap/sta mode (not adhoc yet).

I have not entirely got around to testing the main circumstances I wanted to look at again (powersave, asymmetric failover), being blocked on other factors... and I am still groping to explain the possibly interrelated weird behaviors I've been seeing all month...

I am sitting here (and NOT running my attempt at an atomic route change - just the IHU wire change) after several ifconfig usb0 up and down events, watching ipv6 failover to the usbnet device generate errors like this (while ipv4 works correctly):

kernel_route(ADD): Invalid argument
kernel_route(ADD): Invalid argument
kernel_route(MODIFY): Invalid argument
kernel_route(ADD): Invalid argument
kernel_route(ADD): Invalid argument
kernel_route(MODIFY): Invalid argument
kernel_route(ADD): Invalid argument
kernel_route(ADD): Invalid argument
kernel_route(ADD): Invalid argument

So I think we have a bug in ipv6 routing on kernel 4.3.0 in the getchip, at least.

All these should have been installed with proto "babel":

fd99::13 via fe80::100d:7fff:fe64:c990 dev wlan0 proto static metric 1024
fd99::14 via fe80::100d:7fff:fe64:c990 dev wlan0 proto static metric 1024
fd99::23 via fe80::7ec7:9ff:fede:2bb5 dev wlan0 proto babel metric 1024

I will add a check to see if we are *always* using the right proto, and (now that I learned how) start monitoring netlink messages a bit better.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
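Until such a check exists in the daemon, the wrong-proto routes can be spotted from the outside. This is a small Python sketch that just groups `ip -6 route` output lines by whatever follows the "proto" keyword; the function name is hypothetical, and it assumes the first field of each line is the destination (which is not true for lines starting with "unreachable").

```python
from collections import defaultdict


def routes_by_proto(ip_route_lines):
    """Group destinations from `ip -6 route` output by their proto field."""
    groups = defaultdict(list)
    for line in ip_route_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        proto = "(none)"
        # Only accept "proto" if a value actually follows it.
        if "proto" in fields[:-1]:
            proto = fields[fields.index("proto") + 1]
        groups[proto].append(fields[0])
    return dict(groups)
```

On the three wlan0 routes above this would put fd99::13 and fd99::14 under "static" and only fd99::23 under "babel", which is the mismatch in question.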
Re: [Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates
On Thu, Jun 16, 2016 at 1:40 PM, Kirill Smelkov <k...@nexedi.com> wrote: > On Thu, Jun 16, 2016 at 08:38:49AM -0700, Dave Taht wrote: >> On Thu, Jun 16, 2016 at 4:17 AM, Kirill Smelkov <k...@nexedi.com> wrote: >> > On Wed, Jun 15, 2016 at 12:56:34PM +0200, Juliusz Chroboczek wrote: >> >> >> If I read you correctly, this looks like a kernel bug: incorrect >> >> >> invalidation of the route cache. >> >> >> >> [...] >> >> >> >> > What we have here is of another kind - it is inherent race condition >> >> > inside kernel >> >> >> >> Perhaps I'm confused, but it still looks like a kernel bug to me. >> > >> > Yes, it is a kernel bug. But in a sense it is so old and so widespread >> > that it has to be cared about in userspace - as with atomic route >> > updates we do not hit it. >> > >> > Also: atomic route updates are needed not only for avoiding this bug. >> > Another reason is: if we have routedel & routeadd pair, even after >> > routeadd the state of cache is correct, in the time between del & add, >> > if a packet destined to that route gets to the node, it hits >> > 'unreachable' route case. >> > >> > For usual packets it is only "packet lost" and TCP probably retransmits. >> > But for SYN packets, e.g. when a connection is going to be established, >> > ICMP error is returned which results in "host unreachable" error on >> > originator side. >> >> Yes this variant of the bug is still there, essentially, and it bugs me. >> >> (btw the facebook page you pointed to fixes they did was fascinating - >> they have "interesting problems" - like dealing with 1+m routes in >> their route table) >> >> one day a year, for several years now, I get sufficiently irked about >> the atomic update problem in babel to refresh my knowledge of netlink, >> hack babel all to hell, and have nothing work. I left myself a bunch >> more breadcrumbs last night in my hacked up babel version, as to what >> I tried and what it did wrong... 
(because I'm actually also chasing >> another bug which I'll put up in another message) >> >> But: >> >> Why doing the equivalent of this (and understanding how it does it) >> >> ip -6 route add fd99::33/128 via fe80::120d:7fff:fe64:c992 dev eno1 >> ip -6 route replace fd99::33/128 via fe80::120d:7fff:fe64:c991 dev wlp2s0 >> >> is so hard for me to figure out - that I don't understand. But it >> seems to require completely tracing through the ip route code, and >> writing a decoder for the netlink packets created, to figure out why >> what I thought would be an equivalent for babel, and taking the week >> or more to do it... >> >> -- look! Squirrel! > > Dave, maybe this might help you: Wireshark (not tcpdump) has decoder for > netlink route packets: > > https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob;f=epan/dissectors/packet-netlink-route.c;hb=v2.1.1rc0-170-gc269684 Groovy. Thank you. I did not know. In discussing this with shemminger this morning, he pointed out there was a semantic difference between how routes can be replaced in ipv6 and ipv4. At *one point* last night I thought I'd successfully got ipv6 to atomic replace, but it had failed on ipv4 - so I will revisit the work soon, brain cells and time willing. > so you can create a virtual netlink monitor interface - something along > the lines of > > modprobe nlmon > ip link add type nlmon > ip link set nlmon0 up > > ( see more details in e.g. https://patchwork.ozlabs.org/patch/259444/ ) > > and see the actual packets exchanged between iproute and kernel. 
> > Also: there is pyroute2 (https://github.com/svinota/pyroute2) which has debug > decoder for netlink packets, but out of the box you have to specify packet > type > explicitly: > > https://github.com/svinota/pyroute2/blob/master/docs/debug.rst > > Maybe you already know all this, but I decided to provide info anyway to make > sure it is not missed, because you mentioned it is hard for you to understand > what is going on underneath `ip -6 ...` > > Hope this might help, > Kirill > > >> >> Perhaps it would make sense to speak to netdev about that? >> > >> > Yes, makes sense. Though as this particular case is not present on 4.2+ >> > kernels, people on netdev will probably has less interest to look into. >> > >> > I will see what can be done. >> > >> >> > Quagga, at least, switched to atomic updates some tim
[Babel-users] the costs of periodic disassociation in conventional ap/sta mode
In the new lab I ended up connecting up a bunch of machines in sta mode over wpa... (partially because adhoc was unavailable - and *mostly* because that's what normal homenet users would do, and lastly because it improved throughput by 50x in some cases)

... with bad results for babel behavior in general, whose impact I am gradually trying to reduce (as well as observing what happens to other protocols, tests, and daemons). It doesn't help that I'm also trying to make a major change in how wifi is queued underneath...

Anyway, to summarize two out of three and add a new one...

A) Powersave enabled caused stas to drop off the net by missing multicast. There were a few patches that went by on the kernel list recently that might have fixed a beacon offset problem, which I haven't tested. Good fixes for this problem include rigorously testing for it and fixing it on all the chipsets in the world, or having babel be aware it is in powersave mode and using a bit of unicast, or something, to keep itself alive.

B) Network Manager triggered scans were devastating[1], and even after locking to one bssid as suggested on the relevant thread:

http://blog.cerowrt.org/post/disabling_channel_scans/

[wifi]
bssid=04:F0:21:1F:36:E2
mac-address-blacklist=
mac-address-randomization=0
mode=infrastructure
seen-bssids=04:F0:21:1F:36:E2;
ssid=FQCODEL

I still had some trouble with babel, which I think I managed to see in the small with

C) Even after putting this in place (which definitely makes things better), on another machine I am periodically disassociating on another interval for no reason I can discern (yet) - and since last night, I was logging babel's behavior thoroughly... this is what that does to babel.

I am pretty sure that events like this trigger a bit more routing traffic and jitter of babel's states than is desirable (and maybe not enough, since the link is essentially down on both sides), but I did not capture the traffic in these cases, and for all I know there are daemon or kernel mods that can make dropping out of the multicast group, losing the local addresses, forgetting the channel, and then coming back online a little less hard on everything.

...

Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
send: Cannot assign requested address
Interface wlp2s0 has no link-local address.
Couldn't determine channel of interface wlp2s0: Invalid argument.
send: Cannot assign requested address

[1] Saying that friends don't let friends use network manager is not good enough. Documenting how to work around it, or thoroughly fixing network manager - or the device drivers - would be better.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates
On Thu, Jun 16, 2016 at 4:17 AM, Kirill Smelkov <k...@nexedi.com> wrote: > On Wed, Jun 15, 2016 at 12:56:34PM +0200, Juliusz Chroboczek wrote: >> >> If I read you correctly, this looks like a kernel bug: incorrect >> >> invalidation of the route cache. >> >> [...] >> >> > What we have here is of another kind - it is inherent race condition >> > inside kernel >> >> Perhaps I'm confused, but it still looks like a kernel bug to me. > > Yes, it is a kernel bug. But in a sense it is so old and so widespread > that it has to be cared about in userspace - as with atomic route > updates we do not hit it. > > Also: atomic route updates are needed not only for avoiding this bug. > Another reason is: if we have routedel & routeadd pair, even after > routeadd the state of cache is correct, in the time between del & add, > if a packet destined to that route gets to the node, it hits > 'unreachable' route case. > > For usual packets it is only "packet lost" and TCP probably retransmits. > But for SYN packets, e.g. when a connection is going to be established, > ICMP error is returned which results in "host unreachable" error on > originator side. Yes this variant of the bug is still there, essentially, and it bugs me. (btw the facebook page you pointed to fixes they did was fascinating - they have "interesting problems" - like dealing with 1+m routes in their route table) one day a year, for several years now, I get sufficiently irked about the atomic update problem in babel to refresh my knowledge of netlink, hack babel all to hell, and have nothing work. I left myself a bunch more breadcrumbs last night in my hacked up babel version, as to what I tried and what it did wrong... 
(because I'm actually also chasing another bug which I'll put up in another message) But: Why doing the equivalent of this (and understanding how it does it) ip -6 route add fd99::33/128 via fe80::120d:7fff:fe64:c992 dev eno1 ip -6 route replace fd99::33/128 via fe80::120d:7fff:fe64:c991 dev wlp2s0 is so hard for me to figure out - that I don't understand. But it seems to require completely tracing through the ip route code, and writing a decoder for the netlink packets created, to figure out why what I thought would be an equivalent for babel, and taking the week or more to do it... -- look! Squirrel! > >> Perhaps it would make sense to speak to netdev about that? > > Yes, makes sense. Though as this particular case is not present on 4.2+ > kernels, people on netdev will probably has less interest to look into. > > I will see what can be done. > >> > Quagga, at least, switched to atomic updates some time ago, I think. >> > >> > http://patchwork.quagga.net/patch/1234/ >> >> I see. I'm busy right now, but I'll be grateful for a patch. > > I see about this. Thanks for feedback. > > > On Wed, Jun 15, 2016 at 07:35:05PM -0700, Dave Taht wrote: >> > https://lab.nexedi.com/kirr/iproute2/blob/bd480e66/t/rtcache-torture >> > (also attached to this email) >> > >> > which reproduces the problem in several minutes just on one computer and >> > retested it locally: I can reliably reproduce the issue on pristine >> > Debian 3.16.7-ckt25-2 (on both Atom and Core2 notebooks) and on pristine >> > 3.16.35 on Atom (compiled by me, since Debian kernel team has not yet >> > uploaded 3.16.35 to Jessie). >> >> I have been running this script on four different machines for hours >> now without reproducing your bug on the 4.4 or later kernels. It does >> trigger on a 3.14 kernel. (it helps to do a killall fping6 before >> exiting!) >> >> It does not seem to be happening on 4.4 or later. 
At one level, I'm >> relieved - one last babel bug to worry about in openwrt (now 4.4 >> based), although one of the platforms I work on is still stuck at >> 3.18, as is the 3.14 c2 (for now). >> >> At another level I still really, really, really wanted atomic updates >> in babel, and was clearing the decks to make a run at the right >> netlink stuff when I'd decided to confirm your bug existed or not in >> my kernels. :(. Weirdly demotivating. >> >> >> d@dancer:~/bin$ ssh root@pi3 uname -a >> Linux pi3 4.4.12-v7+ #892 SMP Thu Jun 2 15:41:19 BST 2016 armv7l GNU/Linux >> d@dancer:~/bin$ ssh root@pi2 uname -a >> Linux pi2 4.4.12-v7+ #892 SMP Thu Jun 2 15:41:19 BST 2016 armv7l GNU/Linux >> d@dancer:~/bin$ uname -a >> Linux dancer 4.5.0-rc7-fqfi #1 SMP PREEMPT Mon Mar 7 16:04:17 PST 2016 >> x86_64 x86_64 x86_64 GNU/Linux >> >> ... >> >> The odroid C2 has the bug. >> >&
Re: [Babel-users] Golang implementation
Aha. This seems further along.

https://github.com/sh3rp/gabel

On Wed, Jun 15, 2016 at 7:40 PM, Dave Taht <dave.t...@gmail.com> wrote:
> If this is it?
>
> https://github.com/casarez/gabel
>
> It looks like a bit more work is required to get to a decent implementation.
>
> On Wed, Jun 15, 2016 at 5:18 PM, Jehan Tremback
> <jehan.tremb...@gmail.com> wrote:
>> Someone posted in April that they were working on a Golang implementation of
>> Babel. Does anyone know where the code for that is located?
>>
>> -Jehan

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] Golang implementation
Is this it?

https://github.com/casarez/gabel

It looks like a bit more work is required to get to a decent implementation.

On Wed, Jun 15, 2016 at 5:18 PM, Jehan Tremback wrote:
> Someone posted in April that they were working on a Golang implementation of
> Babel. Does anyone know where the code for that is located?
>
> -Jehan

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates
> https://lab.nexedi.com/kirr/iproute2/blob/bd480e66/t/rtcache-torture > (also attached to this email) > > which reproduces the problem in several minutes just on one computer and > retested it locally: I can reliably reproduce the issue on pristine > Debian 3.16.7-ckt25-2 (on both Atom and Core2 notebooks) and on pristine > 3.16.35 on Atom (compiled by me, since Debian kernel team has not yet > uploaded 3.16.35 to Jessie). I have been running this script on four different machines for hours now without reproducing your bug on the 4.4 or later kernels. It does trigger on a 3.14 kernel. (it helps to do a killall fping6 before exiting!) It does not seem to be happening on 4.4 or later. At one level, I'm relieved - one last babel bug to worry about in openwrt (now 4.4 based), although one of the platforms I work on is still stuck at 3.18, as is the 3.14 c2 (for now). At another level I still really, really, really wanted atomic updates in babel, and was clearing the decks to make a run at the right netlink stuff when I'd decided to confirm your bug existed or not in my kernels. :(. Weirdly demotivating. d@dancer:~/bin$ ssh root@pi3 uname -a Linux pi3 4.4.12-v7+ #892 SMP Thu Jun 2 15:41:19 BST 2016 armv7l GNU/Linux d@dancer:~/bin$ ssh root@pi2 uname -a Linux pi2 4.4.12-v7+ #892 SMP Thu Jun 2 15:41:19 BST 2016 armv7l GNU/Linux d@dancer:~/bin$ uname -a Linux dancer 4.5.0-rc7-fqfi #1 SMP PREEMPT Mon Mar 7 16:04:17 PST 2016 x86_64 x86_64 x86_64 GNU/Linux ... The odroid C2 has the bug. 
d@dancer:~/bin$ ssh root@c2 uname -a Linux c2 3.14.29-56 #1 SMP PREEMPT Wed Apr 20 12:15:54 BRT 2016 aarch64 aarch64 aarch64 GNU/Linux BUG: Got unexpected unreachable route for 2226:::::1: # I'd changed the number unreachable 2226:::::1 from :: dev lo src fd99::2 metric 0 \cache error -101 route table for root 2226::::/48 8< unicast 2226:::::/64 dev dum0 proto boot scope global metric 1024 unreachable 2226::::/48 dev lo proto boot scope global metric 1024 error -101 8< route for 2226:::::1 (once again) unreachable 2226:::::1 from :: dev lo src fd99::2 metric 0 \cache error -101 users 1 used 3 > > It is always the same: the issue reproduces reliably in several minutes. > And it looks like e.g. > > - 8< > root@mini:/home/kirr/src/tools/net/iproute2/t# time ./rtcache-torture > PING :::::1(:::::1) 56 data bytes > E.E.E.E..E..EE...E.. > > > BUG: Linux mini 3.16.35-mini64 #14 SMP PREEMPT Sun Jun 12 19:41:09 MSK > 2016 x86_64 GNU/Linux > BUG: Got unexpected unreachable route for :::::1: > unreachable :::::1 from :: dev lo src > 2001:67c:1254:20::1 metric 0 \cache error -101 > > route table for root ::::/48 > 8< > unicast :::::/64 dev dum0 proto boot scope global > metric 1024 > unreachable ::::/48 dev lo proto boot scope global metric > 1024 error -101 > 8< > > route for :::::1 (once again) > unreachable :::::1 from :: dev lo src > 2001:67c:1254:20::1 metric 0 \cache error -101 users 1 used 4 > > real0m49.938s > user0m4.488s > sys 0m5.872s > 8< > > The issue should not show itself with kernels >= 4.2, because there the > lookup procedure does not take table lock twice, and /128 cache entries > are not routinely created (they are created only upon PMTU exception). > > I'm running Debian testing on my development machine. Currently it has > 4.5.5-1 (2016-05-29). I can confirm that /128 route cache entries are > not created there just because a route was looked up. 
> > Kirill > > > 8< (rtcache-torture) > #!/bin/sh -e > # torture for IPv6 RT cache, trying to hit the race between lookup,cache-add > & route add > # http://lists.alioth.debian.org/pipermail/babel-users/2016-June/002547.html > > > tprefix=:: # "whole-network" prefix for tests /48 > tsubnet=$tprefix: # subnetwork for which "to" route will be changed > /64 > taddr=$tsubnet::1 # test address on $tsubnet > > # setup for tests: > > # dum0 dummy device > ip link del dev dum0 2>/dev/null || : > ip link add dum0 type dummy > ip link set up dev dum0 > > # clean route table for tprefix with only unreachable whole-network route > ip -6 route flush root $tprefix::/48 > ip -6 route add unreachable $tprefix::/48 > ip -6 route flush cache > > ip -6 route add $tsubnet::/64 dev dum0 > > > # put a lot of requests to rt/rtcache getting route to $taddr > trap 'kill $(jobs -p)' EXIT > rtgetter() { > # NOTE we cannot do this with `ip route get ...` in a loop, as `ip route > # get` first takes RTNL lock, and thus will be completely serialized with > # e.g. route add and del. > # >
Re: [Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates
On Fri, Jun 10, 2016 at 11:47 AM, Juliusz Chroboczek wrote:
> Dear Kirill,
>
> Thank you very much for the detailed analysis.
>
> If I read you correctly, this looks like a kernel bug: incorrect
> invalidation of the route cache. While we have seen some similar bugs in
> earlier kernel versions, they were not triggered by something that
> simple -- you needed to do some non-trivial rule manipulation in order to
> trigger them.
>
> What is more -- I believe that babeld is using the same procedure as
> Quagga and Bird. Do you understand why Quagga and Bird are not seeing the
> same issues ?

Quagga, at least, switched to atomic updates some time ago, I think.

http://patchwork.quagga.net/patch/1234/

> While I have no objection to switching to a different API for manipulating
> routes, I'd like to first make sure that we understand what's going on here.

I strongly approve of atomic updates and of fixing whatever, if anything, that breaks...

I have seen oddities in unreachable p2p routes for years now. I've suspected a variety of causes - notably getting an icmp route unreachable before babel could make the switch - but have never tracked it down. Some of the work I'm doing now could be leveraged to try and make it happen more often, but a few more pieces on top of this

https://www.mail-archive.com/netdev@vger.kernel.org/msg114172.html

need to land before I can propagate all the right pieces to the testbed.

> Oh -- and are you running a stock kernel, or one locally patched? Can you
> reproduce the issue on a pristine, recent kernel?
>
> Thanks again for your help,
>
> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
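For reference, at the netlink level the difference between "delete then add" and an atomic replace is a pair of flags on a single RTM_NEWROUTE request. The Python sketch below only builds the 16-byte nlmsghdr to show those flags; the constants are the ones from linux/netlink.h and linux/rtnetlink.h, the flag table mirrors what iproute2 uses for its `ip route add/change/replace` subcommands, and the helper name is invented. The rtmsg payload and attributes are left out, and this is not a description of what babeld, Quagga or Bird actually send.

```python
import struct

# Constants from <linux/netlink.h> and <linux/rtnetlink.h>.
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
RTM_NEWROUTE = 24

# Flag combinations iproute2 uses for its route subcommands.
IP_ROUTE_FLAGS = {
    "add": NLM_F_CREATE | NLM_F_EXCL,         # fail if the route exists
    "change": NLM_F_REPLACE,                  # fail if it does not exist
    "replace": NLM_F_CREATE | NLM_F_REPLACE,  # create or overwrite in one step
}

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSGHDR = struct.Struct("=IHHII")


def newroute_header(op, payload_len, seq=1):
    """Build only the netlink header of an RTM_NEWROUTE request (illustrative)."""
    # NLM_F_ACK asks the kernel to report success or failure explicitly.
    flags = NLM_F_REQUEST | NLM_F_ACK | IP_ROUTE_FLAGS[op]
    return NLMSGHDR.pack(NLMSGHDR.size + payload_len, RTM_NEWROUTE, flags, seq, 0)
```

The point of the table is that "replace" never goes through a state where the destination has no route, whereas a DELROUTE followed by a NEWROUTE with the "add" flags does.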
Re: [Babel-users] Babel in Bird 1.6.0
I can confirm that Toke's current set of fixes compiles on an rpi3, AND that I am too stupid to figure out how to create a correct, basic, babeld .conf file for bird.

On Sat, Apr 30, 2016 at 10:39 AM, Toke Høiland-Jørgensen wrote:
> Juliusz Chroboczek writes:
>
>>> Okay, actually trying to put this into code: Is the intention here that
>>> a null-router ID update is acceptable only on *wildcard* retractions or
>>> on *all* retractions?
>>
>> In RFC 6126, there's nothing special about a null router-ID: it's just
>> a router ID.
>
> I didn't actually mean a 'null router ID'. I meant an *unset* router ID.
> I.e., if flag 0x40 is set or the update is preceded by a router ID TLV,
> the router ID is *set*. It may or may not be set to all-zeroes, but that
> is orthogonal. So I was referring to the text stating "the current
> router-id and seqno is not used" - does that refer to all retractions or
> just wildcard ones?
>
> (I suspect the answer to be the former, and that the fact that this
> poses problems is an artifact of the current update handling flow in the
> Bird code; but want to be sure before I change it).
>
>> However, for AEs 0 and 1, the address is too short to carry a router-ID
>> (it's 0 and 4 octets respectively, while a router-ID is 8 octets). The
>> intention was that a shorter address should be stored in the right side of
>> a router-ID, and padded with zeroes; e.g. the IPv4 address (AE 1) 1.2.3.4
>> maps to the router-ID 0:0:0:0:1:2:3:4, and the zero-length address (AE 0)
>> maps to 0:0:0:0:0:0:0:0. However, I don't think this is spelled out in
>> RFC 6126.
>
> Well, I've always thought about 0x40 as specifying that the router ID be
> the 64 bits from the address that is semantically encoded by the TLV,
> not the literal bytes in the TLV itself. I.e. Bird does this:
>
>   if (tlv->flags & BABEL_FLAG_ROUTER_ID)
>   {
>     state->router_id = ((u64) _I2(msg->prefix)) << 32 | _I3(msg->prefix);
>     state->router_id_seen = 1;
>   }
>
> where msg is the internal data structure containing the parsed values.
>
> This means there's no problem in combining flag 0x40 with AE 0; but for
> IPv4 addresses it needs to be specified whether the addresses should be
> padded to the right or the left.
>
>> So my current thinking is:
>>
>> - if a Babel speaker receives an update with AE 0 or 1 and bit 0x40 set,
>>   it MUST set the router-ID to the address in the update, right justified
>>   and padded with zeroes;
>
> Yes, this seems reasonable, and should go into a -bis I guess.
>
>> - a Babel speaker SHOULD NOT set bit 0x40 in updates with AE 0 or 1,
>>   lest the author meet the wrath of Markus.
>
> This is "for the time being", or? Surely Markus can be appeased by the
> time a new draft is written? ;)
>
> -Toke

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
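For readers following along, here is a minimal sketch of the padding rule Juliusz proposes in the quoted text (AE 0 or 1 with bit 0x40 set: address right-justified in the 8-octet router-ID, zero-padded on the left). It is not taken from babeld or Bird; the function names `router_id_from_address` and `format_router_id` are made up for illustration, and the colon-separated output format simply mimics the 0:0:0:0:1:2:3:4 notation used in the thread.

```python
import ipaddress

ROUTER_ID_LEN = 8  # a Babel router-ID is 8 octets

# Address lengths for the two short encodings discussed in the thread:
# AE 0 (wildcard) carries 0 octets, AE 1 (IPv4) carries 4 octets.
SHORT_AE_LENGTHS = {0: 0, 1: 4}


def router_id_from_address(ae: int, address: bytes) -> bytes:
    """Right-justify a short address in a router-ID, zero-padding on the left.

    Hypothetical helper: only models the AE 0 / AE 1 case from the thread.
    """
    if ae not in SHORT_AE_LENGTHS:
        raise ValueError(f"this sketch only handles AE 0 and 1, got {ae}")
    if len(address) != SHORT_AE_LENGTHS[ae]:
        raise ValueError(f"AE {ae} expects {SHORT_AE_LENGTHS[ae]} octets")
    return address.rjust(ROUTER_ID_LEN, b"\x00")


def format_router_id(router_id: bytes) -> str:
    """Render a router-ID as colon-separated octets (hex, no zero padding)."""
    return ":".join(format(octet, "x") for octet in router_id)


ipv4 = ipaddress.IPv4Address("1.2.3.4").packed
print(format_router_id(router_id_from_address(1, ipv4)))
print(format_router_id(router_id_from_address(0, b"")))
```

Padding on the left keeps the address in the low-order octets, which is the "right justified" reading in the quoted proposal; padding on the right would put 1.2.3.4 in the high-order half instead, which is the ambiguity Toke points out.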
Re: [Babel-users] Babel in Bird 1.6.0 (documentation)
I would like to see bird itself grow a notion of time finer than 1 second.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] Multicast IHUs [was: perverse powersave bug with sta/ap mode]
On Thu, Apr 28, 2016 at 10:10 AM, Juliusz Chroboczek wrote:
>> 4) And ya know - it might merely be a (sadly common) bug. Everybody's
>> supposed to wake up for the multicast beacons and get a notification
>> there's more data to come.
>
> Yes, it's obviously a bug. Just like you, I'm not surprised -- ad-hoc mode
> and power save is the kind of thing that's never tested.

No, this is the kind of thing that normal users of wifi use - AP/station mode being the most common mode of operation.

adhoc - rarely functional or tested

power save - VERY tested, by people that want to save major power, which is everybody running on battery, pulling out every trick (even dubious ones) to meet consumption goals (rather than network connectivity goals).

I do not know to what extent or where the problem I am seeing is actually happening. I can look at the multicast beacons harder to see what's going on.

Wifi powersave is not "go to sleep entirely", it is "please wake up on this schedule (250ms) so I can poke you with more unicast data if I have any". It also requires (in the spec) that the accumulated packets be buffered til that beacon, and multicast packets are supposed to be sent as CAB ("crap after beacon" in ath9k's documentation, "content after beacon" elsewhere).

The "buffering til you wake up" requirement is hell on trying to roll an airtime fairness scheduler, or codel, into portions of the stack.

Certainly many devices simply disassociate when they go to sleep nowadays.

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] Multicast IHUs [was: perverse powersave bug with sta/ap mode]
On Thu, Apr 28, 2016 at 10:10 AM, Juliusz Chroboczek wrote:
>> 1) Well, I have suggested that IHU messages actually be unicast rather
>> than bundled with the hello.
>
> Yes, you have suggested that before. I answered I would implement that if
> somebody volunteered to do an experimental evaluation. Nobody volunteered.

We do need to grow the community.

I am perpetually giving talks about wifi in front of various audiences. Making the comment:

"How many of you use wifi, hands up?"

always gets a laugh (from those paying attention). (Sometimes I'm sufficiently annoyed at those not paying attention to ask "those of you that are paying attention, hands up".)

"How many of you understand how it works?"

and nearly all the hands go down.

"Why is this not a problem?", and then I launch into the talk.

I guess it would be better to collect my (our) rants, problems, and arguments, tone them down, and get something about "wifi, the dominant paradigm" into more widely read publications than these mailing lists. The recent conference on wifi in DC had some data like "3 billion wifi devices shipped last year".

>> That would help somewhat in this case.
>
> That's my intuition too, but I've learned to be wary of my intuitions.
> Doing wireless stuff without careful evaluation is not something I'll do
> again.

Yep. Need more people on these problems. I promise to care more after we cut latencies under load on wifi by 2 orders of magnitude on 3 chipsets.

>> 2) A protocol that needs "always listening" capability could signal
>> the underlying stack to "make sure" these packets hit the air, and one
>> that also wants "please be lossy" capabl
>> I leave the actual implementation of that request to the fantasies of
>> the authors - a new dscp codepoint or three?
>> /me ducks
>
> No need to duck, Dave, it's very similar to what was done with UDP-Lite,
> where the use of a specific value in the protocol field signals the link
> layer not to discard corrupted frames. I've never seen it in the wild,
> I wonder why.

Hmm.. In babel's case, switching it to udp-lite would be like 1 line of code. Not that it would help (unless the "don't multicast this" code is explicitly filtering out normal udp only), and the flag day would be no fun, but certainly the basic properties of udplite aren't entirely unaligned...

I have done tests of udplite (I have a patch available for it for netperf if anyone wants it) and over ipv6, at least, it did seem to be quite routable over multiple hops. *link-local* udp-lite should "just work".

>> 4) And ya know - it might merely be a (sadly common) bug. Everybody's
>> supposed to wake up for the multicast beacons and get a notification
>> there's more data to come.
>
> Yes, it's obviously a bug. Just like you, I'm not surprised -- ad-hoc mode
> and power save is the kind of thing that's never tested. I suggest you
> disable power saving on all your nodes and be done with it.

That does not bode well for normal homenet users in the long run.

> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] [Make-wifi-fast] [Cerowrt-devel] perverse powersave bug with sta/ap mode
On Thu, Apr 28, 2016 at 9:05 AM, Henning Rogge wrote:
> On Thu, Apr 28, 2016 at 5:04 PM, moeller0 wrote:
>>
>>> On Apr 28, 2016, at 15:43 , Toke Høiland-Jørgensen wrote:
>>> Presumably the access point could transparently turn IP-level multicast
>>> into a unicast frame to each associated station? Not sure how that would
>>> work in an IBSS network, though... Does the driver (or mac80211 stack)
>>> maintain a list of neighbours at the mac/phy level?
>>
>> I believe the openwrt developers are thinking along similar lines, see e.g.
>> https://lists.openwrt.org/pipermail/openwrt-devel/2015-June/033398.html
>
> Why not just sending IP multicast (not 802.11 management frames) with
> a higher rate (lowest best linkspeed to all known neighbors)?

I've always liked this idea as an enhancement to the existing 802.11 spec. It is flawed in that the "lowest best linkspeed" in minstrel decays to the lowest configured link rate for stations that have not been sampled recently. (Another thing I like about unicast IHU is that it would keep minstrel's statistics more current.)

I don't deeply understand how other rate controllers besides the old "samplerate" work - and only just last week finally put the minstrel paper up for review (http://blog.cerowrt.org/post/minstrel/). If there are other documents on wifi rate controllers out there, like those in certain wifi chips, I'd love to read them.

And: I have found several devices in the field that cannot take anything but the base multicast rate. My old nexus 7 was like that.

I note I'm not seeking a solution for ND/RA/ARP at the moment - in fact I'm trying to work on something other than routing protocols entirely! - I would like it if people would stop trying to treat wifi as an ethernet equivalent and treat it as the now dominant paradigm it is, where ethernet is the exception rather than the rule, and ethernet fallback is a "nice to have" rather than a necessity.

For years I got along just fine on wifi + ethernet using the then common ahcp/babeld single ip methods ( http://blog.cerowrt.org/post/failing_over_faster/ ).

I am unfond of turning formerly all-multicast protocols into unicast on wifi, as proposed for openwrt and, for that matter, in the ietf. This might make for some background reading:

https://tools.ietf.org/html/draft-yourtchenko-colitti-nd-reduce-multicast-00

Anyway, I put a couple pictures up at http://blog.cerowrt.org/post/poking_at_powersave/ - I have some data showing the ap/sta metric going to hell over a few minutes that is not in that post yet, and I still ended up with some difficulties. (I have not turned powersave off everywhere yet, repeatably, as I need to find the right hooks to tell AP/sta mode in networkmanager/systemd/debian/openwrt to turn it off. (?))

- did I say I wasn't working on meshy protocols already? :grump: and I have a conference today and more new gear arriving to play with.

> Henning Rogge
> ___
> Make-wifi-fast mailing list
> make-wifi-f...@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] [Make-wifi-fast] perverse powersave bug with sta/ap mode
On Thu, Apr 28, 2016 at 7:59 AM, Juliusz Chroboczek wrote:
>> Discovery is a special case, that is not quite multicast. [...] So you
>> don't need any facility to "reach all" in one message.
>
> Are we speaking of the IP Internet, or of some other network?

Heh. Wifi is "some other network", at this point. Perhaps it always was. It was originally targeted at IPX/SPX.

As wifi has evolved all sorts of packets below the conventional link layer that are invisible to IP (management frames in general), perhaps finding saner ways of exposing these packet types and their properties to the conventional IP stack - and the IP stack to the properties of the wifi frames - would be of help.

For example, I just "ate" the entire 802.11-2012 and 802.11ac specifications, notably section 9 (aggregation stuff mostly) and annex G - for those of you into a digestif, they are publicly available via:

http://standards.ieee.org/about/get/802/802.11.html

In my case I was mostly looking for properties of ampdus that could be better leveraged for congestion control. It turns out that you *can* mark certain MPDUs as QosNOACK, which means that they will not be block acknowledged. Nobody does this... and, while you could form some IP packets with this property, there's no way to do it on the ath10k (except in raw mode), no hook for it on the ath9k, and no way for the IP layer to say to the 802.11 layer, "it's totally ok, you can lose this".

Elsewhere in the stack I am seeing retries of 10 (ath9k) and 15 (ath10k), and in the initial fq_codel implementation on ath10k, *ZERO* loss coming from the wifi layer on a string of traces. (I was leveraging codel's ecn marking abilities to "see" this.) The mac80211 portion also has sw retries as a global config, not as something that is per-station.

I am certain, out there, there is some wifi EE dancing at how perfect they've made the wireless layer appear and how "transparent" to IP, but I look at the aircap packet traces I just got with something akin to horror: 10s of ms of retries on stuff, eating other people's air, that I'd just as soon throw away, which also shows up on the xplot.org tcptraces on a wire downstream as spikes in rtt.

(There is also the needed cts random backoff in there, which makes it hard to distinguish between retries at various rates and needed backoff. I am sick of manually tearing apart aircaps.)

Now, dpreed's position on how we do wireless wrong is a great starting point... I wish he'd publish his 11 layer stack document somewhere...

> A number of fundamental Internet protocols, such as ARP and ND, use
> multicast for discovery (I see broadcast as a special case of multicast).
> So if you want to implement the TCP/IP suite, your link layer needs to
> support multicast. Some people have tried to work around that (see
> RFC 2022, for example), with IMHO little success.

Sure wish more wifi folk drank with more ietf folk, more often. Starting 2 decades back.

> What you seem to be arguing is that it would be possible to design
> a protocol suite that uses anycast for discovery. While an interesting
> research project, your suite would no longer be TCP/IP, good luck getting
> it deployed.
>
> (So what's the solution? As Toke suggested, push the multicast
> implementation to the link layer -- have the link layer convert multicast
> to multiple unicasts in a way that's invisible to the network layer.
> After all, that's what the link layer is for -- hiding the idiosyncrasies
> of a given physical layer from the network layer.)

1) Well, I have suggested that IHU messages actually be unicast rather than bundled with the hello. That would help somewhat in this case. (and also fix cases where multicast works and unicast doesn't).

multicast hello would become more of a discovery protocol and you could actually signal you can "take" a unicast hello (via a new tlv) and establish an ongoing multicast-free association that way.

Given the currently "perfect" characteristics of the underlying unicast wireless link layer, that would tend to eliminate packet loss as a viable metric of quality. :(

2) A protocol that needs "always listening" capability could signal the underlying stack to "make sure" these packets hit the air, and one that also wants "please be lossy" capabl

I leave the actual implementation of that request to the fantasies of the authors - a new dscp codepoint or three?

/me ducks

3) There are other stats from minstrel, station association, crypto state, etc, that could be leveraged.

4) And ya know - it might merely be a (sadly common) bug. Everybody's supposed to wake up for the multicast beacons and get a notification there's more data to come.

> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
[Babel-users] bug in ^C in -G
This is repeatable on all platforms.

root@apu2:/etc# telnet ::1 33123
Trying ::1...
Connected to ::1.
Escape character is '^]'.
BABEL 1.0
version babeld-1.7.1-59-gb648a17-dirty
host apu2
my-id 02:0d:b9:ff:fe:41:6c:2c
ok
^C
dump
^C^C^C
dump
dump, darn it
^C
dump
dump
dump
dump please
dump
dump, with sugar on top?
dump
I really love you with echo dump | nc ::1 33123 but
^C^C
dump
quit
Connection closed by foreign host.

(why does quit work and dump not after a ctrl-C?)

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
[Babel-users] perverse powersave bug with sta/ap mode
Pain shared is pain reduced, joy shared is joy increased...

For weeks now I've been puzzling over why a variety of links flapped the way they did, routes coming and going, failing over to weird paths, and I think I have finally isolated one part of the problem.

In an age where adhoc does not work particularly well, and AP/sta mode does (as in 6mbit vs 500 in one case), I've had a tendency to nail up links in ap/sta mode.

Well, at least one (probably several) of the devices I have in the new lab has *very aggressive* power save, to where babel ipv6 multicast traffic either doesn't sync up to the AP's request for multicast (or the sta's), or it is merely completely suppressed by the stack (or lost due to a bug!)...

Anyway... So long as there is unicast traffic on the local part of the link, you don't see a problem. And there's almost always a bit of traffic on the link.

So, perversely... when I'm looking at it... ssh'd through to the other side, running something like "watch tc", or pinging from one side of the link to the other... it works.

When I go away for a bit... it fails. Eventually.

If I run a test right after getting everything all set up and verifying the network looks correct... it works.

If I walk away and run a test that has a few minutes :grump: between runs to let things "settle down", things actually deteriorate.

Babel misses multicast traffic and gradually increases the metric on that interface due to the loss - causing a given route, in my case, to eventually fall over to an adhoc wifi radio elsewhere on the network, which reduces the probability of unicast traffic through that link still more, until ultimately the local link, otherwise nailed up, drops off the network completely.

To "fix" this:

iw dev wlp4s0 set power_save off

worked beautifully on the ath10k driver I'm using. The babel metric stayed stable, the route stayed stable, life was good, throughput increased, latency dropped...

That said, I know how hard wifi device driver writers are hammering at trying to reduce multicast effects, and save power... and I haven't exactly found the root cause of this problem in this driver (or in the ath9k on the AP side), but I think I've seen it elsewhere also, while chasing this -l failover issue.

Multicast beacons are supposed to say "hey, chips, wake up, you need to hear this".

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] failing over faster?
On Mon, Apr 25, 2016 at 4:33 PM, Juliusz Chroboczek wrote:
>> A good question would be, what would the ideal time between tests be
>> for the network to stabilize? 3 minutes? At least in one series I'd
>> started tests back to back, and didn't kick in the drop link stuff at
>> the right times.
>
> SOURCE_GC_TIME is 200
> hold time is 3.5 * update_interval
>
> so in order to make sure that all stale data has been flushed, you should
> keep a silent time of
>
>   MAX(SOURCE_GC_TIME, 3.5 * update_interval)
>
> Note that's update_interval, not hello_interval.

OK. Basically you want two forms of experiment from me -

Overall topology: 3 routers, 2 ethernet, 1 wifi. Fail over in different, regular, documented ways, and see what happens.

1) with and without -l
2) with and without the patch to check-interfaces

(can't I just put in a debug statement to see if the pi/etc are sending the message?)

I note there are two places where the polling interval is set in the code.

I note also that a target in these earlier tests was the linksys 1200ac, which has some major issues at gigE. In this quick failover test I'd gone from the c2 (100mbit) to the apu2 - gigE, and... wow... explain THIS!

http://blog.cerowrt.org/flent/failing_over_faster/linksys_1200_majorly_headblocking.svg

I am going to switch to another box as a target, finish solidifying the overall testbed, and then go heads down into the ath10k driver.

I hope by documenting some of the headaches I've had in getting these boxes up and working at all, others will benefit.

http://blog.cerowrt.org/post/some_hackerboards_for_wifi/

> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
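As a quick worked example of the silent-time rule Juliusz gives in this message, here is a small sketch. The value 200 for SOURCE_GC_TIME and the 3.5 factor come from the quoted text; treating both as seconds, and the function name `quiet_time`, are assumptions made for illustration, and the update intervals in the usage lines are arbitrary examples rather than babeld defaults.

```python
SOURCE_GC_TIME = 200   # from the thread; assumed here to be in seconds
HOLD_TIME_FACTOR = 3.5  # hold time is 3.5 * update_interval, per the thread


def quiet_time(update_interval: float,
               source_gc_time: float = SOURCE_GC_TIME) -> float:
    """Silent time to leave between test runs so stale state is flushed.

    Implements MAX(SOURCE_GC_TIME, 3.5 * update_interval). Note that the
    input is the update interval, not the hello interval.
    """
    if update_interval <= 0:
        raise ValueError("update_interval must be positive")
    return max(source_gc_time, HOLD_TIME_FACTOR * update_interval)


# Arbitrary example intervals: with a short update interval the
# source GC time dominates; with a long one the hold time dominates.
for interval in (16, 120):
    print(interval, quiet_time(interval))
```

So under these assumptions a 3 minute gap between runs would be too short in either case, since the floor is already 200.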
Re: [Babel-users] failing over faster?
and in other news the odroid c2's current kernel, and the rpi3 and rpi2, now all do IPV6_SUBTREES correctly.
Re: [Babel-users] failing over faster?
Tee-hee! Don't overanalyze yet! That was not a strictly repeatable test, as yet. (if you want access to the testbed send me a ssh key)

A good question would be, what would the ideal time between tests be for the network to stabilize? 3 minutes? At least in one series I'd started tests back to back, and didn't kick in the drop link stuff at the right times.

odroid c2 (because it's slower than the apu)'s babeld.conf, running 1.8 + patches:

# For more information about this configuration file, refer to
# babeld(8)
default enable-timestamps true
ipv6-subtrees true
redistribute local deny
interface eth0
interface usbnet0
out if eth0 metric 33 # gigE but doesn't come close
out if usbnet0 metric 34 # (only gets 200mbit at best in one direction)
# if you would suggest a better metric, let me know.

apu2: babeld.conf running 1.8 + patches. Rate limited to 400Mbit on all ports, but does that *really well*. Can definitely do gigE... but that speed messes up the linksys 1200, which I am replacing with something faster.

d@apu2:~$ cat /etc/babeld.conf
# For more information about this configuration file, refer to
# babeld(8)
redistribute local deny

linksys 1200 (babeld 1.7.1, openwrt trunk), which is on the 64 network only, is configured to have the ad-hoc interface. There is another box or three somewhere (pi3? pi? cerowrt?) on the adhoc that gets it back to the other network. I will simplify!

This amd x86 box, btw, is finally looking like the ideal platform for the fq_codel on wifi work, as well as fiddling with router stuff with its onboard 3 port intel ethernet:

http://pcengines.ch/apu2c4.htm

I have two of 'em, and they are so far working out pretty great.

On Mon, Apr 25, 2016 at 1:03 PM, Juliusz Chroboczek wrote:
>> http://blog.cerowrt.org/post/failing_over_faster/
>
> Why does the first stream fail at time 120? Broken firewall?
>
> There's something wrong in the second stream -- you're falling back in
> 30s, which is a tad high. Can I please see your babeld.conf?
>
> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] failing over faster?
Thank ghu we aren't homenet! Wires are dead! :)

I will incorporate your comments later today. Until then, there are pictures and data now up at:

http://blog.cerowrt.org/post/failing_over_faster/

I am quite puzzled as to how long it takes to fail over even in the good cases. I guess I gotta take some packet captures.

On Mon, Apr 25, 2016 at 11:47 AM, Juliusz Chroboczek wrote:
>> 8+ years ago, with ahcp and babel, and a network configured to use
>> that with a single static ip address on both the ethernet and wifi, I
>> could do that. My own networks were setup that way, anyway... I did it
>> all the time. It was wonderful. I never had to think about it.
>
> Dave, the plan is to do exactly that with shncpd and babeld -- think of
> shncpd as ahcpdv2. Please try running babeld and shncpd (-M) on the host,
> and if it doesn't work as well as ahcpd, we'll fix it.
>
>> It was massively disconcerting to attempt to move back into the
>> "regular" world where wifi and ethernet were treated as distinct,
>> where taking an interface offline lost its address,
>
> Right. One difference between ahcpd and shncpd -M is that the former uses
> a single address, while the latter uses one address per interface. The
> workaround is to keep the interface up, even if it is unconnected.
> Since -M is out of spec anyway, I can be convinced to change that.
>
>> where taking a new /64 was considered mandatory,
>
> That's what -M is for.
>
>> and no host changes allowed,
>
> We're not Homenet, Dave, we're independent researchers. Just because
> Homenet rejects something doesn't mean we shouldn't do the right thing.
> My personal opinion is that having reasonable support for unchanged hosts
> is a goodness, but we shouldn't shy from designing better hosts.
>
>> I've harped on a need for atomic updates, but I still think that
>> a userspace routing daemon simply can't react fast enough to a change in
>> an ethernet routing table to prevent no-route messages being sent to one
>> or more flows on a busy link when it goes down.
>
> Higher-layer protocols should be able to survive ICMP unreachable by
> retrying after a few jiffies. TCP certainly does, and if your protocol
> doesn't, it's a bug in the protocol.
>
>> A newer problem that I haven't thunk much about before was that babel
>> aims for a stable route, so if I have 3 routes - one stable, but
>> lousy, and both the better routes flap twice in under 60 seconds or
>> so, we end up choosing the stablest route, sometimes for a very long
>> time.
>
> Yes, over the years babeld has been tuned to prefer stable routes. Have
> you tried playing with -M? I'm quite open to changing its default value.
>
> -- Juliusz

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] failing over faster?
This ended up being a deeply philosophical digression into routing behaviors that I think I'll have to blog about, with pictures, to fully describe.

What I want is a world of ubiquitous always-on connectivity[1] - where you can be at your desk with 20 connections nailed up, listening to an audio stream, doing a big upload or download - then pull your box out of the ethernet dock, go to wifi, move to another room, plug in again, and have everything survive and take advantage of the better link after a few seconds.

8+ years ago, with ahcp and babel, and a network configured to use that with a single static ip address on both the ethernet and wifi, I could do that. My own networks were set up that way, anyway... I did it all the time. It was wonderful. I never had to think about it.

It was massively disconcerting to attempt to move back into the "regular" world where wifi and ethernet were treated as distinct, where taking an interface offline lost its address, where taking a new /64 was considered mandatory, and no host changes allowed, as part of homenet. I'd switch to how things were done "in the real world" - get up from my desk - despite having both the wifi and ethernet online at the same time - and all my connections would drop. Agh.

Sure, new protocols like mosh-multipath, quic, etc, recover from a move, but that wasn't the case I was testing. I was testing multiple routes through the middle of the network, where I'd hope for better behavior while there is load.
So what I get currently from trying to do failover in the middle of the network right now, using the -l option and the supplied patch, is that usually the failover is not quite quick enough, and 1 or more connections fails like this (using the flent rrul test here):

Program output:
  netperf: send_omni: recv_data failed: No route to host
  netperf: send_omni: recv_data failed: No route to host
  Interim result: 33.47 10^6bits/s over 0.200 seconds ending at 1461547666.713
  Interim result: 22.99 10^6bits/s over 0.201 seconds ending at 1461547666.914
  Interim result:

I've harped on a need for atomic updates, but I still think that a userspace routing daemon simply can't react fast enough to a change in an ethernet routing table to prevent no-route messages being sent to one or more flows on a busy link when it goes down.

So I got a mildly better result by installing a static backup link, like this:

172.26.64.0/24 via 172.26.64.1 dev usbnet0 proto babel onlink
172.26.64.0/24 dev usbnet0 proto kernel scope link src 172.26.64.231 metric 100
172.26.64.0/24 via 172.26.16.5 dev eth0 metric 200

for which the traffic survives the ifconfig usbnet0 down event better. I imagine that putting in the "3 best routes" into the kernel RIB is not something most meshy daemons do?

A newer problem that I haven't thunk much about before was that babel aims for a stable route, so if I have 3 routes - one stable, but lousy, and both the better routes flap twice in under 60 seconds or so, we end up choosing the stablest route, sometimes for a very long time.

I still see many seconds before stuff recovers in some instances.

[1] http://frankston.com/public/?n=IAC.UAC
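The static backup trick above relies on the kernel holding several routes for the same prefix and using the lowest-metric one whose device is still there. Below is a rough Python sketch of that selection, as a toy model only. The `Route` class and `best_route` function are made up for illustration, real Linux route lookup also does longest-prefix matching and more, and it assumes the babel route's unprinted metric is 0 (`ip route` omits a zero metric).

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Route:
    """One same-prefix entry, modelled on a line of `ip route` output."""
    prefix: str
    dev: str
    metric: int = 0            # assumption: no metric printed means 0
    via: Optional[str] = None  # None for a directly connected route


# The three routes quoted in the message above.
ROUTES = [
    Route("172.26.64.0/24", "usbnet0", 0, "172.26.64.1"),  # proto babel
    Route("172.26.64.0/24", "usbnet0", 100),                # proto kernel
    Route("172.26.64.0/24", "eth0", 200, "172.26.16.5"),    # static backup
]


def best_route(routes: Iterable[Route], up_devices: Set[str]) -> Optional[Route]:
    """Return the lowest-metric route whose device is up, or None."""
    usable = [r for r in routes if r.dev in up_devices]
    return min(usable, key=lambda r: r.metric, default=None)


if __name__ == "__main__":
    # Both links up: the babel route via usbnet0 is preferred.
    print(best_route(ROUTES, {"usbnet0", "eth0"}))
    # usbnet0 taken down: the static eth0 backup is already in place.
    print(best_route(ROUTES, {"eth0"}))
```

The point of the model is that when usbnet0 disappears the backup is already installed, so nothing has to wait for the routing daemon to notice and react.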
Re: [Babel-users] failing over faster?
groovy. I will try it as soon as I can.

This showed some potential for doing it faster than that:

http://stackoverflow.com/questions/7225888/how-can-i-monitor-the-nic-statusup-down-in-a-c-program-without-polling-the-ker

On Sun, Apr 24, 2016 at 1:21 PM, Juliusz Chroboczek wrote:
>> # and we fail over in 32 seconds
>
> What happens if you apply the following patch?
>
> diff --git a/babeld.c b/babeld.c
> index 3127e72..0183b32 100644
> --- a/babeld.c
> +++ b/babeld.c
> @@ -744,7 +744,7 @@ main(int argc, char **argv)
>
>          if(timeval_compare(&check_interfaces_timeout, &now) < 0) {
>              check_interfaces();
> -            schedule_interfaces_check(3, 1);
> +            schedule_interfaces_check(1000, 1);
>          }
>
>          if(now.tv_sec >= expiry_time) {

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
[Babel-users] failing over faster?
I am fiddling with a raspberry pi3 with a usb ethernet (making it a 100mbit router), the onboard wifi, and 2 usb wifi sticks...

with all the interfaces up I do a ping over ethernet

64 bytes from 172.26.64.231: icmp_seq=56 ttl=63 time=1.45 ms
64 bytes from 172.26.64.231: icmp_seq=57 ttl=63 time=1.18 ms
64 bytes from 172.26.64.231: icmp_seq=58 ttl=63 time=2.89 ms
64 bytes from 172.26.64.231: icmp_seq=59 ttl=63 time=1.20 ms
64 bytes from 172.26.64.231: icmp_seq=60 ttl=63 time=1.30 ms
64 bytes from 172.26.64.231: icmp_seq=61 ttl=63 time=1.42 ms

I do an ifconfig eth0 down

# at this point, after a bit we get:

ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host

# and we fail over in 32 seconds

64 bytes from 172.26.64.231: icmp_seq=94 ttl=62 time=41.5 ms
64 bytes from 172.26.64.231: icmp_seq=95 ttl=62 time=2.10 ms
64 bytes from 172.26.64.231: icmp_seq=96 ttl=62 time=10.5 ms
64 bytes from 172.26.64.231: icmp_seq=97 ttl=62 time=7.27 ms
64 bytes from 172.26.64.231: icmp_seq=98 ttl=62 time=9.58 ms
64 bytes from 172.26.64.231: icmp_seq=99 ttl=62 time=15.3 ms
64 bytes from 172.26.64.231: icmp_seq=100 ttl=62 time=66.1 ms
64 bytes from 172.26.64.231: icmp_seq=101 ttl=63 time=7.73 ms

but I was under the impression we'd fail over faster with -l on and we'd not get a "no route to host" (there are two hops on the mesh in the way...)
babeld.conf

default enable-timestamps true
ipv6-subtrees true

# eth1 is attached to a bridged wifi/wired network
interface eth0 wired true link-quality false
interface eth1 wired true link-quality true

# All these adhoc interfaces suck compared to others on the network
# and right now, all on 6
diversity 3
interface wlan1 channel 6
interface wlan0 channel 6
interface wlan2 channel 6

out if wlan1 metric 512
out if wlan0 metric 512
out if wlan2 metric 512

#I wanted to get hncp mesh addresses only (so as to be able to do ss
#routing
#redistribute local ::/128 eq 128 allow
#redistribute local ::/64 gt 128 deny
redistribute local deny
# but ended up going with this for now
redistribute proto 43 allow

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
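The "fail over in 32 seconds" figure in the ping transcript above can be read off the icmp_seq numbers: the last reply before the outage is seq 61 and the first one after is seq 94. Here is a small Python sketch for pulling the largest such gap out of a ping log. The function name and the trimmed `SAMPLE` text are mine, and turning a sequence gap into seconds assumes ping's usual one probe per second.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")

# A few reply lines from the transcript above, trimmed to both sides of the outage.
SAMPLE = """\
64 bytes from 172.26.64.231: icmp_seq=60 ttl=63 time=1.30 ms
64 bytes from 172.26.64.231: icmp_seq=61 ttl=63 time=1.42 ms
ping: sendmsg: No route to host
64 bytes from 172.26.64.231: icmp_seq=94 ttl=62 time=41.5 ms
64 bytes from 172.26.64.231: icmp_seq=95 ttl=62 time=2.10 ms
"""


def largest_seq_gap(ping_output):
    """Return (missing, last_seq_before, first_seq_after) for the biggest hole.

    Only lines carrying an icmp_seq (that is, received replies) are counted.
    Returns (0, None, None) when no reply is missing.
    """
    seqs = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    best = (0, None, None)
    for prev, cur in zip(seqs, seqs[1:]):
        missing = cur - prev - 1
        if missing > best[0]:
            best = (missing, prev, cur)
    return best


if __name__ == "__main__":
    missing, before, after = largest_seq_gap(SAMPLE)
    print(f"{missing} replies missing between seq {before} and seq {after}")
```

Counting sequence numbers rather than "No route to host" lines also catches probes that were silently dropped before the kernel started reporting the missing route.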
Re: [Babel-users] New paper on Babel, BMX6 and OLSR: Evaluation of mesh routing protocols for wireless community networks
I finally got around to reading these papers. I liked the compression techniques used by bmx6... I can't help but want to also increase babel's update rate from 2sec to .5sec as in bmx6 for comparison.

On Tue, Apr 12, 2016 at 12:46 PM, Baptiste Jonglez wrote:
> Hi all,
>
> There is a new paper comparing the relative performance of babeld, olsrd,
> and bmx6:
>
>   Neumann, Axel, Ester López, and Leandro Navarro. "Evaluation of mesh
>   routing protocols for wireless community networks." Computer Networks
>   93 (2015): 308-323.
>
> Compared to the 2013 paper, it's much more thorough, and used a testbed
> while the 2013 paper was only based on emulation (this should make
> Matthieu happy!).
>
> The conclusion is the following:
>
>   Babel is the most lightweight protocol with the least memory, CPU, and
>   control-traffic requirements as long as it is used in networks with
>   stable links and low node densities.
>
>   However, if the protocol is used in large or dense wireless
>   deployments with frequent link changes due to dynamic interference or
>   nodes leaving or joining the network, then its reactive mechanisms to
>   encounter topology changes by sending additional routing updates and
>   route request messages turn into massive control-traffic and
>   processing overhead. In such scenarios, OLSR and BMX6, with their
>   strictly constant rate for sending topology and routing update
>   messages, outperform Babel in terms of overhead, stability, and even
>   self-healing capabilities.
> > > The paper is available here: > > http://www.sciencedirect.com/science/article/pii/S1389128615002522 > (paywalled) > http://www2.ic.uff.br/~celio/classes/cmovel/slides/community-mesh-2015.pdf > http://people.ac.upc.edu/leandro/pubs/eomrpfwcn.pdf (open-access preprint) > http://dsg.ac.upc.edu/eval-mesh-routing-wcn (more information, data, > scripts) > > > Baptiste > > ___ > Babel-users mailing list > Babel-users@lists.alioth.debian.org > http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] babeld crashes
I eliminated the -l option from all my boxes and thus far I have not seen it crash. But I am not trying too hard to make things on my network come and go right now, I'm busy on other things:

http://blog.cerowrt.org

I will put a couple boxes under valgrind the next time I re-org the network, sometime later this week.

On Sat, Apr 16, 2016 at 9:04 PM, Juliusz Chroboczek wrote:
> Dave, could you please try to reproduce this under valgrind?

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Re: [Babel-users] babeld crashes
And I got it to happen on the pi3. (gdb) bt #0 0x76e09f70 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x76e0b324 in __GI_abort () at abort.c:89 #2 0x76e45954 in __libc_message (do_abort=, fmt=0x76efb830 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175 #3 0x76e4bb80 in malloc_printerr (action=1, str=0x76efba6c "double free or corruption (fasttop)", ptr=) at malloc.c:4996 #4 0x76e4cb24 in _int_free (av=, p=, have_lock=100916) at malloc.c:3840 #5 0x0001a35c in update_route (id=, prefix=, plen=, src_prefix=, src_plen=0 '\000', seqno=17136, refmetric=96, interval=1600, neigh=0xc5d9d8, nexthop=0xc5d9e0 "\376\200", channels=0x7ede598c "", channels_len=0) at route.c:920 #6 0x0001f10c in parse_packet (from=0x0, from@entry=0x7ede5a30 "\n", ifp=0x0, packet=0x1 , packetlen=) at message.c:644 #7 0x000126d8 in main (argc=, argv=) at babeld.c:675 On Fri, Apr 15, 2016 at 6:39 PM, Dave Taht <dave.t...@gmail.com> wrote: > I have been experiencing babeld crashes since starting to use git head > a few weeks ago. > > Today after putting in git head everywhere I have been getting quite a > few crashes (no babel process running, bunch of babel routes left > behind) - I was not paying much attention to it ( these are a bunch of > new machines that I was doing other things to and I had assumed it was > systemd messing up on a restart (I am new to systemd), so I would see > a creat(/var/run/babeld.pid): File exists... > > but nope, I'm segvioing at some point. > > I did just manage to see a crash go by and get a core dump. I will > reboot and retry, then go back a few versions. It took about 5 minutes > of operation on an active network before this happened, this time > > 0 malloc_consolidate (av=av@entry=0x7f47ad14fc00 ) > at malloc.c:4136 > 4136malloc.c: No such file or directory. 
> (gdb) up > #1 0x7f47ace0c9d4 in _int_malloc ( > av=av@entry=0x7f47ad14fc00 , bytes=bytes@entry=3916) > at malloc.c:3417 > 3417in malloc.c > (gdb) up > #2 0x7f47ace0f4ae in __GI___libc_malloc (bytes=bytes@entry=3916) > at malloc.c:2895 > 2895in malloc.c > (gdb) up > #3 0x0040c4f7 in buffer_update (ifp=ifp@entry=0x1d365e0, > prefix=prefix@entry=0x1d37dc0 "\375\020", plen=plen@entry=128 '\200', > src_prefix=src_prefix@entry=0x1d37dd1 "", src_plen=src_plen@entry=0 > '\000') > at message.c:1443 > 1443ifp->buffered_updates = malloc(n * sizeof(struct > buffered_update)); > (gdb) up > #4 0x0040c85a in send_update (ifp=ifp@entry=0x1d365e0, > urgent=urgent@entry=0, > prefix=0x1d37dc0 "\375\020", plen=, > src_prefix=0x1d37dd1 "", src_plen=0 '\000') > at message.c:1497 > 1497buffer_update(ifp, prefix, plen, src_prefix, src_plen); > > (gdb) up > #5 0x0040c6ed in send_self_update (ifp=0x1d365e0) at message.c:1595 > 1595send_update(ifp, 0, xroute->prefix, xroute->plen, > (gdb) up > #6 0x0040c86f in send_update (ifp=0x1d365e0, urgent=0, > prefix=prefix@entry=0x0, > plen=plen@entry=0 '\000', src_prefix=0x414460 "", > src_plen=src_plen@entry=0 '\000') > at message.c:1500 > 1500send_self_update(ifp); > > #7 0x0040c93f in send_update (ifp=ifp@entry=0x1d365e0, > urgent=urgent@entry=0, > prefix=prefix@entry=0x0, plen=plen@entry=0 '\000', > src_prefix=src_prefix@entry=0x0, > src_plen=src_plen@entry=0 '\000') at message.c:1524 > 1524send_update(ifp, urgent, NULL, 0, zeroes, 0); > > #8 0x00402f80 in main (argc=, argv= out>) at babeld.c:767 > 767send_update(ifp, 0, NULL, 0, NULL, 0); > > *My babeld.conf is this: > > default enable-timestamps true > redistribute local deny > > *babeld command line: > > babeld -l -G 33123 -S /var/lib/babeld/state eno1 wlp2s0 wlx9cefd5ff0b2c > > the network has got sort of complex in recent days. > > -- > Dave Täht > Let's go make home routers and wifi faster! With better software! 
> http://blog.cerowrt.org -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] IPV6_SUBTREES vs policy routing
I can confirm this patch does the right thing on x86_64, and that the current kernels on the pi3 and pi2 now also support IPV6_SUBTREES correctly. The wifi on the pi3 *sucks rocks* in adhoc mode from a bufferbloat perspective, even worse than the ath9k. It does support a usb ethernet and a few other usb wifi chips, I'm going to try those - but as a little quick and dirty test router the only problem I've had with loading it up with 3 usb sticks and ethernet was in the power supply. I currently have it successfully using an ethernet dongle, one usb flash stick and one external wifi adaptor. I am told IPV6_SUBTREEs is now in the odroid c2 kernel tree, but they have not put out a binary yet. I am curious as to what the behavior should be for source specific ipv4? On Fri, Apr 15, 2016 at 9:45 AM, Matthieu Boutierwrote: >> Do you think this is serious enough to justify releasing 1.7.2? > > No, it only adds additional v6 rules: associated tables remains empty. > >> Matthieu, could you please tell me when the bug was introduced, so I can >> put it in the changelog? > > ... so, it's quite unclear. The symptoms appear between > > c18e3b0a389fdbf5cc243b094ebb42e59139f035 > > and > > 58dbd2f425a7dcdb58e5b6923bd7df309e233d79 > > most probably at 72a6264355c0d0e98c48ca45c609c6d0288ec05c > > but the bug itself is probably introduced with the function (or its usage). > > Matthieu > -- Dave Täht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] IETF news
On Sat, Apr 9, 2016 at 11:07 AM, Juliusz Chroboczekwrote: >> attending remotely was not quite satisfactory. > > We'll need to think about how to make the babel WG as friendly to remote > participation as possible. Obviously, having a competent Jabber scribe is > important, one that ensures that Jabber questions get asked at the mike in > timely manner. Anything else A video/audio recording option would be simplest in meetecho. Ghu knows how many legal hurdles that would have. I like that many other conferences do make recordings available. >> How was Bit's and Bytes? > > We left after a glass of wine or three, we were dining with the VIPs. > >> Were the omnia folk there? https://omnia.turris.cz/en/ had made a big splash at the previous ietf. They are also the people behind bird. > > Who? > > -- Juliusz ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] IETF news
On Sat, Apr 9, 2016 at 10:17 AM, Juliusz Chroboczekwrote: >> IETF is boring without Markus. And without Henning. And without Steven. > > And without Dave. (But I'm in touch with Dave more often than the other > three.) Tee-hee. I have to admit that I longed to be there and attending remotely was not quite satisfactory. How was Bit's and Bytes? Were the omnia folk there? I would like to help put together some form of cross-platform homenet demo for that part of the berlin IETF (mid-july). At the moment I have regular openwrt, raspbian, and x86 linux boxes all interoperating (though I have yet to adopt hnetd for anything), it would be nice to have yocto, dd-wrt, and perhaps a major vendor (eero? omnia? ubnt? intel? google? Cisco? anyone?) also. It might also be good to show a more regular wifi/AP mode interaction with mobility and multiple channels going. > > ___ > Babel-users mailing list > Babel-users@lists.alioth.debian.org > http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
[Babel-users] breaking the ath9k in adhoc mode with the new fq_codel implementation
So, thank you for exposing a bug in my code today. The new ath9k fq_codel code at the 802.11 mac layer bypasses the qdisc...

root@dancer:~/Pictures# tc -s qdisc show dev wlp2s0
qdisc noqueue 0: root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

and somewhere IFF_RUNNING is no longer being set when iwconfig is in adhoc mode... so I ended up with the infinite txcost

root@dancer:~/Pictures# ifconfig wlp2s0
wlp2s0    Link encap:Ethernet  HWaddr 00:21:63:2f:f2:f4
          inet addr:172.26.17.246  Bcast:255.255.255.255  Mask:255.255.255.255
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:41177 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27218 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13436569 (13.4 MB)  TX bytes:11838910 (11.8 MB)

root@dancer:~/Pictures# iwconfig wlp2s0
wlp2s0    IEEE 802.11abgn  ESSID:"babel"
          Mode:Ad-Hoc  Frequency:2.437 GHz  Cell: 32:DD:4C:C7:F8:87
          Tx-Power=16 dBm
          Retry short limit:7  RTS thr:off  Fragment thr:off
          Encryption key:off
          Power Management:off

and after I unwedge the darn thing out of adhoc and back into sta... life is much better.

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

http://www.taht.net/~d/unwedged.png
http://www.taht.net/~d/unwedged_eth0down.png

Checking around, I am seeing no "running" field on the rpi2, which might explain why its wifi doesn't work with babel - that's running the base OS and a "panda" usb stick in adhoc. I AM seeing the running field on the rpi3, but that has kernel issues...

I can try to sum up everything that broke on every machine, but my head hurts, and I'm going back to what I was doing in the first place.

Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi
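The only difference between the wedged and unwedged ifconfig output above is the RUNNING flag. A short Python sketch of how those flag words map onto bits follows. The bit values are the ones in Linux's <linux/if.h>, while the helper names are invented here and say nothing about how babeld itself does the check. `read_flags` assumes a Linux box with sysfs mounted.

```python
# Interface flag bits as defined in Linux's <linux/if.h>.
IFF_UP = 0x1
IFF_BROADCAST = 0x2
IFF_RUNNING = 0x40
IFF_MULTICAST = 0x1000

# Only the four flags that show up in the ifconfig output above.
FLAG_NAMES = [
    (IFF_UP, "UP"),
    (IFF_BROADCAST, "BROADCAST"),
    (IFF_RUNNING, "RUNNING"),
    (IFF_MULTICAST, "MULTICAST"),
]


def decode_flags(flags):
    """Return the names of the known flags set in `flags`, in ifconfig order."""
    return [name for bit, name in FLAG_NAMES if flags & bit]


def is_running(flags):
    """True when the driver reports the link as operational (IFF_RUNNING)."""
    return bool(flags & IFF_RUNNING)


def read_flags(ifname):
    """Read the flag word for `ifname` from sysfs (Linux only, e.g. '0x1003')."""
    with open(f"/sys/class/net/{ifname}/flags") as f:
        return int(f.read().strip(), 16)


if __name__ == "__main__":
    wedged = IFF_UP | IFF_BROADCAST | IFF_MULTICAST  # the adhoc case above
    print(hex(wedged), decode_flags(wedged), is_running(wedged))
    unwedged = wedged | IFF_RUNNING                  # after going back to sta
    print(hex(unwedged), decode_flags(unwedged), is_running(unwedged))
```

On a live box, something like `is_running(read_flags("wlp2s0"))` would be a quick way to spot an interface in the wedged state without parsing ifconfig.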
Re: [Babel-users] Control socket language [was: Babel on itty bitty boxes]
On Mon, Mar 28, 2016 at 9:27 AM, Gabriel Kerneis wrote:
>
>
> On Mon, Mar 28, 2016, at 17:53, Juliusz Chroboczek wrote:
>> > PS: I am loving the new "dump" functionality. Tons easier to read than
>> > a logfile. echo 'dump' | nc ::1 33123.
>>
>> Thanks for the kind words.
>>
>> Folks -- I've had little feedback on the new control socket command
>> language. If you wish to complain, please try to do it before the third
>> week of April, which is when I plan to release 1.8.0.
>
> What about dumping the configuration options too?

+1

> That would solve the
> question of "was subtrees enabled during this test?" and the like.

heh. yep. I wouldn't mind seeing filters also.

is babelweb using "dump" yet? Seems saner to do a poll/response on a busy network...

(I'll go pull it. if you can't tell this is the first time I've mucked with babel in a long while)

>
> --
> Gabriel
Re: [Babel-users] Babel on itty bitty boxes
On Mon, Mar 28, 2016 at 8:35 AM, Matthieu Boutierwrote: >> Why don't you write to the netdev list to ask what's a reliable way to >> detect IPv6_SUBTREES? > > Yes, I'm asking myself if the Dave's "invalid argument" are for > source-specific routes. In which case it answers the question. Will test it. Yea, a failure to insert these would be a good test for falling back to non-subtrees. ... I uploading to: http://www.taht.net/~d/ babeld.dump-ipv6-subtrees-false babeld_2_wlans.log babled-ipv6-subtrees-false.log I see the invalid argument go by in the subtrees-false case. It was a real joy to be able to just compile stuff directly on these itty bitty boxes, which saved much time vs openwrt. too much time. I do crazy things so others don't have to. ... another possible bug is that I would have assumed the bridged ap box would have detected the bridge and supplied a different metric or cost. I think it is (256) according to the logs, so that is right... The topology there is: pi3 - AP br-lan sw10 (fe80::120d:7fff:fe64:c992) (cerowrt connecting as a sta rather than adhoc). Admittedly that connection measures at 80mbits and is probably a great deal less flaky than the pi is. IF you are bored and want access to these boxes let me know. ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] Babel on itty bitty boxes
On Mon, Mar 28, 2016 at 8:26 AM, Matthieu Boutierwrote: >> I put a babel debug 3 log up at: >> >> http://www.taht.net/~d/babeld_pi3.log > > Strange, there is no more "kernel_route(ADD): invalid argument" lines. Is it > really the same node, with the same options ? yes. Perhaps, however, I had earlier had ipv6-subtrees true. And although I seem to have connectivity over the wifi link now (I did not on that test), it does not choose it. > > So you should have all the right routes in the FIB, no ? Meh. The source of all my issues was that I volunteered to try out 1.8 on a whole bunch of machines at once that I'd never tried it on. :) Compiling 1.8 on all of them was easy... tracking down each individual bug was not. I kind of expected the local wlan connection also to the other side of the pi's link to go direct inside of 2 minutes. It's not and damned if I know why, besides the kernel errors... For example, right now, to connect the two networks is a babel running on a bridged wifi AP/sta (not adhoc)/ethernet. Which remains for all routes on the pi, despite it too having 2 direct links. I know, one step at a time, but it was just a quick test... I thought delusion-ally that everything on every platform would "just work" by now. > > Matthieu > > PS: sorry for your flash !! ;-) > PS: I am loving the new "dump" functionality. Tons easier to read than a logfile. echo 'dump' | nc ::1 33123. http://www.taht.net/~d/ has a dump and a much bigger log, with eth0, wlan1,wlan2 enabled. ___ Babel-users mailing list Babel-users@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] Babel on itty bitty boxes
On Fri, Mar 25, 2016 at 2:41 AM, Matthieu Boutierwrote: >> I checked and IPV6_SUBTREES is disabled in the raspbian kernel >> build. I filed a bug. > > Good to know. What if you specify explicitly to babeld not to use subtrees: > > ipv6-subtrees false Congrats! That fixed the 3.14 kernel based droid c2. ip -6 rule show showed the source specific stuff. It did not fix the 3.10 kernel based c1. So autodetection is broken on the c2 & c1 kernels, but works on the pi kernel, and the ipv6-subtrees option can be set manually on the c2 to work. I am not seeing the kernel errors on the c2 I am on the rpi, see below. I will try hooking up a usb wifi stick to the c2, I will try to find one that works. > >> It is straightforward to build a new kernel whenever get around to it. > > It's written "optional", yeah? So it should not be that much important. > Remove it! Your cynicism is shared and appreciated. routerB does not get the "default from" routes. >>> >>> Aren't they even sent on the link ? (tcpdump from rpi) >> >> rpi3, c2, c1, x86 box are on the same switch. > > Ah ok, so what's in the babel RIB on rpi? > (killall -USR1 babeld; cat /var/log/babeld.log) Attached. I note that while the pi3 had had ip -6 rules showing, this log dump is after a fresh restart - and even after waiting a few minutes I did not get those ip -6 rules inserted. There are a bunch of kernel errors from dmesg along the lines of the second bug I filed with the raspberry folk, which are probably related. 
I filed a bug on these: [258034.946162] : hw csum failure [258034.946190] CPU: 0 PID: 21128 Comm: babeld Tainted: GW 4.1.19-v7+ #853 [258034.946200] Hardware name: BCM2709 [258034.946238] [<800185e0>] (unwind_backtrace) from [<80013f48>] (show_stack+0x20/0x24) [258034.946263] [<80013f48>] (show_stack) from [<80572fac>] (dump_stack+0xd4/0x118) [258034.946289] [<80572fac>] (dump_stack) from [<80497444>] (netdev_rx_csum_fault+0x44/0x48) [258034.946312] [<80497444>] (netdev_rx_csum_fault) from [<8048c6d0>] (skb_copy_and_csum_datagram_msg+0xdc/0xe8) [258034.946394] [<8048c6d0>] (skb_copy_and_csum_datagram_msg) from [<7f01f048>] (udpv6_recvmsg+0x11c/0x7cc [ipv6]) [258034.946463] [<7f01f048>] (udpv6_recvmsg [ipv6]) from [<80509d18>] (inet_recvmsg+0xa4/0xb8) [258034.946489] [<80509d18>] (inet_recvmsg) from [<8047c3dc>] (sock_recvmsg+0x20/0x24) [258034.946510] [<8047c3dc>] (sock_recvmsg) from [<8047e274>] (___sys_recvmsg+0xa4/0x12c) [258034.946528] [<8047e274>] (___sys_recvmsg) from [<8047f0f8>] (__sys_recvmsg+0x4c/0x7c) [258034.946547] [<8047f0f8>] (__sys_recvmsg) from [<8047f140>] (SyS_recvmsg+0x18/0x1c) [258034.946566] [<8047f140>] (SyS_recvmsg) from [<8000fa20>] (ret_fast_syscall+0x0/0x54) [258035.497506] eth0: hw csum failure [258035.497552] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 4.1.19-v7+ #853 [258035.497563] Hardware name: BCM2709 [258035.497604] [<800185e0>] (unwind_backtrace) from [<80013f48>] (show_stack+0x20/0x24) [258035.497638] [<80013f48>] (show_stack) from [<80572fac>] (dump_stack+0xd4/0x118) [258035.497664] [<80572fac>] (dump_stack) from [<80497444>] (netdev_rx_csum_fault+0x44/0x48) [258035.497688] [<80497444>] (netdev_rx_csum_fault) from [<8048c310>] (__skb_checksum_complete+0xb4/0xb8) [258035.497709] [<8048c310>] (__skb_checksum_complete) from [<8053abe4>] (udp6_csum_init+0x1cc/0x218) [258035.497791] [<8053abe4>] (udp6_csum_init) from [<7f0216c0>] (__udp6_lib_rcv+0x274/0x4b8 [ipv6]) [258035.497896] [<7f0216c0>] (__udp6_lib_rcv [ipv6]) from [<7f01e238>] 
(udpv6_rcv+0x1c/0x20 [ipv6]) [258035.497982] [<7f01e238>] (udpv6_rcv [ipv6]) from [<7f0069d8>] (ip6_input_finish+0x188/0x5d8 [ipv6]) [258035.498053] [<7f0069d8>] (ip6_input_finish [ipv6]) from [<7f007514>] (ip6_input+0x30/0x84 [ipv6]) [258035.498128] [<7f007514>] (ip6_input [ipv6]) from [<7f007694>] (ip6_mc_input+0x12c/0x274 [ipv6]) [258035.498198] [<7f007694>] (ip6_mc_input [ipv6]) from [<7f0067d0>] (ip6_rcv_finish+0x44/0xc4 [ipv6]) [258035.498267] [<7f0067d0>] (ip6_rcv_finish [ipv6]) from [<7f0072b4>] (ipv6_rcv+0x48c/0x6bc [ipv6]) [258035.498316] [<7f0072b4>] (ipv6_rcv [ipv6]) from [<80495294>] (__netif_receive_skb_core+0x694/0xa40) [258035.498340] [<80495294>] (__netif_receive_skb_core) from [<80497504>] (__netif_receive_skb+0x20/0x7c) [258035.498367] [<80497504>] (__netif_receive_skb) from [<8049758c>] (netif_receive_skb_internal+0x2c/0xa4) [258035.498390] [<8049758c>] (netif_receive_skb_internal) from [<80497628>] (netif_receive_skb_sk+0x24/0x9c) [258035.498414] [<80497628>] (netif_receive_skb_sk) from [<7f35c2f8>] (ri_tasklet+0xec/0x28c [ifb]) [258035.498439] [<7f35c2f8>] (ri_tasklet [ifb]) from [<8002b948>] (tasklet_action+0x74/0x10c) [258035.498459] [<8002b948>] (tasklet_action) from [<8002ade4>] (__do_softirq+0x1a0/0x3e0) [258035.498478] [<8002ade4>] (__do_softirq) from [<8002b3c4>] (irq_exit+0xdc/0x140) [258035.498499] [<8002b3c4>] (irq_exit) from [<8006e8d4>] (__handle_domain_irq+0x98/0xec)
Re: [Babel-users] Babel on itty bitty boxes
So I set up ipv6-subtrees false for the pi3 also, did get rules, still got kernel errors. Turned off the internal wifi (so it's no longer trying to be a router), but it is still getting kernel errors from the traffic on the ethernet interface, so I suspect that ipv6 support on the pi has, until now, been somewhat undertested. There are at least 3 paths in the ipv6 portion of the stack throwing errors.
Re: [Babel-users] Babel on itty bitty boxes
On Thu, Mar 24, 2016 at 4:30 PM, Juliusz Chroboczek wrote:
>> I wish I lived in a world where more of the weird things "just worked"
>
> You'd be bored.

Ha. At least compiling stuff on these boxes is slow enough to get in a song or two on the piano.

Filed a bug with the odroid folk as well.

https://github.com/hardkernel/linux/issues/177

Having gone 0 for 3 on these boxes (and I also had picked up a few more that I haven't bothered to plug in), I guess it was good to escape the openwrt world for a while.

*Sweet* little box that c2 is, speedwise, but the ancient kernel meant every usb wifi stick I had failed in it (in addition to no video or sound). It CAN drive to gigE (unlike the pi3).

Ah, well, back to fixing wifi on the one chip I sort of understand...
Re: [Babel-users] Babel on itty bitty boxes
and I filed this bug too.

https://github.com/raspberrypi/linux/issues/1371

I wish I lived in a world where more of the weird things "just worked"
[Babel-users] Babel on itty bitty boxes
On Thu, Mar 24, 2016 at 1:16 AM, Matthieu Boutier wrote:
>> my gateway (routerA) is a source specific babel, without a version
>> number but from an openwrt build about 2 months back. RouterB is a
>> cerowrt box.
>
> On routerA:
>
> 1. How did you redistribute source-specific routes

std openwrt.

> 2. Did you see them exported in babeld ?

On the x86 box I see:

default from 2001:556:6045:bd:1ccf:b9bb:e8da:2c77 via fe80::16cc:20ff:fee5:64c2 dev eno1 proto babel metric 1024 pref medium
default from 2601:645:4103:56c0::/60 via fe80::16cc:20ff:fee5:64c2 dev eno1 proto babel metric 1024 pref medium

so, yep.

>> But: On a brand new raspberry pi3 (kernel 4.1.19-v7), same wire, I
>> guess it is either not compiled with IPV6_SUBTREES or the
>> autodetection is broken because I don't get "default from" routes, but
>> the routes inserted in the "ip -6 rules" table.
>
> Look at kernel_older_than in kernel.c, and test this function on your system.
> (You may also want to see kernel_has_ipv6_subtrees in kernel_netlink.c)

I checked and IPV6_SUBTREES is disabled in the raspbian kernel build. I filed a bug.

https://github.com/raspberrypi/linux/issues/1370

It is straightforward to build a new kernel whenever I get around to it. (There are more than a few 4.1 kernel features I'd like to see in there also; I imagine if I asked them for pie on pi they might get a kick out of it.)

>> So, to the weird part. A path that is
>> routerA <-ethernet-> rpi3 <-usbwifi-> routerB
>>
>> routerB does not get the "default from" routes.
>
> Aren't they even sent on the link ? (tcpdump from rpi)

rpi3, c2, c1, x86 box are on the same switch. I will go dump the wifi side of the pi.

>> an odroid c1, kernel 3.10.80 does not create the rules tables nor
>> default from.
>
> Are you able to create them manually ?

Yes, on the pi3, c1 and c2.

> sudo ip -6 rule add from 2001:bd8::/48 prio 42 table 20
> ip -6 rule show
> sudo ip -6 rule del prio 42
>
>> I honestly don't know what mainline kernel was "good"
>> for source specific routing
>
> It should be good with any (linux) kernel. (I would be surprised if you have
> no support for traffic engineering rules.)

All of them seem compiled without IPV6_SUBTREES. All of them have ip rule support. The rpi3 creates rules, the c1 and c2 do not get rules, and the rpi3 does not seem to be forwarding default from.

I can certainly grant access to them (via ipv6) if you want to play, or if anyone wants a belated christmas present? :)

Of these it was the rpi3 that was the most exciting thing to play with, an ideal cheap meshy router platform, given the onboard wifi did adhoc... followed closely by the c2 (64 bit arm, 50 bucks, wow). The c2 "feels" *way* faster than the rpi3 - but some major features like graphics and sound don't work as yet.

Are there any well known usb sticks that work with adhoc?

> Matthieu
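A minimal sketch of checking for this option from userspace, assuming the kernel config is readable as text somewhere (for instance /proc/config.gz or /boot/config-<release>, neither of which is guaranteed to exist on a given board). The helper names are made up for illustration; this is not how babeld detects the feature, which lives in kernel_netlink.c.

```python
import gzip
import os


def kernel_option_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines are comments and count as disabled.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key == option:
            return value in ("y", "m")
    return False


def read_running_kernel_config():
    """Try the usual locations; return None when none is readable."""
    candidates = ["/proc/config.gz", "/boot/config-" + os.uname().release]
    for path in candidates:
        try:
            if path.endswith(".gz"):
                with gzip.open(path, "rt") as f:
                    return f.read()
            with open(path) as f:
                return f.read()
        except (OSError, EOFError):
            continue
    return None


if __name__ == "__main__":
    text = read_running_kernel_config()
    if text is None:
        print("kernel config not available on this system")
    else:
        print("CONFIG_IPV6_SUBTREES enabled:",
              kernel_option_enabled(text, "CONFIG_IPV6_SUBTREES"))
```

On a box where neither file exists, the only reliable answer is still the vendor's build config, as in the bug report above.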
Re: [Babel-users] Ready for 1.8.0?
A) I take it that "diversity routing" now matches the draft but does not use the right stuff to get the channel(s)? /me hides. I know, patches gladly accepted, (but ooh, have we made some progress on bufferbloat in wifi lately)

B) I have been staring at a puzzling thing for the last few days without time to look at it harder.

my gateway (routerA) is a source specific babel, without a version number but from an openwrt build about 2 months back. RouterB is a cerowrt box.

I just updated everything ELSE to your latest commit.

On an x86 box I am seeing all but a default ipv6 route supplied by babel. Could be a config error on my part...

But: On a brand new raspberry pi3 (kernel 4.1.19-v7), same wire, I guess it is either not compiled with IPV6_SUBTREES or the autodetection is broken because I don't get "default from" routes, but the routes inserted in the "ip -6 rules" table.

So, to the weird part. A path that is

routerA <-ethernet-> rpi3 <-usbwifi-> routerB

routerB does not get the "default from" routes.

C) Having more fun with kernel versions and a ton o new, cheap hardware:

an odroid c1, kernel 3.10.80 does not create the rules tables nor default from. I honestly don't know what mainline kernel was "good" for source specific routing...

odroid c2 (64 bit arms, 50 bucks!), which (sadly) ships with 3.14.29 - I get no default from or rules table either.

Grump. Whose job is it to convince everybody to compile in this functionality?

D) There is perhaps something you can do about valid_lft vs preferred_lft?

Boy, can a machine get a lot of ipv6 addresses, seemingly 6 in my installation. BUT! two have a preferred_lft of 0sec. ex:

inet6 2601:645:4103:56c0:31c4:ae34:38b9:d758/64 scope global temporary deprecated dynamic
   valid_lft 248076sec preferred_lft 0sec

All 6 IPs are exported via babel, and I assume that once 248076 seconds expire, babel will withdraw the route.

For some reason, with 4 other IPs available, I wouldn't mind if it exported the "best" ones out of each /64 somehow. /me ducks

E) commit 8f422e0ebc2d3a0bf556611133a96b5e0a16bdb2 looks "interesting"... what problems did it induce?

Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi

On Wed, Mar 23, 2016 at 6:18 PM, Juliusz Chroboczek wrote:
> I've just fixed the diversity extension to comply with version -01 of the
> Internet-Draft. Since I haven't heard any objections about the new
> configuration interface, I'm wondering if we're ready for 1.8.0.
>
> (The obvious missing feature is the ability to edit filtering rules at
> runtime, but the user-interface is not obvious. Adding Quagga-style
> route-maps seems overkill to me.)
>
> -- Juliusz
Re: [Babel-users] Some thoughts about Babel (feature requests)
I guess I am puzzled about the need for tunnels in the architecture. (I found the usage of "vpn" confusing; to me a vpn offers additional features like encryption.)

Do you connect to each node (all 1400 of them?) to gather data via nodewatcher? (Why not just have a dedicated "control" port?) Or are the vpns merely there to give you a single address space in the case of a partitioned network, with the other external links holding it together?

... somewhat unrelated ...

I'm dying to know if ipv6 packets can transit all these different product and link types from one edge of the network to the other.
[Babel-users] [RFC patch] linux perf support for libmusl
Since there is a need for profiling babel and other subsystems (in my case, cake) on tiny systems, the attached patch has been sitting in my tree for a while, and is supplied here in the hope it will be useful while its problems are sorted out...

... openwrt: Add support for musl in the linux kernel perf utility

The linux kernel's "perf" profiling utility has a hard dependency on glibc's strerror_r, which has a notorious incompatibility with other posix standards. This patch wraps the musl posix std strerror_r with something that behaves like glibc strerror_r - and currently breaks glibc support, as I was unable to figure out a combination of defines that would work mutually across both C libraries.

My current take on it is that linux mainline perf should switch to using strerror_l, which is std and known to be thread safe, and would result in less code.

The patch only supports linux-4.4 but is easily extended backwards. H/T to stephen walker for the initial attempt, nbd for a refinement. Untested on anything but arm at present.

From 866299c8c0902c8e21eee7b0f54c3abd74feb494 Mon Sep 17 00:00:00 2001
From: CeroWrt Admin <dave.t...@bufferbloat.net>
Date: Thu, 17 Dec 2015 12:49:36 +0100
Subject: [PATCH] openwrt: Add support for musl in the linux kernel perf utility

The linux kernel's "perf" profiling utility has a hard dependency on glibc's strerror_r which is a notorious incompatability with other posix standards. This patch wraps the musl std strerr_r with something that behaves like glibc strerror_r - and currently breaks glibc support as I was unable to figure out a combination of defines that would work mutually across both C libraries.

My current take on it is that linux mainline perf should switch to using strerror_l which is std and known to be thread safe.

The patch only supports linux-4.4x but is easily extended backwards. H/T to stephen walker for the initial attempt.
--- package/devel/perf/Makefile| 2 +- .../patches-4.4/280-perf-fixes-for-musl.patch | 143 + 2 files changed, 144 insertions(+), 1 deletion(-) create mode 100644 target/linux/generic/patches-4.4/280-perf-fixes-for-musl.patch diff --git a/package/devel/perf/Makefile b/package/devel/perf/Makefile index 5e3d63f..27ef7b8 100644 --- a/package/devel/perf/Makefile +++ b/package/devel/perf/Makefile @@ -19,7 +19,7 @@ include $(INCLUDE_DIR)/package.mk define Package/perf SECTION:=devel CATEGORY:=Development - DEPENDS:= @USE_GLIBC +libelf1 +libdw +libpthread +librt +binutils + DEPENDS:=+libelf1 +libdw +libpthread +librt +binutils TITLE:=Linux performance monitoring tool VERSION:=$(LINUX_VERSION)-$(PKG_RELEASE) URL:=http://www.kernel.org diff --git a/target/linux/generic/patches-4.4/280-perf-fixes-for-musl.patch b/target/linux/generic/patches-4.4/280-perf-fixes-for-musl.patch new file mode 100644 index 000..029ac7d --- /dev/null +++ b/target/linux/generic/patches-4.4/280-perf-fixes-for-musl.patch @@ -0,0 +1,143 @@ +From 9673ba5369408008deef840e21edab3fa7a575fd Mon Sep 17 00:00:00 2001 +From: Dave Taht <dave.t...@bufferbloat.net> +Date: Tue, 15 Dec 2015 17:19:14 +0100 +Subject: [PATCH] perf fixes for musl + +--- + tools/lib/api/fs/tracing_path.c| 3 +++ + tools/lib/traceevent/event-parse.c | 4 + tools/perf/perf.c | 17 - + tools/perf/util/cache.h| 2 +- + tools/perf/util/cloexec.c | 4 + tools/perf/util/cloexec.h | 4 + tools/perf/util/util.h | 4 + 7 files changed, 28 insertions(+), 10 deletions(-) + +diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c +index a26bb5e..a04df38 100644 +--- a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c +@@ -10,6 +10,9 @@ + #include "fs.h" + + #include "tracing_path.h" ++/* musl has a xpg compliant strerror_r by default */ ++#define strerror_r(err, buf, buflen) \ ++(strerror_r(err, buf, buflen) ? 
NULL : buf) + + + char tracing_mnt[PATH_MAX] = "/sys/kernel/debug"; +diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c +index 2a912df..0644b42 100644 +--- a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c +@@ -36,6 +36,10 @@ + #include "event-parse.h" + #include "event-utils.h" + ++/* musl has a xpg compliant strerror_r by default */ ++#define strerror_r(err, buf, buflen) \ ++(strerror_r(err, buf, buflen) ? NULL : buf) ++ + static const char *input_buf; + static unsigned long long input_buf_ptr; + static unsigned long long input_buf_siz; +diff --git a/tools/perf/perf.c b/tools/perf/perf.c +index 3d4c7c0..91f57b0 100644 +--- a/tools/perf/perf.c b/tools/perf/perf.c +@@ -523,6 +523,21 @@ void pthread__unblock_sigwinch(void) + pthread_sigmask(SIG_UNBLOCK, , NULL); + } + ++unsigned cache_line_size(void); ++ ++uns
Re: [Babel-users] Detecting bridges
The resulting nanog conversation on detecting wireless bridges ended up interesting, with several clever techniques proposed, all probably futile.

http://mailman.nanog.org/pipermail/nanog/2015-December/082902.html

I fear the default for babel should become etx or rtt, as most of the world bridges wifi to ethernet.
Re: [Babel-users] Detecting bridges
On Mon, Dec 14, 2015 at 8:44 PM, Juliusz Chroboczek wrote:
> Wrong thread?

Nope.

>> Is there a reliable way of determining that an underlying interface is
>> a bridge?
>
> https://github.com/jech/babeld/blob/master/kernel_netlink.c#L723
>
> However, this only works for the Linux software bridge. It doesn't work
> for the hardware switches built into your favourite router, and of course
> it doesn't work for switches connected over Ethernet, which is what the
> WLAN-SI people are using.

I was basically wondering if there was something like an igmp message that asked if this "wire" was bridged to anything.

The default outside of babel towers and the yurtlab is to bridge wifi to ethernet.

> -- Juliusz
[Babel-users] latency in WLAN-SI
I tend to fork convos, sorry. On Tue, Dec 15, 2015 at 12:33 AM, Mitarwrote: > Hi! > > On Mon, Dec 14, 2015 at 11:39 AM, Juliusz Chroboczek > wrote: >> I'd like more evidence that this is needed. Estimating packet loss is >> very slow (since we're computing a metric from what is just a discrete, >> one-bit signal), so it slows down convergence. Hopefully we can get away >> without it. > > Hm, isn't computation on WiFi links exactly the same? > >> Does your RTT increase at the same time as packet loss? If so, we could >> probably do without packet loss. > > Not really. At least on the fiber links (which are most of our VPN > links) it does not. > >> (Recall that the goal is not to have an accurate model of the real >> world -- the goal is to have traffic flow according to optimal paths. If >> the traffic is going where you want it to go, there's no need to add more >> complexity to babeld.) > > Currently it seems that the routes over VPN are dropped while we would > prefer them to stay up, even if there is a slight packet loss. > >>> Why have you disabled packet-loss metric on VPN links? >> >> Because it's an experimental feature, that hasn't had enough real-world >> deployment. It works beautifully in our tests, it works beautifully in >> Nexedi's network, and if it works as well in your network, I'll enable it >> by default. > > Hm, are we talking here about packet-loss or delay-based routing > (RTT)? I understand that RTT metric is experimental, but I was talking > about packet-loss, why is that not enabled. Or am I missing something? > >> First, it slows down reaction to link failures. If you're on an Ethernet, >> and you lose two packets in a row, you can be pretty sure the link is >> down > > Or we have a very short buffer. ;-) You certainly know how to tweak me! I'd hope that your fiber nodes were fast enough to never need stuff like fq_codel and sqm-scripts configured, but that would need testing and confirmation. 
I note that recently a change to fq_codel's default behavior landed in openwrt, changing the quantum from 300 to 1514. This really should not affect anything but really slow links (say, < 4mbit), and fq_codel was never the right thing for anything but p2p wifi anyway.

> But yes, maybe our recent instability in VPN links is more to the
> problems with routing we have, then really link instability.
>
> But we do have VPN links which go between countries. We have observed
> really crazy stuff on for example links between Croatia and Slovenia.
> Sometimes extra 100 ms appears on the link, because they have some
> issues at Internet exchanges, for example (so delay is added at the
> Internet exchange).

Do you have something like smokeping configured? I would love to see data on not just the vpn issues but on the latencies across the mesh.

The latency under load data toke showed at battle mesh for how linux wifi behaves under load these days does also apply in the real world, but perhaps less... or much more, depending on the quality of the link. Yes, I have seen 10s of seconds of delay...

In lieu of deploying smokeping everywhere I've been thinking of adding a lightweight latency test to flent, where we just test lightweight udp and icmp continuously with, say, a 1 sec or even 60 sec period, to dozens or hundreds of hosts, for hours at a time, all the time.

Smokeping's basic plot is good, but flent can do a better job of aggregating data across more variables.

Another approach would be to embed timestamps in the needed overhead traffic (be that babel or other) and get everything synced up via ntp...

As best as I recall your vpn was pure udp, no crypto, no retries.

I see in your (nice!) web interface that you track if a node is reachable, but not an observed rtt.

> I do not think that VPN links should be seen as Ethernet. For Ethernet
> I agree that if you loose two packets you have probably issues.
But > for VPN you have stuff in between, from bad ISPs, to MTU issues which > make some packets get lost (especially while PMTU is in progress). Could you clarify the behavior of your vpn? vpn's over tcp will never lose packets. vpns that do crypto tend to bottleneck on the crypto step and drop packets in the read side of the socket. nearly anything using the tun/tap interfaces tends to be slow, the recent Foo over UDP stuff corrects some of that: https://www.netdev01.org/docs/herbert-UDP-Encapsulation-Linux.pdf >> Second, the link quality estimator uses ETX, which is optimised for >> multicast Hellos over WiFi links (it's quadratic in loss rate). >> A different formula should be used for lossy wired links and for unicast >> wireless tunnels. (But then, perhaps ETX works well enough on tunnels -- >> I have no idea.) > > We have been using ETX with OLSRv1 on tunnels without visible issues. > > What do you use for ETX? ETX = 1 / (d_f x d_r) is for unicast (as > described in the A High-Throughput Path Metric for Multi-Hop
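For reference, a minimal sketch of the ETX formula quoted above, ETX = 1 / (d_f x d_r), where d_f and d_r are the forward and reverse delivery ratios. The function name is an assumption for illustration, and this is only the textbook formula, not babeld's actual cost computation. With symmetric loss rate p, d_f = d_r = 1 - p, so ETX = 1 / (1 - p)^2, which is the sense in which it is quadratic in the loss rate.

```python
def etx(d_f, d_r):
    """Expected transmission count from forward and reverse delivery ratios.

    d_f: fraction of packets that arrive in the forward direction (0..1)
    d_r: fraction of packets that arrive in the reverse direction (0..1)
    """
    if not (0.0 <= d_f <= 1.0 and 0.0 <= d_r <= 1.0):
        raise ValueError("delivery ratios must be in [0, 1]")
    if d_f == 0.0 or d_r == 0.0:
        # Nothing gets through in at least one direction: the link is unusable.
        return float("inf")
    return 1.0 / (d_f * d_r)


if __name__ == "__main__":
    for loss in (0.0, 0.1, 0.5):
        ratio = 1.0 - loss
        print("loss", loss, "etx", etx(ratio, ratio))
```

A slight loss on an otherwise good tunnel therefore costs relatively little (10% loss each way gives roughly 1.23), while 50% loss each way already quadruples the cost.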
Re: [Babel-users] Detecting bridges
On Tue, Dec 15, 2015 at 11:00 AM, Henning Rogge <hro...@gmail.com> wrote:
> On Tue, Dec 15, 2015 at 10:35 AM, Dave Taht <dave.t...@gmail.com> wrote:
>> On Mon, Dec 14, 2015 at 8:44 PM, Juliusz Chroboczek
>>>> Is there a reliable way of determining that an underlying interface is
>>>> a bridge?
>>>
>>> https://github.com/jech/babeld/blob/master/kernel_netlink.c#L723
>>>
>>> However, this only works for the Linux software bridge. It doesn't work
>>> for the hardware switches built into your favourite router, and of course
>>> it doesn't work for switches connected over Ethernet, which is what the
>>> WLAN-SI people are using.
>>
>> I was basically wondering if there was something like an igmp message that
>> asked if this "wire" was bridged to anything.
>>
>> The default outside of babel towers and the yurtlab is to bridge wifi
>> to ethernet.
>
> Maybe something like this?
>
> https://en.wikipedia.org/wiki/Link_Layer_Discovery_Protocol
>
> Not sure how well it is supported by consumer grade switches.

Well, while I have seen those go by, along with stp, in fiddling with bridging and unbridging a test openwrt box (linksys ac1200) today I do not even see BPDUs with wireshark...

https://en.wikipedia.org/wiki/Spanning_Tree_Protocol

deep magic, long hidden.

My thought was to put out a query over a protocol like this, and/or learn something about the bridging topology, passively, just as the switches do.

> Henning Rogge
Re: [Babel-users] latency in WLAN-SI
In poring through the astonishing *wealth* of data available via nodewatcher, I finally scrolled down to the chart next to the very bottom to find rtt measurements.

So are you really seeing real-world peaks in the 7 second range, or is that an artifact of something else? Zoomed in:

http://snapon.cs.kau.se/~d/peaks_7_sec.png

https://nodes.wlan-si.net/node/e6668d55-5e8d-4e32-94e3-ea8e9c23e5f5/
Re: [Babel-users] Bucket full, dropping packet
On Sun, Dec 13, 2015 at 2:52 PM, Juliusz Chroboczek wrote:
>> Ok, I can do some profiling on the babeld that is running on the VPN
>> server with the large number of links. Just tell me what profiling data
>> do you want? Should I just compile a debug build and run babeld through
>> callgrind or do you have something else in mind?
>
> I'm not familiar with callgrind -- I've had both results with both "perf
> record" and gprof. But yes, callgrind should be fine.
>
> I need to find out where the CPU time is going. I suspect either the
> quadratic loop in xroute.c, or linear-time route selection in route.c.
> I intend to fix both, but I'd like to be sure.
>
>> Yes, you only need to establish a VPN connection to our server using
>> tunneldigger-client [1] (it compiles on Debian) and run babeld on the
>> VPN interface. We only need to allocate an IPv4 address for you so there
>> will be no conflicts.
>
> Ok, I'll see on Monday if I can get an extra VM before Christmas.

I have a half dozen machines all over the world, courtesy of linode. Can spin a new one up for you in a matter of minutes.

> -- Juliusz
Re: [Babel-users] Bucket full, dropping packet
On Tue, Dec 8, 2015 at 5:58 PM, Jernej Kos wrote:
> Hello!
>
> On 07. 12. 2015 17:14, Juliusz Chroboczek wrote:
>> Yes, that's expected. Please increase the limits, be bold, multiply them
>> by 20.
>
> It seems that raising the limits solved the problem. Thanks!
>
> We are still on the lookout for unparsable packets ;-)

I would like to see someone working on a babel fuzzer, or does someone know of a tool that could generate tons of packets bogus in every way possible?

> Jernej
Re: [Babel-users] Low level rewrite, install filters
In the march towards 1.7...

0) What other features do you plan for 1.7?

1) It is not clear to me if this does atomic route updates instead of delete/add?

2) Can I try to get the "version" number into the telnet interface and command line in this go around?

3) Fixing up wifi channel awareness to use the current linux netlink api (trying to find someone to take that on).

On Thu, Nov 12, 2015 at 5:50 PM, Juliusz Chroboczek wrote:
> Mitar, Kosko,
>
> Matthieu has just done a tremendous job of cleaning up the lower layers of
> babeld. One of the effects is that export filters are now known as
> install filters, so you say something like
>
> install ip ::/0 le 0 table 42
>
> I've taken the risk of pushing this patch series into trunk, please test.
> If you're using babeld in production and don't need the install filtering
> functionality, please stick to the tag babeld-1.6.3 for now.
>
> This will become babeld-1.7.0 at some point. I might create a 1.6 branch
> if it proves unstable.
>
> Please, please, please test.
>
> -- Juliusz
[Babel-users] Last call for signatures to the FCC on the wifi lockdown issue
The CeroWrt project's letter to the FCC on how to better manage the software on wifi and home routers vs some proposed regulations is now in last call for signatures. The final draft of our FCC submittal is here:

https://docs.google.com/document/d/15QhugvMlIOjH7iCxFdqJFhhwT6_nmYT2j8xAscCImX0/edit?usp=sharing

The principal signers (Dave Taht and Vint Cerf), are joined by many network researchers, open source developers, and dozens of developers of aftermarket firmware projects like OpenWrt.

Prominent signers currently include: Jonathan Corbet, David P. Reed, Dan Geer, Jim Gettys, Phil Karn, Felix Fietkau, Corinna "Elektra" Aichele, Randell Jesup, Eric S. Raymond, Simon Kelly, Andreas Petlund, Sascha Meinrath, Joe Touch, Dave Farber, Nick Feamster, Paul Vixie, Bob Frankston, Eric Schultz, Brahm Cohen, Jeff Osborn, Harald Alvestrand, and James Woodyatt.

If you would like to join our call for substituting sane software engineering practices over misguided regulations, the window for adding your signature to the letter closes at 11:59AM ET, today, Friday, 2015-10-08.

Sign via webform here: http://goo.gl/forms/WCF7kPcFl9

We are at approximately 170 signatures as I write.

For more details on the controversy we are attempting to address, or to submit your own filing to the FCC see:

https://libreplanet.org/wiki/Save_WiFi
https://www.dearfcc.org/

Sincerely,

Dave Täht
CeroWrt Project Architect
Tel: +46547001161
Re: [Babel-users] [PATCH] Add option to consider sysctl write failures as non-fatal.
Well, it would be better if babel checked to see if the (sometimes read-only) sysctl value was already correct, instead of blithely trying to write it. On Mon, Aug 10, 2015 at 9:43 AM, Jernej Kos jer...@kos.mx wrote: Hello! +1 for this patch. We are also running babeld in a Docker container and this requires us to run it as a privileged container due to some Docker deficiencies in setting network sysctls inside containers. Jernej On 10. 08. 2015 18:02, Toke Høiland-Jørgensen wrote: Babeld will exit with a fatal error if it is unable to write sysctls. When running in a container, however, /proc/sys may be mounted read-only, which causes babeld to fail. This adds a switch to consider sysctl failures as non-fatal, in which case a warning will be issues rather than having the daemon fail to start. Signed-off-by: Toke Høiland-Jørgensen t...@toke.dk --- babeld.c | 8 ++-- babeld.h | 1 + babeld.man | 12 configuration.c | 3 +++ kernel_netlink.c | 7 ++- 5 files changed, 28 insertions(+), 3 deletions(-) diff --git a/babeld.c b/babeld.c index 943f042..45b14fa 100644 --- a/babeld.c +++ b/babeld.c @@ -67,6 +67,7 @@ int default_wired_hello_interval = -1; int resend_delay = -1; int random_id = 0; int do_daemonise = 0; +int sysctl_nonfatal = 0; const char *logfile = NULL, *pidfile = /var/run/babeld.pid, *state_file = /var/lib/babel-state; @@ -128,7 +129,7 @@ main(int argc, char **argv) while(1) { opt = getopt(argc, argv, - m:p:h:H:i:k:A:srR:uS:d:g:lwz:M:t:T:c:C:DL:I:); + m:p:h:H:i:k:A:srR:uS:d:g:lwz:M:t:T:c:C:DL:I:F); if(opt 0) break; @@ -263,6 +264,9 @@ main(int argc, char **argv) case 'D': do_daemonise = 1; break; + case 'F': +sysctl_nonfatal = 1; +break; case 'L': logfile = optarg; break; @@ -800,7 +804,7 @@ main(int argc, char **argv) [-t table] [-T table] [-c file] [-C statement]\n -[-d level] [-D] [-L logfile] [-I pidfile]\n +[-d level] [-D] [-F] [-L logfile] [-I pidfile]\n [id] interface...\n, argv[0]); diff --git a/babeld.h b/babeld.h index 92ce9b5..c1f26fe 100644 --- 
a/babeld.h +++ b/babeld.h @@ -86,6 +86,7 @@ extern time_t reboot_time; extern int default_wireless_hello_interval, default_wired_hello_interval; extern int resend_delay; extern int random_id; +extern int sysctl_nonfatal; extern int do_daemonise; extern const char *logfile, *pidfile, *state_file; extern int link_detect; diff --git a/babeld.man b/babeld.man index ec600c2..7182f30 100644 --- a/babeld.man +++ b/babeld.man @@ -135,6 +135,11 @@ Specify a configuration statement directly on the command line. .B \-D Daemonise at startup. .TP +.B \-F +Don't consider failures writing to sysctl as fatal. Warn of the failures, but +continue running. This is useful when running babeld in a container that mounts +/proc/sys as read-only (as systemd-nspawn does, for instance). +.TP .BI \-L logfile Specify a file to log random ``how do you do?'' messages to. This defaults to standard error if not daemonising, and to @@ -253,6 +258,13 @@ This specifies whether to daemonize at startup, and is equivalent to the command-line option .BR \-D . .TP +.BR sysctl-nonfatal { true | false } +This controls whether to consider failures writing to sysctl as fatal. If set, +warn of the failures, but continue running. This is useful when running babeld +in a container that mounts /proc/sys as read-only (as systemd-nspawn does, for +instance). Equivalent to the command line option +.BR \-F . 
+.TP .BI state-file filename This specifies the name of the file used for preserving long-term information between invocations of the diff --git a/configuration.c b/configuration.c index 6a9c09d..571e220 100644 --- a/configuration.c +++ b/configuration.c @@ -691,6 +691,7 @@ parse_option(int c, gnc_t gnc, void *closure, char *token) strcmp(token, link-detect) == 0 || strcmp(token, random-id) == 0 || strcmp(token, daemonise) == 0 || + strcmp(token, sysctl-nonfatal) == 0 || strcmp(token, ipv6-subtrees) == 0 || strcmp(token, reflect-kernel-metric) == 0) { int b; @@ -706,6 +707,8 @@ parse_option(int c, gnc_t gnc, void *closure, char *token) random_id = b; else if(strcmp(token, daemonise) == 0) do_daemonise = b; +else if(strcmp(token, sysctl-nonfatal) == 0) +sysctl_nonfatal = b; else if(strcmp(token, ipv6-subtrees) == 0) has_ipv6_subtrees = b; else if(strcmp(token,
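The suggestion at the top of this message, reading the sysctl first and writing only when the value actually differs, can be sketched as follows. This is an illustration in Python rather than babeld's C, under the assumption that sysctls are exposed as plain files such as /proc/sys/net/ipv6/conf/all/forwarding; the function name and return strings are made up for this example and are not part of babeld.

```python
import os
import tempfile


def ensure_sysctl(path, wanted):
    """Write `wanted` to a sysctl file only if it differs from the current value.

    Returns "unchanged", "written" or "failed" so the caller can decide
    whether a failure is fatal (the patch's -F flag) or just worth a warning.
    """
    try:
        with open(path) as f:
            current = f.read().strip()
    except OSError:
        current = None  # unreadable: fall through and attempt the write
    if current == str(wanted):
        # Already correct, so a read-only /proc/sys is not a problem.
        return "unchanged"
    try:
        with open(path, "w") as f:
            f.write(str(wanted) + "\n")
    except OSError:
        return "failed"
    return "written"


if __name__ == "__main__":
    # Demonstrate on a scratch file instead of touching /proc/sys.
    scratch = os.path.join(tempfile.mkdtemp(), "forwarding")
    with open(scratch, "w") as f:
        f.write("1\n")
    print(ensure_sysctl(scratch, 1))
```

With this ordering, a container that mounts /proc/sys read-only but already has forwarding enabled would start without needing any override.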
Re: [Babel-users] Fwd: Specify router-id explicitly
I just did a pull of your head, built it, started a new babel... waited 2 minutes (I was assuming the calculation for the routerid had changed?)... and did not get a working route to elsewhere until I tried talking to another route. (note I did not update the other 2 babelds in operation on this network from the old openwrt babels)

I get working routes locally, but the next box upstream refused any connection.

d@ganesha:~/git/babeld$ ip route
default via 172.26.22.224 dev wlan0 proto babel onlink
76.102.227.25 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.0.0/16 dev wlan0 proto kernel scope link src 172.26.65.1
172.26.16.0/24 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.16.1 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.17.0/24 via 172.26.18.224 dev wlan0 proto babel onlink
172.26.17.1 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.18.0/24 dev wlan0 proto kernel scope link src 172.26.18.225
172.26.19.0/24 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.20.0/24 via 172.26.18.224 dev wlan0 proto babel onlink
172.26.22.224 via 172.26.22.224 dev wlan0 proto babel onlink
172.26.128.0/27 dev eth0 proto kernel scope link src 172.26.128.18
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.202

d@ganesha:~/git/babeld$ ssh root@172.26.22.224

fails.

Weirdness - after I tried connecting to here

traceroute to 172.26.19.1 (172.26.19.1), 30 hops max, 60 byte packets
 1 172.26.22.224 102.543 ms 102.529 ms 103.409 ms # 100ms
 2 172.26.19.1 104.586 ms 107.553 ms 107.572 ms

stuff started to work.

I went back to the older version and got the same result for over 2 minutes but it did eventually work. Then I went back to the newer one waiting over 2 minutes... and that too, eventually did work.

running ubuntu 3.19.0-15-generic, ath9k wifi. Babeld.conf is:

ipv6-subtrees true
out if vpn6 ip 0.0.0.0/0 deny
in if vpn6 ip 0.0.0.0/0 deny

I am going to put this down to a bug not in babel. (100ms to my next hop? wtf?)

And I am getting on a plane transiting paris in the next few minutes. See you in prague! ~

On Fri, Jul 10, 2015 at 11:53 AM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:
> This means that all the interfaces on a node will have a link-local address?
> That's a very bad idea.
>
> This was meant to say the same link-local address.

-- 
Dave Täht
worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat
And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
[Babel-users] redistribute local prune?
Instead of adding a covering route, and explicitly denying other routes from being distributed, would it be possible to have a smarter filter with syntax like

redistribute local prune

That would cut something like this down from:

2601:696:8300:2cb0:dc9f:abff:fe06:1a24 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024
2601:696:8300:2cb0::/64 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024
2601:696:8300:2cb1::/64 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024
2601:696:8300:2cb2::/64 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024
2601:696:8300:2cb0::/60 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024

to just

2601:696:8300:2cb0::/60 via fe80::28c6:8eff:febb:9ff0 dev eth0 proto babel metric 1024

-- 
Dave Täht
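A minimal sketch of what such pruning could compute, assuming the intent is simply "drop every prefix that is already covered by a shorter prefix in the same set". The "prune" keyword is only the syntax proposed in this message, not an existing babeld option, and the function name is made up for illustration.

```python
import ipaddress


def prune_covered(prefixes):
    """Drop every prefix that is contained in another, shorter prefix in the list."""
    # Shortest prefixes first, so any covering route is seen before what it covers.
    nets = sorted({ipaddress.ip_network(p) for p in prefixes},
                  key=lambda n: (n.version, n.prefixlen))
    kept = []
    for net in nets:
        covered = any(net.version == k.version and net.subnet_of(k) for k in kept)
        if not covered:
            kept.append(net)
    return [str(n) for n in kept]


if __name__ == "__main__":
    routes = [
        "2601:696:8300:2cb0:dc9f:abff:fe06:1a24",
        "2601:696:8300:2cb0::/64",
        "2601:696:8300:2cb1::/64",
        "2601:696:8300:2cb2::/64",
        "2601:696:8300:2cb0::/60",
    ]
    print(prune_covered(routes))
```

Fed the five prefixes from the routing table above, only the /60 survives. Note that this ignores metrics and next hops; a real filter would have to decide whether a covered route with a different next hop still needs announcing.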
[Babel-users] a cautionary note on setting up new babel nodes
As I am rolling out a bunch of new babel nodes, I decided to get a cluster (2 nanos and a pico) up in the lab, where I have good connectivity to the rest of the network, to replace an aging cluster by the pool.

So I booted it up and configured it for the right channels and a new set of ip addresses... didn't have good LED support at all (RSSI does not seem to do anything)... I got blinkenlights to sort of work, and they were lit up, kind of solid, for some reason... [1]

...people started wandering by to complain about the network... naturally I didn't notice because I was even closer to the exit points than anyone else...

...to discover that I was offering the shortest path to the exit nodes, and thus had bypassed the two existing ~50mbit links into lab links that were located indoors and going through a thousand+ meters of trees... that was barely doing a megabit with 800+ms of delay. (channel diversity not working did not help either)

After that experience, I decided that I would make the firmware for unconfigured nodes export a 512 metric, and use a high rxcost until they were fully configured AND in place. I might disable ipv4 entirely in favor of the autoconfigured ula openwrt has, and just start configuring stuff based on the appearance of new ulas in the network.

[1] if you come up with a useful LED config for nanostations and picostations, let me know.

-- 
Dave Täht
Re: [Babel-users] Diversity routing in a many channeled world
The iw package appears canonical and BSD-style (ISC) licensed. It uses nl80211.h (exported from the Linux kernel, under the same style of license), and it takes 3 netlink calls to derive the channel and other info.

http://pastebin.com/JXqcLNfX

root@davedesk2:~# strace -f -v iw dev wlan0 info 2> netlink.txt
Interface wlan0
        ifindex 13
        wdev 0x2
        addr 04:a1:51:a3:13:04
        ssid davedesk2-test
        type AP
        wiphy 0
        channel 11 (2462 MHz), width: 20 MHz, center1: 2462 MHz

Regrettably I don't have time to poke into this further this month; I have musl breakages elsewhere (like snmp) to fix.

Are we at a point where I could just file a bug somewhere?
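As a stopgap while the netlink question is open, the channel can be scraped from the text that `iw dev <ifname> info` prints. This is only a sketch under that assumption: it is not how babeld obtains the channel, the helper names are made up, and the regular expression is written against the single output sample shown above, so other iw versions may format the line differently.

```python
import re
import subprocess

# Matches e.g. "channel 11 (2462 MHz), width: 20 MHz, center1: 2462 MHz"
IW_CHANNEL_RE = re.compile(
    r"^\s*channel\s+(\d+)\s+\((\d+)\s+MHz\),\s+width:\s+(\d+)\s+MHz",
    re.MULTILINE,
)


def parse_iw_info(text):
    """Return channel, frequency and width from `iw dev <if> info` output, or None."""
    match = IW_CHANNEL_RE.search(text)
    if match is None:
        return None
    channel, freq, width = (int(group) for group in match.groups())
    return {"channel": channel, "freq_mhz": freq, "width_mhz": width}


def read_iw_info(ifname):
    """Run iw for one interface and parse the result (needs iw installed)."""
    result = subprocess.run(
        ["iw", "dev", ifname, "info"],
        capture_output=True, text=True, check=True,
    )
    return parse_iw_info(result.stdout)


SAMPLE = """Interface wlan0
        ifindex 13
        wdev 0x2
        addr 04:a1:51:a3:13:04
        ssid davedesk2-test
        type AP
        wiphy 0
        channel 11 (2462 MHz), width: 20 MHz, center1: 2462 MHz
"""

if __name__ == "__main__":
    print(parse_iw_info(SAMPLE))
```

Scraping command output is fragile compared with talking nl80211 directly; the point is only to show which fields the AP interface does expose.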
[Babel-users] Diversity routing in a many channeled world
On Wed, Jun 17, 2015 at 3:38 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:
>> To somewhat answer my own question, when babel is used on a standalone AP interface, it is unable to automagically determine the channel it is on. So I guess whatever it uses can only figure out an adhoc interface?
>
> The message you're getting means that the SIOCGIWFREQ ioctl has failed. Could you please ask the netdev mailing list whether it's supposed to succeed on an AP, and whether I should be using a different API?

Well, linux-wireless would be better, and for sanity's sake I am currently on neither list. So I went to a definitive bit of source code instead:

apt-get source network-manager

This has a truly mindblowing number of abstractions piled on abstractions... and what I see in src/wifi are libs for wext and nl80211.

For wext, I see only SIOCSIWFREQ being used in network-manager. I have no idea how the two differ in actual intent?

#define SIOCSIWFREQ 0x8B04 /* set channel/frequency (Hz) */
#define SIOCGIWFREQ 0x8B05 /* get channel/frequency (Hz) */

nl80211 has this rather big dump format, but seems comprehensive. I would suspect nl80211 is more of what is needed nowadays... (this is an ath9k chip here)... or all 3.

Sort of on this topic is being able to distinguish between ht20, ht40, and the newer vht80, 160, 320 modes: babel supports multiple channels in the encoding, but I am not sure that is pulled out in the code. And on 2.4GHz, channel 3 interferes with both 1 and 6. Etc.

On to babel-1.6.2! :)

> -- Juliusz

-- Dave Täht
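The remark about channel 3 can be checked with a little arithmetic. The sketch below uses a deliberately simplified model: 2.4 GHz channels 1 to 13 have centers at 2407 + 5n MHz, and two channels are treated as interfering when their centers are closer than one channel width. This is an illustration only and makes no claim about how babeld's diversity code compares channels.

```python
def center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz channel (1..13)."""
    if not 1 <= channel <= 13:
        raise ValueError("only 2.4 GHz channels 1..13 are modeled here")
    return 2407 + 5 * channel


def interferes(a, b, width_mhz=20):
    """True if two channels of the given width overlap (simplified model)."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


if __name__ == "__main__":
    for a, b in [(1, 6), (3, 1), (3, 6), (6, 11)]:
        verdict = "interfere" if interferes(a, b) else "are clear of each other"
        print(f"channels {a} and {b} {verdict}")
```

Passing `width_mhz=40` gives a rough feel for why wider modes such as ht40 make the non-overlapping set even smaller, although real 40 MHz operation is centered differently from a single 20 MHz channel.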
Re: [Babel-users] ANNOUNCE: babeld-1.6.1
On Wed, Jun 17, 2015 at 1:28 AM, Gabriel Kerneis gabr...@kerneis.info wrote:
> On 2015-06-17 05:41, Dave Taht wrote:
>> I did notice, in finally attempting to switch off of the old babels package in openwrt to the new babel package updated for 1.6.1, that there is no option in openwrt to enable diversity routing anymore. Is this an intentional omission?
>
> It should work like any other babel option:
>
> config general
>         option diversity true
>         option diversity_factor 128
>
> Please let me know if this doesn't work.

1) Thx. I will try. Wouldn't I want to use 3 here, instead of true?

I realize (now) that the init script translates the whatever_whatever stuff into the babel command set, but it would be nice if the whole command set in openwrt was documented and commented out in the babeld.config file. Where I had gone wrong was grepping for -z and diversity in the babeld.init file...

2) Is there any reason why diversity should not default to true?

3) Some more configuration examples in babeld.conf and in the openwrt babeld.config file would be helpful. I am not attempting right now to use hnetd to further configure babel, and I do get a variety of /56s and /60s from my upstreams, so I was thinking that the src-specific routing options below might be good examples? (but have not tried them yet, so good is an option)

http://pastebin.com/Cmdh9YMM

4) I am also curious whether an old configuration trick of mine could be overridden successfully with the new babel /tmp/babeld.d/ configuration method.

In /etc/babeld.conf:

out if wan ip 0.0.0.0/0 deny
in if wan ip 0.0.0.0/0 deny

then later, in /tmp/babeld.d/100-yea-no-nat.conf:

out if wan ip 0.0.0.0/0 allow
in if wan ip 0.0.0.0/0 allow

My general assumption is not. In my case it is always safe to run babel over ipv6 on all interfaces, and sometimes not on ipv4. I am perfectly happy to just get a set of ipv6 routes first.

(No, I haven't gotten around to doing any of this; I barely got a compile working last night.)

> Best,
> -- Gabriel

-- Dave Täht
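On question 4, here is a sketch of why the later `allow` lines probably would not override the earlier `deny` lines. It rests on two assumptions that should be checked against the babeld man page and the openwrt init script: that filter rules are applied in the order they are read with the first matching rule deciding, and that files under the extra configuration directory are read after /etc/babeld.conf. The `Rule` class and `evaluate` function are hypothetical and model only the `in`/`out`, `if` and `ip` selectors used in the message.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    direction: str  # "in" or "out"
    ifname: str     # interface the rule applies to
    prefix: str     # value of the "ip" selector
    action: str     # "allow" or "deny"


def evaluate(rules, direction, ifname, route_prefix, default="allow"):
    """Return the action of the first rule matching the route (first match wins)."""
    route = ipaddress.ip_network(route_prefix)
    for rule in rules:
        net = ipaddress.ip_network(rule.prefix)
        if (rule.direction == direction
                and rule.ifname == ifname
                and route.version == net.version
                and route.subnet_of(net)):
            return rule.action
    return default


# Rules in the assumed read order: main config first, drop-in file second.
RULES = [
    Rule("out", "wan", "0.0.0.0/0", "deny"),
    Rule("in", "wan", "0.0.0.0/0", "deny"),
    Rule("out", "wan", "0.0.0.0/0", "allow"),  # never reached for IPv4
    Rule("in", "wan", "0.0.0.0/0", "allow"),   # never reached for IPv4
]

if __name__ == "__main__":
    print(evaluate(RULES, "out", "wan", "172.22.0.0/16"))
    print(evaluate(RULES, "in", "wan", "2601:696:8300:2cb0::/60"))
```

Under these assumptions the IPv4 routes stay denied on wan while IPv6 routes fall through to the default, which matches the "ipv6 first" preference stated above. Reversing the trick would mean putting the more permissive rule in a file that is read first.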
Re: [Babel-users] ANNOUNCE: babeld-1.6.1
Another configuration example... given that the openwrt firewall is default-deny:

config rule
        option name 'Allow-Babel'
        option family 'ipv6'
        option src 'wan'
        option dest_port '6696'
        option proto 'udp'
        option target 'ACCEPT'

-- Dave Täht
Re: [Babel-users] ANNOUNCE: babeld-1.6.1
I built this and configured it on three systems, no issues. I was about to attempt an openwrt build now that dnsmasq-2.73 has landed also... where...

I did notice, in finally attempting to switch off of the old babels package in openwrt to the new babel package updated for 1.6.1, that there is no option in openwrt to enable diversity routing anymore. Is this an intentional omission?

On Tue, Jun 16, 2015 at 3:27 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:
> Dear all,
>
> Babeld-1.6.1 is available from
>
> http://www.pps.univ-paris-diderot.fr/~jch/software/files/babeld-1.6.1.tar.gz
> http://www.pps.univ-paris-diderot.fr/~jch/software/files/babeld-1.6.1.tar.gz.asc
>
> This version fixes a buffer overflow which, while probably not exploitable, could in principle cause incorrect routing tables to be computed in the presence of source-specific routes. The other user-visible change is that ipv6-subtrees will be used by default on Linux 3.11 and later.
>
> Please upgrade.
>
> -- Juliusz
>
> 16 June 2015: babeld-1.6.1.
>
> * Fixed a buffer overflow in zone_equal. This is probably not exploitable, but might cause incorrect routing tables in the presence of source-specific routing.
> * Added support for defaulting ipv6-subtrees automatically based on the kernel version.
> * Fixed compilation under musl.

-- Dave Täht