[homenet] Experiences implementing Babel in the Bird routing daemon
Hi everyone,

Over the last couple of weeks, I've amused myself with doing a clean-slate implementation of the Babel protocol in the Bird routing daemon, and thought I'd report my experiences here.

I saw Juliusz' talk at the Babel side meeting in Prague (and again at Battlemesh), which is what convinced me that this was actually a viable project. Otherwise, I based my implementation on the RFC and did basic interoperability testing with the official babeld implementation.

Overall, I found the RFC clear and easy to follow. Section 2 gives a nice background on how the protocol works, and sections 3 and 4 give the details of how the implementation should work, in sufficient detail that the implementation can be done by referring only to those two sections. The details that are left up to the implementation have nicely suggested solutions in the appendices (which I used for my implementation).

The main thing that I found confusing in the text was the mention of 'id' in section 3.5; took me a while to realise that this was supposed to be the router id. In the rest of the document, this is quite explicit, but in sections 3.5.1 and 3.5.4 it is referred to simply as 'id'. This is technically defined in the text, but one has to go looking for it, so when flipping back and forth between the code and the document I found it somewhat confusing.

The second thing I would have liked to have available is some more guidance on how to ensure an implementation is actually compliant to the RFC. I.e. a test suite, or at least some description of what kind of edge cases to test (tricky topologies, that sort of thing). My testing so far has been fairly ad hoc, and I'm pretty sure there are still bugs in there.

The main part of the implementation took about a week, with another week to fix bugs and convince myself that it actually works as intended; this includes time to familiarise myself with the Bird daemon API and internal logic (but on the other hand, I didn't have to write any code to talk to the kernel). I by no means consider it a production-quality implementation at this point, but it does exchange routes with the babeld daemon and hasn't crashed my laptop yet :)

I've posted the implementation as a patch on the Bird list here: http://trubka.network.cz/pipermail/bird-users/2015-August/009855.html -- that email also describes the current limitations of the implementation. There's also a github repository with the complete commit history for those who want to amuse themselves with perusing my struggles: https://github.com/tohojo/bird

I hope this can serve as a useful data point in the assessment of the implementability of Babel. I started this mainly as a project to satisfy my own curiosity (the best way to understand a protocol is to implement it, and all that), but do plan to put in some effort to get it to a more robust state, depending on the feedback I get from the Bird developers :)

Cheers,
-Toke

___ homenet mailing list homenet@ietf.org https://www.ietf.org/mailman/listinfo/homenet
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
On 19 August 2015 at 13:53, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>> The assumption is that the user will want to receive traffic from the ISP. To do so, it needs to subscribe first (e.g. using MLD) on one of the WANs. One problem is that you don’t know which WAN. The solution used here is to subscribe on all WANs.
>
> Sorry if I'm being dense -- but does that mean that if multiple ISPs route multicast, you get duplication of traffic?

If they provide the same traffic (source and group), yes. Adding some signaling in the Proxy Controller protocol would be a fairly simple way to fix that problem. I do not plan to do that for now.

- Pierre
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
>> Could you please explain what problem you're solving with the SSMBIDIR extension?
>
> SSBIDIR is not very different than BIDIR. It still uses one single forwarding tree,

Thanks for the explanation. So what happens when there are multiple default routes?

>> What problem does the proxying business attempt to solve? And what does it use TCP for?

And what about that?

-- Juliusz
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
On 19 August 2015 at 13:31, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>>> Could you please explain what problem you're solving with the SSMBIDIR extension?
>>
>> SSBIDIR is not very different than BIDIR. It still uses one single forwarding tree,
>
> Thanks for the explanation. So what happens when there are multiple default routes?
>
>>> What problem does the proxying business attempt to solve? And what does it use TCP for?
>
> And what about that?

Oops, sorry about that. I missed this question. It is related to "So what happens when there are multiple default routes?" as well.

PIM-(SS)BIDIR does the routing inside the home network. The only route that BIDIR uses is the one to the RP-Address, which is inside the home network. So it does not use default routes.

The assumption is that the user will want to receive traffic from the ISP. To do so, it needs to subscribe first (e.g. using MLD) on one of the WANs. One problem is that you don’t know which WAN. The solution used here is to subscribe on all WANs. This subscription/forwarding is the proxy part.

Proxy *controller* is the process that allows routers to ask border routers to subscribe to given groups. In the HNCP case, a single router is elected on the RP-Link. This router will have knowledge of the home-wide-membership-state and will reflect that state by "asking" (through TCP connections) all border routers to subscribe to those groups.

I hope it is clear enough,

- Pierre
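
The proxy-controller behaviour described in this message can be pictured with a small sketch. It is only an illustration under stated assumptions, not pimbd's code: the function names and the dictionary shapes are hypothetical, and the TCP transport and scope filtering are left out. It shows the elected router taking the home-wide membership (the union of all joins) and asking every border router to subscribe to it.

```python
from typing import Dict, Iterable, Set


def home_wide_membership(per_router_groups: Dict[str, Iterable[str]]) -> Set[str]:
    """Union of all groups that any router in the home has listeners for."""
    groups: Set[str] = set()
    for joined in per_router_groups.values():
        groups.update(joined)
    return groups


def subscriptions_for_border_routers(
    per_router_groups: Dict[str, Iterable[str]],
    border_routers: Iterable[str],
) -> Dict[str, Set[str]]:
    """What the elected router would ask each border router to join.

    Every border router gets the same set, since it is not known in
    advance which WAN will deliver a given group.
    """
    wanted = home_wide_membership(per_router_groups)
    return {br: set(wanted) for br in border_routers}
```

Because every WAN receives the full set, two ISPs carrying the same (source, group) would both deliver it, which is the duplication discussed elsewhere in this thread.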
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
On 19 August 2015 at 12:37, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>> I am pleased to announce the public release of pimbd, the PIM implementation that was demonstrated during the last Bits and Bites in Prague.
>
> I've now looked at it, and it's looking good to me. It's roughly the size of babeld (10kloc), the code looks reasonably clean, and there appears to be some amount of OpenWRT integration. I've been able to compile it under Debian (libubox, grumble grumble), I've been able to get it to join IPv6 multicast groups, but I've been unable to get it to actually route multicast. Probably something wrong on my side, I'll wait until it migrates into OpenWRT.

Thank you for this feedback. We can try to trouble-shoot the problem if you want. But I must admit that I learned how painful Linux can be with multicast forwarding sometimes.

> A few questions. Does SSMBIDIR interoperate with plain BIDIR? What happens when there are both BIDIR and SSMBIDIR routers in the Homenet?
>
> Could you please explain what problem you're solving with the SSMBIDIR extension? I'm not a multicast specialist, so please correct me if I'm wrong, but I understand that BIDIR: (1) doesn't optimise SSM trees; (2) wants a well-defined default route. Which problem exactly are you trying to solve with SSMBIDIR? Both? If just (1), might we as well go with plain BIDIR?

SSBIDIR is not very different than BIDIR. It still uses one single forwarding tree, relies on designated forwarder election, all the traffic is sent to the RPA, ... The only difference is really an optimization that filters downstream traffic. If you have part of your network which wants (S1,G), and another that wants (S2,G), you will be able to filter that out such that you don’t send (S1,G) to the part of the network that does not want it. And you don’t send (S2,G) to the part of the network that only asked for (S1,G) (with the exception of the upstream path, as traffic is always sent to the RPA).

As for backward compatibility with PIM-BIDIR, it works fine. SSBIDIR routers will detect the presence of a BIDIR-only router on the upstream interface.

If a BIDIR-only router is detected upstream but is not the designated router:
- You can still send (S,G) and (*,G) Join/Prunes
- You stop using (S,G,rpt) Join/Prunes

If a BIDIR-only router is detected upstream and *is* the designated router:
- You use (*,G) Join/Prunes only

This logic is implemented and, afaik, works. SSBIDIR is disabled by default. You can enable it with 'pimbc link ...'

> What problem does the proxying business attempt to solve? And what does it use TCP for?
>
> To compile this under Debian:
>
>     apt-get install liblua5.1-0-dev
>     git clone http://git.openwrt.org/project/libubox.git
>     (cd libubox
>      make
>      sudo make install)
>     sudo ldconfig /usr/local/lib
>     git clone https://github.com/Oryon/pimbd
>     (cd pimbd
>      make
>      sudo make install)

Thank you! I will add that to the HowTo. And may create a script at some point.

- Pierre
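
The backward-compatibility rules listed in this message can be written down as a small decision function. This is a sketch of the rules as stated in the email, not pimbd's actual code; the function and constant names are hypothetical, and the first branch (no BIDIR-only router upstream, so all three message types are usable) is an assumption that the email implies but does not spell out.

```python
from typing import FrozenSet

STAR_G = "(*,G)"
S_G = "(S,G)"
S_G_RPT = "(S,G,rpt)"


def join_prune_types(bidir_only_upstream: bool,
                     bidir_only_is_designated: bool) -> FrozenSet[str]:
    """Join/Prune types an SSBIDIR router may send on its upstream interface."""
    if not bidir_only_upstream:
        # Assumption: with only SSBIDIR routers upstream, nothing is restricted.
        return frozenset({STAR_G, S_G, S_G_RPT})
    if bidir_only_is_designated:
        # A BIDIR-only designated router only understands (*,G) state.
        return frozenset({STAR_G})
    # BIDIR-only router present but not designated: drop (S,G,rpt) only.
    return frozenset({STAR_G, S_G})
```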
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
On 19.08.2015 at 13:31, Juliusz Chroboczek wrote:

> What problem does the proxying business attempt to solve? And what does it use TCP for? And what about that?

You need to tell the ISPs (via IGMP / MLD joins) which (global) MC groups someone in the net is interested in, so you need to replicate each (globally-scoped) join in the network on all your edge routers, or at least the sum of all your joins.
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
> The assumption is that the user will want to receive traffic from the ISP. To do so, it needs to subscribe first (e.g. using MLD) on one of the WANs. One problem is that you don’t know which WAN. The solution used here is to subscribe on all WANs.

Sorry if I'm being dense -- but does that mean that if multiple ISPs route multicast, you get duplication of traffic?

-- Juliusz
Re: [homenet] [pim] New Open-Source PIM BIDIR (and more) implementation
> If a bidir-tree is required in the home network, I think an IGP based bidir-tree solution will be simpler than bidir-PIM, as it uses a unified protocol for both unicast and multicast. Also the home network scale is not so large, so the multicast group membership flooding through the IGP protocol will not be an issue. The plug and play of an IGP based multicast solution also will be better than the combo of unicast IGP + bidir-PIM. So I think the home network may be the applicable scenario for an IGP based multicast solution.

What kind of protocols are you suggesting here and are they implemented anywhere?
Re: [homenet] [babel] Experiences implementing Babel in the Bird routing daemon
On 19.8.2015, at 23.32, Markus Stenberg markus.stenb...@iki.fi wrote:

> On 19.8.2015, at 23.26, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:
>
>> If anybody knows how to write a test suite for a routing protocol, I'm interested. I imagine a set of scripts that set up some virtual machines and perform some tests, but I have trouble imagining how it could perform a test such as the one described above.
>
> You don’t really need bunch of VMs. In order of decreasing hackiness:
>
> [1] you need to only steal i/o (hello, LD_PRELOAD) and play with the black box that talks network protocols.
> [2] Or alternatively, do some root level raw packet i/o on the interface(s) the daemon uses.
> [3] Or provide a box (/VM) which pretends to be whatever else is on the network and changing topology and whatever, which deals with the device/daemon/… (of course, you’re screwed if the protocol requires L2 events, but some boxes can simulate this too)
>
> It is not really rocket science (I used to do test automation software for a living at some point), but very seldom really done comprehensively unless there is serious industrial interest.

Note: everything I talk about applies to testing ‘a protocol’; there isn’t really anything particularly magic about testing a routing protocol.

And in case this was not obvious from the context, you _do not_ typically use a full implementation as the reference to test against, but instead e.g. write the tests based on particular test cases that are obviously derived from the spec, and test each implementation against each of those individually. You may need a subset of the implementation (or even the full one) to really pretend to be one, but usually limiting the complexity of tests (= having a small subset of the full protocol used per test case) is sensible.

Usually you need
- a way to reset the implementation system under test (SUT) to a specific state (e.g. ‘reset’)
- a way to inspect SUT state which is not visible externally (in case of a routing protocol, probably locally installed routes)
- a way to do the I/O with the SUT (in case of a RP, probably a socket, wrapped in one of the few ways noted above)

Standard disclaimer: It’s been years since I did any other tests than hnetd ones (or unit tests for my own hobby things) :)

Cheers,
-Markus
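
The three requirements listed in this message (reset, inspect hidden state, do the I/O) map naturally onto a small adapter interface. The sketch below is only one way to picture it and is not taken from any existing test suite: every class and function name is hypothetical, and ToySUT is a stand-in that merely records what it is sent so that the example runs. A real adapter would wrap a daemon through one of the I/O approaches described above.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class SystemUnderTest(ABC):
    """The three hooks a per-test-case harness needs from an implementation."""

    @abstractmethod
    def reset(self) -> None:
        """Bring the SUT back to a known initial state."""

    @abstractmethod
    def installed_routes(self) -> Set[str]:
        """Expose state that is not visible on the wire."""

    @abstractmethod
    def send(self, packet: bytes) -> None:
        """Inject one packet as if it arrived from the network."""


class ToySUT(SystemUnderTest):
    """Stand-in only: treats each packet as a UTF-8 prefix and 'installs' it."""

    def __init__(self) -> None:
        self._routes: Set[str] = set()

    def reset(self) -> None:
        self._routes.clear()

    def installed_routes(self) -> Set[str]:
        return set(self._routes)

    def send(self, packet: bytes) -> None:
        self._routes.add(packet.decode("utf-8"))


def run_case(sut: SystemUnderTest, stimulus: List[bytes], expected: Set[str]) -> bool:
    """Run one small, spec-derived test case against a freshly reset SUT."""
    sut.reset()
    for packet in stimulus:
        sut.send(packet)
    return sut.installed_routes() == expected
```

Keeping each case to a reset, a short stimulus and one expected state follows the advice above about limiting how much of the protocol a single test exercises.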
[homenet] Getting new HNCP TLV types
Why are HNCP codepoints specified as standards action? It's a 16-bit space, wouldn't documentation required be good enough? Or even FCFS?

-- Juliusz
Re: [homenet] [babel] Experiences implementing Babel in the Bird routing daemon
On Wed, Aug 19, 2015 at 9:26 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>> Over the last couple of weeks, I've amused myself with doing a clean-slate implementation of the Babel protocol in the Bird routing daemon
>
> Excellent news, Toke. I've had a first read over your code, and it looks almost correct (I have some minor nits). I'll read it again, and do a detailed review with stupid questions about the bits I don't understand.
>
> For the record, while Toke and I are friends, this is a completely independent implementation of the IPv6 subset of RFC 6126 together with Appendices A and B (I once looked over Toke's shoulder when he was hacking at it, and he quickly shooed me away).
>
>> The main thing that I found confusing in the text was the mention of 'id' in section 3.5; took me a while to realise that this was supposed to be the router id.
>
> Noted, thanks.
>
>> The second thing I would have liked to have available is some more guidance on how to ensure an implementation is actually compliant to the RFC. I.e. a test suite, or at least some description of what kind of edge cases to test (tricky topologies, that sort of thing).
>
> Point taken. I may be biased, but in my experience the only tricky bit in the protocol is reacting to starvation. Every time I touch this code, I put a router in the middle of the network then increase the cost to all neighbours, and check that seqno requests behave according to spec. If they don't, you'll notice right away -- either there'll be a request storm, or your routes will remain unreachable for a long time.
>
> If anybody knows how to write a test suite for a routing protocol, I'm interested. I imagine a set of scripts that set up some virtual machines and perform some tests, but I have trouble imagining how it could perform a test such as the one described above.
>
>> The main part of the implementation took about a week, with another week to fix bugs and convince myself that it actually works as intended;
>
> Impressive. I'll dust some old laptops, and we'll do some more serious testing when you come to Paris.

I can bring over the same 7 routers I had at battlemesh.

> -- Juliusz
> ___ babel mailing list ba...@ietf.org https://www.ietf.org/mailman/listinfo/babel

--
Dave Täht
worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat
And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
Re: [homenet] [babel] Experiences implementing Babel in the Bird routing daemon
On 19.8.2015, at 23.26, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

> If anybody knows how to write a test suite for a routing protocol, I'm interested. I imagine a set of scripts that set up some virtual machines and perform some tests, but I have trouble imagining how it could perform a test such as the one described above.

You don’t really need bunch of VMs. In order of decreasing hackiness:

[1] you need to only steal i/o (hello, LD_PRELOAD) and play with the black box that talks network protocols.
[2] Or alternatively, do some root level raw packet i/o on the interface(s) the daemon uses.
[3] Or provide a box (/VM) which pretends to be whatever else is on the network and changing topology and whatever, which deals with the device/daemon/… (of course, you’re screwed if the protocol requires L2 events, but some boxes can simulate this too)

It is not really rocket science (I used to do test automation software for a living at some point), but very seldom really done comprehensively unless there is serious industrial interest.

Cheers,
-Markus
Re: [homenet] Experiences implementing Babel in the Bird routing daemon
> Over the last couple of weeks, I've amused myself with doing a clean-slate implementation of the Babel protocol in the Bird routing daemon

Excellent news, Toke. I've had a first read over your code, and it looks almost correct (I have some minor nits). I'll read it again, and do a detailed review with stupid questions about the bits I don't understand.

For the record, while Toke and I are friends, this is a completely independent implementation of the IPv6 subset of RFC 6126 together with Appendices A and B (I once looked over Toke's shoulder when he was hacking at it, and he quickly shooed me away).

> The main thing that I found confusing in the text was the mention of 'id' in section 3.5; took me a while to realise that this was supposed to be the router id.

Noted, thanks.

> The second thing I would have liked to have available is some more guidance on how to ensure an implementation is actually compliant to the RFC. I.e. a test suite, or at least some description of what kind of edge cases to test (tricky topologies, that sort of thing).

Point taken. I may be biased, but in my experience the only tricky bit in the protocol is reacting to starvation. Every time I touch this code, I put a router in the middle of the network then increase the cost to all neighbours, and check that seqno requests behave according to spec. If they don't, you'll notice right away -- either there'll be a request storm, or your routes will remain unreachable for a long time.

If anybody knows how to write a test suite for a routing protocol, I'm interested. I imagine a set of scripts that set up some virtual machines and perform some tests, but I have trouble imagining how it could perform a test such as the one described above.

> The main part of the implementation took about a week, with another week to fix bugs and convince myself that it actually works as intended;

Impressive. I'll dust some old laptops, and we'll do some more serious testing when you come to Paris.

-- Juliusz
Re: [homenet] Getting new HNCP TLV types
On 20/08/2015 08:02, Juliusz Chroboczek wrote:

> Why are HNCP codepoints specified as standards action? It's a 16-bit space, wouldn't documentation required be good enough? Or even FCFS?

With my RFC6709 hat on, I would advocate a fairly strict policy for extending something that walks and quacks like a routing protocol. Some sort of review seems advisable. In RFC5226 terms, I'd go for Expert Review at least.

Brian
Re: [homenet] [babel] Experiences implementing Babel in the Bird routing daemon
Juliusz Chroboczek j...@pps.univ-paris-diderot.fr writes:

>> Over the last couple of weeks, I've amused myself with doing a clean-slate implementation of the Babel protocol in the Bird routing daemon
>
> Excellent news, Toke. I've had a first read over your code, and it looks almost correct (I have some minor nits). I'll read it again, and do a detailed review with stupid questions about the bits I don't understand.

Thanks. I'll look forward to your comments :)

> For the record, while Toke and I are friends, this is a completely independent implementation of the IPv6 subset of RFC 6126 together with Appendices A and B (I once looked over Toke's shoulder when he was hacking at it, and he quickly shooed me away).

Yes, can confirm. Juliusz has likewise been most insistent on not giving any hints. Most annoying, making me read like that...

> I may be biased, but in my experience the only tricky bit in the protocol is reacting to starvation. Every time I touch this code, I put a router in the middle of the network then increase the cost to all neighbours, and check that seqno requests behave according to spec. If they don't, you'll notice right away -- either there'll be a request storm, or your routes will remain unreachable for a long time.

Well, basically taking the above paragraph and putting it in an appendix as things to test for would be useful. I.e. some list of "make sure these scenarios work and you should be set". For the packet format, I targeted having wireshark agree with me on the contents of the packets. That was fairly straightforward.

> If anybody knows how to write a test suite for a routing protocol, I'm interested. I imagine a set of scripts that set up some virtual machines and perform some tests, but I have trouble imagining how it could perform a test such as the one described above.

The CORE emulator might be useful in this regard: http://www.nrl.navy.mil/itd/ncs/products/core

> Impressive. I'll dust some old laptops, and we'll do some more serious testing when you come to Paris.

It'll be instructive, annoying and fun, I expect :)

-Toke
Re: [homenet] Getting new HNCP TLV types
>> Why are HNCP codepoints specified as standards action? It's a 16-bit space, wouldn't documentation required be good enough? Or even FCFS?
>
> With my RFC6709 hat on, I would advocate a fairly strict policy for extending something that walks and quacks like a routing protocol.

It's not really a routing protocol -- it only installs blackhole routes and routes to directly connected prefixes, so it's unlikely to cause routing pathologies. And it's supposed to distribute random configuration information.

I'll remark, while I'm at it, that the only place DHCP data can be advertised is in the EXTERNAL-CONNECTION TLV, so there's currently no way to announce e.g. "I'm a SIP proxy server" without announcing an external connection. (Everyone relax -- I'm not implementing SIP in shncpd.)

> Some sort of review seems advisable. In RFC5226 terms, I'd go for Expert Review at least.

That would be fine with me.

-- Juliusz
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
> Both of these techniques makes use of a metric in order to decide which one is the ‘best’ router to forward some packet.

Perfectly clear. The unicast routing protocol needs to communicate the metric to the PIM daemon, and the kernel priority is a convenient place to put it.

Thanks, Pierre.

-- Juliusz
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
> I am pleased to announce the public release of pimbd, the PIM implementation that was demonstrated during the last Bits and Bites in Prague.

I've now looked at it, and it's looking good to me. It's roughly the size of babeld (10kloc), the code looks reasonably clean, and there appears to be some amount of OpenWRT integration. I've been able to compile it under Debian (libubox, grumble grumble), I've been able to get it to join IPv6 multicast groups, but I've been unable to get it to actually route multicast. Probably something wrong on my side, I'll wait until it migrates into OpenWRT.

A few questions. Does SSMBIDIR interoperate with plain BIDIR? What happens when there are both BIDIR and SSMBIDIR routers in the Homenet?

Could you please explain what problem you're solving with the SSMBIDIR extension? I'm not a multicast specialist, so please correct me if I'm wrong, but I understand that BIDIR: (1) doesn't optimise SSM trees; (2) wants a well-defined default route. Which problem exactly are you trying to solve with SSMBIDIR? Both? If just (1), might we as well go with plain BIDIR?

What problem does the proxying business attempt to solve? And what does it use TCP for?

To compile this under Debian:

    apt-get install liblua5.1-0-dev
    git clone http://git.openwrt.org/project/libubox.git
    (cd libubox
     make
     sudo make install)
    sudo ldconfig /usr/local/lib
    git clone https://github.com/Oryon/pimbd
    (cd pimbd
     make
     sudo make install)

Careful, libubox compilation fails silently if liblua is not version 5.1.

-- Juliusz
Re: [homenet] New Open-Source PIM BIDIR (and more) implementation
>> So this is to choose between identical routes. Why is this needed?
>
> I have no idea. You'll have to ask Pierre. (And I'd appreciate an explanation myself.)

*clearing throat* :)

Here is my humble understanding as a multicast non-expert.

PIM makes extensive use of RPF (Reverse Path Forwarding) to build the multicast forwarding tree. RPF can be done with the routing table alone, with no metric. You just need to know the ‘upstream’ interface to either an RP address, or a source address.

The issue comes from the fact that multicast routing is not about sending a packet to a next hop, but to a next link. This situation implies that you may have multiple routers that, according to the RPF and PIM Join/Prune, could be candidates for forwarding a packet to/from some link. When such a situation occurs, an election mechanism is used to solve it.
- In PIM-SM, it is done reactively using Asserts
- In PIM-BIDIR, it is done proactively using the DF election mechanism.

Both of these techniques make use of a metric in order to decide which one is the ‘best’ router to forward some packet.

I do not know exactly what the consequences are for PIM-SM if you don’t have these metrics (Asserts are won randomly). But I think that in PIM-BIDIR, you can end up with stable routing loops.

- Pierre
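
To make the role of the metric concrete, here is a minimal sketch of a metric-based election among candidate forwarders on one link. It is an illustration, not pimbd's code and not a full Assert or DF-election state machine: the Candidate type and function name are hypothetical, and the tie-break on the highest address is an assumption modelled on the usual PIM comparison order (preference, then metric, then address).

```python
import ipaddress
from typing import Iterable, NamedTuple


class Candidate(NamedTuple):
    address: str            # router's address on the shared link
    metric_preference: int  # preference of the unicast protocol (lower is better)
    metric: int             # unicast metric towards the RPA or source (lower is better)


def elect_forwarder(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the single router that should forward onto the link."""
    return min(
        candidates,
        key=lambda c: (
            c.metric_preference,
            c.metric,
            # Assumed tie-break: highest address wins, hence the negation.
            -int(ipaddress.ip_address(c.address)),
        ),
    )
```

If the unicast routing protocol does not pass real metrics to the PIM daemon, every candidate compares as equal on the first two fields and the outcome no longer reflects which router is actually closer to the RPA, which is the situation the message above warns about.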