Good evening everyone,
thanks for following up, and also for adding cool new pictures to the
Jool website - it's cool to see some movement - AND to that guy who
created an Alpine package: I owe you a drink of choice!

So many things happening... Today I started to look into joold, because
we are considering running 2 active routers with the same available IPv4
addresses.

Regarding the UDP loss, Stefan Brudny (see mail below) actually pointed
me to an open bug in iperf that I wasn't aware of either. So this might
all be a false positive, for which I'm very sorry if it cost anyone
time!

Best regards and many motivated greetings from the mountains,

Nico

p.s.: From a performance point of view, and remembering how P4 lets you
modify packets, I think jool should be able to handle native forwarding
/ line speed, as the actual modifications are very small and the
information required even fits into every L1 cache. Minus the obvious OS
overhead.

--------------------------------------------------------------------------------
From: Stefan Brudny <[email protected]>
To: Nico Schottelius <[email protected]>
Subject: Re: [Jool-list] Moving to jool
Flags: replied, seen
Date: Thu 07 Nov 2019 11:11:37 PM CET
Maildir: /ungleich/2019

Gents,

A blind shot regarding the packet loss: I was experiencing some extreme
UDP packet loss in Azure, not related to any nat64, different service.
I was using iperf. It turned out that iperf has a bug, and sometimes in
some environments and configurations it misbehaves.

https://github.com/esnet/iperf/issues/296

I used nttcp and UDP packet loss dropped from 90% to 0.1%.

BTW, I used jool for a PoC to find a solution for pfsense, and it works
perfectly. Heads up too.

SB
--------------------------------------------------------------------------------

Alberto Leiva <[email protected]> writes:

> Ok, I was able to replicate 8 Gbit/sec by using virtualization (since
> my physical hardware cannot keep up at all).
> I can confirm that
>
> - according to top, the NAT64 machine refuses to exceed 100% CPU utilization
>   (which allegedly signifies that only one CPU is being used), and
> - according to /proc/interrupts, most traffic that shares an incoming
>   interface also shares CPU:
>
>         $ cat /proc/interrupts
>                    CPU0       CPU1
>                    3551     112239   enp0s8
>                 4825321         49   enp0s3
>         (Output trimmed to only relevant rows and columns)
>
> I don't know when this started happening, but considering that
> performance is (in my experience) most people's main concern, I do
> think this is a problem that needs immediate attention.
>
> I don't think this is a Jool bug; it's simply the way the kernel is
> configured to handle interrupts by default. However, it's certainly
> worth a note in the documentation, to ease the solution for people who
> need to squeeze as much performance out of their translator as
> possible. I just hope it doesn't require a custom kernel...
>
> I will try to figure this out and should come back in a few days with
> more information.
>
> ------------------------------------------
>
> I still haven't figured out what's with the "Datagram Lost" column.
> Sometimes iperf's output is quite nonsensical; I have seen it report
> literally 100% datagram lost rate and yet the reported "speed" is 8
> Gbit/sec. I don't understand what's up with this. Maybe it's a
> checksum problem (ie. the packets arrive but the checksum is incorrect
> so iperf reports them as arrived and lost at the same time), but then
> it's strange that I can't identify any artifacts in video streams.
> This needs to be investigated further.
>
> Working...
>
> On Thu, Nov 7, 2019 at 12:40 PM Nico Schottelius
> <[email protected]> wrote:
>>
>>
>> Good evening Jordi, Alberto,
>>
>>
>> JORDI PALET MARTINEZ <[email protected]> writes:
>>
>> > Hi Nico,
>> >
>> > I have read your complete document when you sent it to the list, and I
>> > want to thank you for it.
>> >
>> > I'm a frequent user of Jool, and teach about it to the community and
>> > customers.
>>
>> Very nice!
>>
>> > I was also surprised about your UDP failures, I've never seen that before,
>> > so as you just said, it may be due to your specific configuration. I
>> > recall having tested Jool the first time in Ubuntu 16.x, but I often try
>> > to upgrade the kernel to the latest available release, etc.
>> >
>> > In fact, I usually check and adjust CPU affinity myself (I even do that on
>> > my OpenWRT routers!).
>> >
>> > One suggestion, in case you can invest a bit of extra time on this, so as
>> > to make your work more comprehensive, would be to also test using VPP:
>> >
>> > https://docs.fd.io/vpp/17.07/nat64_doc.html
>>
>> Interesting! I have added it to my backlog, I wasn't aware of nat64 in
>> vpp!
>>
>> > I will actually say, if you allow me, "forget Tayga", it doesn't
>> > scale, is no longer maintained, and Jool and VPP are much better
>> > targets to focus on!
>>
>> I assumed so. However, there is one really, really big advantage of
>> tayga: it is included in every distribution. This was actually the
>> reason why we chose tayga in 2017 for datacenterlight.ch.
>>
>> Now that we hit cpu limitations we are more willing to manually maintain
>> it and it is somewhat "ok", because we only have 6 routers. I'm actually
>> considering spending some of our resources to package jool for Alpine
>> Linux, which is our target os for the new router generation.
>>
>> Either way, I have to thank you guys, you did a quite impressive job
>> with jool!
>>
>> Best regards from Switzerland,
>>
>> Nico
>>
>>
>> --
>> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
_______________________________________________
Jool-list mailing list
[email protected]
https://mail-lists.nic.mx/listas/listinfo/jool-list
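
Addendum: for anyone who wants to repeat the /proc/interrupts check from
Alberto's quoted message on their own translator, here is a minimal Python
sketch. It only reads the per-CPU counters and reports which CPU handles
most interrupts for each source; it does not change any affinity settings.
The function names (parse_interrupts, dominant_cpu) and the SAMPLE constant
are made up for illustration, and SAMPLE is just the trimmed output quoted
in the thread, so treat this as a rough helper rather than a finished tool.

```python
"""Sketch: see which CPU services most interrupts for each interface."""


def parse_interrupts(text):
    """Parse /proc/interrupts-style text.

    Returns (cpu_labels, {name: [count_per_cpu, ...]}), where name is the
    last column of each row (usually the device or interface name).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    ncpu = len(cpus)
    table = {}
    for ln in lines[1:]:
        fields = ln.split()
        # Real rows start with an IRQ label such as "24:"; the trimmed
        # sample from the mail has no such label, so it is optional here.
        if fields[0].endswith(":"):
            fields = fields[1:]
        counts = fields[:ncpu]
        # Skip rows without one counter per CPU plus a trailing name
        # (for example the ERR and MIS rows).
        if len(fields) <= ncpu or not all(c.isdigit() for c in counts):
            continue
        table[fields[-1]] = [int(c) for c in counts]
    return cpus, table


def dominant_cpu(cpus, counts):
    """Return (cpu_label, share) for the CPU with the most interrupts."""
    total = sum(counts)
    if total == 0:
        return None, 0.0
    idx = max(range(len(counts)), key=counts.__getitem__)
    return cpus[idx], counts[idx] / total


# Trimmed output as quoted in the thread (not a full /proc/interrupts dump).
SAMPLE = """\
           CPU0       CPU1
           3551     112239   enp0s8
        4825321         49   enp0s3
"""

if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    cpus, table = parse_interrupts(text)
    for name, counts in sorted(table.items()):
        cpu, share = dominant_cpu(cpus, counts)
        if cpu is not None:
            print(f"{name}: {share:.1%} of interrupts on {cpu}")
```

If one CPU takes nearly all interrupts for an interface, as in the quoted
sample, that matches the single-CPU bottleneck described in the thread.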
