On Sun, Apr 12, 2020 at 09:53, David Gwynne <[email protected]> wrote:
> On Fri, Jul 05, 2019 at 03:51:31AM +0000, Adam Steen wrote:
>> >Synopsis:	Packet loss / ENOBUFS with kqueue(2) and tap(4)
>> >Category:	bug
>> >Environment:
>> 	System      : OpenBSD 6.5
>> 	Details     : OpenBSD 6.5-current (GENERIC.MP) #123: Sat Jun 29 19:39:46 AWST 2019
>> 			 [email protected]:/sys/arch/amd64/compile/GENERIC.MP
>>
>> 	Architecture: OpenBSD.amd64
>> 	Machine     : amd64
>> >Description:
>> 	In Solo5 we have been working towards supporting multiple network
>> 	interfaces, and have implemented this using kqueue(2) and tap(4).
>>
>> 	This involves setting up two tap interfaces and starting up the program.
>> 	In another session, flood ping the first tap interface;
>> 	Solo5 handles this with no packets dropped.
>> 	In another session, ping the second tap interface; then for every
>> 	ping to the second interface a packet is dropped on the first. If you
>> 	switch to a flood ping on the second tap interface, you will observe
>> 	massive packet loss on both interfaces, and ping complaining about
>> 	No buffer space available (ENOBUFS).
>>
>> 	See https://github.com/Solo5/solo5/issues/374 for more information.
>>
>> >How-To-Repeat:
>> 	I have been able to reproduce this in a hacked-up example program,
>> 	available here: https://github.com/adamsteen/test_net_2if. Please note
>> 	this is a hacked, generally butchered program, which demonstrates the
>> 	problem. (If required I can try and clean up this test case.)
>>
>> 	01. git clone https://github.com/adamsteen/test_net_2if
>> 	02. cd test_net_2if
>> 	03. make
>> 	04. doas setup.sh (set up the tap interfaces)
>> 	05. doas ./test_net_2if
>> 	06. In another session, start a flood ping
>> 		doas ping -f 10.0.0.2
>> 	07. Observe that the flood ping is functioning correctly,
>> 		with no packets dropped.
>> 	08. In another session, start a normal ping
>> 		ping 10.1.0.2
>> 	09. Observe that, for each ping sent to service1, a packet is dropped.
>> 	10. Kill the normal ping
>> 	11. Start a flood ping
>> 		doas ping -f 10.1.0.2
>> 	12. Observe massive packet loss on both interfaces, and ping
>> 		complaining about No buffer space available (ENOBUFS).
>> >Fix:
>> 	Not known.
>
> Hi Adam,
>
> claudio@ and I looked at this during a2k20, and came to the conclusion
> that the packet loss occurred because an interface queue filled up
> and it was shedding load. It was annoyingly easy to get to that point
> though.
>
> We also spent a lot of time massaging the tun/tap code to try and unify
> the semantics of tun and tap going through the network stack, and in
> particular tried to avoid queuing packets until we finally get to the
> output side of the stack.
>
> I'm not saying we've fixed this problem for you, but hopefully we've
> mitigated it a bit. Could you try again and let us know if you see any
> difference? If there's no difference, could you tweak your test to loop
> on the read() of the /dev/tap entry until it gets back EWOULDBLOCK or
> whatever the errno is that means there's no packet to read right now?
>
> Cheers,
> dlg

Hi dlg,

I will definitely have a look, and I will give the “EWOULDBLOCK” errno
idea a go!

Will probably get back to you Tuesday sometime!

Cheers,
Adam
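
For reference, the read() loop dlg suggests above might look roughly like
the sketch below. This is not the code from test_net_2if or Solo5: the
names set_nonblocking(), drain_fd() and MAX_FRAME are made up for
illustration, and it assumes the tap descriptor is in non-blocking mode
and hands back one packet per read().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: big enough for one frame; adjust to the interface MTU. */
#define MAX_FRAME 2048

/* Put fd into non-blocking mode so an empty queue yields EAGAIN
 * (EWOULDBLOCK) instead of putting the caller to sleep. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: keep calling read() until the kernel reports
 * that nothing is left to read right now. Each successful read is
 * passed to handle() (may be NULL). Returns the number of successful
 * reads, or -1 on an unexpected error. */
int drain_fd(int fd, void (*handle)(const unsigned char *, size_t))
{
    unsigned char buf[MAX_FRAME];
    int frames = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            if (handle != NULL)
                handle(buf, (size_t)n);
            frames++;
            continue;               /* there may be more queued */
        }
        if (n == 0)
            return frames;          /* end of file */
        if (errno == EINTR)
            continue;               /* interrupted, just retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return frames;          /* queue is empty for now */
        return -1;                  /* real error, errno is set */
    }
}
```

The idea would be to call drain_fd() each time kevent(2) reports the tap
descriptor as readable, rather than doing a single read() per event, so
that packets queued between wakeups are all picked up before going back
to sleep.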
