Hi all,

A silly question: is it possible for a ~5Mbps constant stream of NTP client traffic to an OpenBSD router over an ADSLv2 link to cause said OpenBSD router to fail to communicate with its internal network?

This will take some explaining, so please bear with me. :-)

A few weeks back I was debugging an issue with an old Advantech UNO-1150G, which uses Realtek Ethernet interfaces and was dropping off the network randomly. After some experiments, there was a strong suggestion that it was hardware, and an APU2 was purchased to replace the Advantech box.

I installed OpenBSD 6.3 on that new box¹, set it up with the same configuration settings as before, and away it went. Same cron jobs, so I got an email if, for a period, the APU2 lost contact with the internal network.

The problem continued:

> Sun May 20 04:13:53 AEST 2018
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:0d:b9:4a:f9:f8
>         index 1 priority 0 llprio 3
>         media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
>         status: active
>         inet 172.31.249.254 netmask 0xffffff00 broadcast 172.31.249.255
>         inet6 fe80::757c:2f2f:fa68:93c%em0 prefixlen 64 scopeid 0x1
>         inet6 2001:44b8:21ac:70f9::fe prefixlen 64
> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
> em0     1500  <Link>      00:0d:b9:4a:f9:f8  2261254     0  3642705     0     0
> em0     1500  172.31.249/ 172.31.249.254     2261254     0  3642705     0     0
> em0     1500  fe80::%em0/ fe80::757c:2f2f:f  2261254     0  3642705     0     0
> em0     1500  2001:44b8:2 2001:44b8:21ac:70  2261254     0  3642705     0     0
> Sun May 20 04:14:53 AEST 2018
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:0d:b9:4a:f9:f8
>         index 1 priority 0 llprio 3
>         media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
>         status: active
>         inet 172.31.249.254 netmask 0xffffff00 broadcast 172.31.249.255
>         inet6 fe80::757c:2f2f:fa68:93c%em0 prefixlen 64 scopeid 0x1
>         inet6 2001:44b8:21ac:70f9::fe prefixlen 64
> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
> em0     1500  <Link>      00:0d:b9:4a:f9:f8  2262020     0  3643633     0     0
> em0     1500  172.31.249/ 172.31.249.254     2262020     0  3643633     0     0
> em0     1500  fe80::%em0/ fe80::757c:2f2f:f  2262020     0  3643633     0     0
> em0     1500  2001:44b8:2 2001:44b8:21ac:70  2262020     0  3643633     0     0
> Sun May 20 04:15:53 AEST 2018
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:0d:b9:4a:f9:f8
>         index 1 priority 0 llprio 3
>         media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
>         status: active
>         inet 172.31.249.254 netmask 0xffffff00 broadcast 172.31.249.255
>         inet6 fe80::757c:2f2f:fa68:93c%em0 prefixlen 64 scopeid 0x1
>         inet6 2001:44b8:21ac:70f9::fe prefixlen 64
> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
> em0     1500  <Link>      00:0d:b9:4a:f9:f8  2262324     0  3643805     0     0
> em0     1500  172.31.249/ 172.31.249.254     2262324     0  3643805     0     0
> em0     1500  fe80::%em0/ fe80::757c:2f2f:f  2262324     0  3643805     0     0
> em0     1500  2001:44b8:2 2001:44b8:21ac:70  2262324     0  3643805     0     0

Now, no errors were being reported, and the APU2 was *more* stable than the Advantech box, but the problems still continued. One might think the blame lies with my hacked-up switch. I've got a replacement switch (a Netgear GS748T) which I will install in due course, but a curious thing happened this month.

The border router here at the time was a participant in the public NTP server pool (http://pool.ntp.org). This month, I got an email from my ISP to say I was over quota. Sure enough, I had sustained the equivalent of 50GB/day "downloads" over the course of a week. When I had a look with `tcpdump` on pppoe0, sure enough, it was NTP client request traffic.

https://community.ntppool.org/t/excessive-traffic-to-australian-servers/791 shows when this problem started this month. I'm expecting a whopper of an Internet bill this month.

I have since pulled my server out of the pool (did this on the 1st this month), and while I still have some 700-odd residual clients (scattered about the globe), the significant incoming traffic has ceased, and with it, so has my "drop-out" problem with the border router. The cron job I set up in that previous thread has not reported a single issue with the internal network. The APU2 has been *rock solid* since then.

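As a rough sanity check on those figures (a back-of-envelope sketch, not a measurement): the ~5Mbps rate and the 50GB/day quota figure should agree with each other, and the rate also implies an approximate number of NTP requests per second. The packet size below is an assumption: an unauthenticated NTP request over IPv4 (48-byte payload, 8-byte UDP header, 20-byte IP header), ignoring PPPoE and ATM framing overhead. The function names are made up for illustration.

```python
# Back-of-envelope check of the traffic figures quoted above.
# Assumptions (not measured): every packet is an unauthenticated NTP
# request over IPv4, and link-layer (PPPoE/ATM) overhead is ignored.

RATE_BPS = 5_000_000               # ~5 Mbps sustained, as quoted
NTP_IPV4_PACKET_BYTES = 48 + 8 + 20  # NTP payload + UDP + IPv4 headers
SECONDS_PER_DAY = 86_400


def daily_volume_gb(rate_bps):
    """Convert a sustained bit rate into decimal gigabytes per day."""
    return rate_bps / 8 * SECONDS_PER_DAY / 1e9


def requests_per_second(rate_bps, packet_bytes=NTP_IPV4_PACKET_BYTES):
    """Approximate request rate if all traffic is packets of this size."""
    return rate_bps / (packet_bytes * 8)


if __name__ == "__main__":
    print(f"{daily_volume_gb(RATE_BPS):.0f} GB/day")
    print(f"{requests_per_second(RATE_BPS):.0f} requests/s")
```

Under those assumptions this works out to about 54 GB/day, which is in line with the ISP's figure, and somewhere around 8,000 requests per second arriving at the router.
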
I've made modifications to my pf.conf since I pulled out of the pool, but if there's interest, I can boot up the old router and pull the pf.conf and ntpd.conf off it, since that's pretty much what I was using (with em0 substituted for rl0 in pf.conf).

My thinking, since the problem has disappeared, is that the sheer number of clients was overwhelming the router, and as a result, it didn't have enough buffer space to handle the number of separate hosts requesting the time from it.

It's highly likely this is some naïve mistake on my part in configuring pf.conf, and that with more appropriate rules, the problem would disappear.

Has anyone had similar issues, or can anyone think of ways to mitigate such problems?

Regards,
-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.

1. dmesg output is here: https://marc.info/?l=openbsd-misc&m=152672107525374&w=2

