On Wed, Feb 23, 2022 at 04:09:56AM -0500, Original Observer wrote: > This has happened to me a few times now.
That "This" is probably Subject: Infinite loop upon incoming DNS query when no upstream resolvers during cable modem reboot > Dnsmasq 2.86 gets into an infinite > loop and ceases all DNS and DHCP service when my cable modem reboots and > hands out a junk DHCP lease with no DNS resolvers. > > Some clues: Clues for what? > * /etc/resolv.conf is static and contains only "nameserver 127.0.0.1" so > programs running on the local host will resolve names via the local Dnsmasq > instance. > > * /tmp/resolv.conf is generated by the DHCP client script and usually > contains: > > domain hsd1.nh.comcast.net. > nameserver 75.75.75.75 > nameserver 75.75.76.76 > > * /etc/dnsmasq.conf contains: > > no-hosts > log-queries=extra > bogus-priv > resolv-file=/tmp/resolv.conf > domain-needed > cache-size=10000 > dhcp-range=192.168.1.1,192.168.1.254,255.255.254.0,720h > dhcp-range=::,constructor:br0,ra-only,ra-names > dhcp-option=vendor:MSFT,1,2i > dhcp-authoritative > dhcp-rapid-commit > log-dhcp > dhcp-script=/bin/logger > script-on-renewal > enable-ra > ra-param=br0,mtu:eth0,high,1800,9000 > domain=home.xxxxxx.com > local=/home.xxxxxx.com/ > cname=home.xxxxxx.com,yyyyyy.home.xxxxxx.com,86400 > address=/public.home.xxxxxx.com/ > interface-name=public.home.xxxxxx.com,eth0/4 > > * Dnsmasq is started by init as > "/sbin/dnsmasq --keep-in-foreground --user --interface=br0". > > * /tmp/resolv.conf becomes empty when the cable modem hands out a junk DHCP > lease with no DNS resolvers. > > * One of the last messages Dnsmasq logs before it hangs is "no servers found > in /tmp/resolv.conf, will retry". > > * The DHCP client script tries to synchronize the local clock with > pool.ntp.org upon every lease renewal. This triggers a DNS query, and as > /etc/resolv.conf says "nameserver 127.0.0.1", the query goes to Dnsmasq to > resolve. Dnsmasq logs the query but then logs *neither* a response to the > query *nor* an indication that the query has been forwarded anywhere. > Indeed, this appears to be the point when Dnsmasq goes into an infinite loop > and never comes back. > > * When this happens, Dnsmasq pegs the CPU and does not respond to SIGTERM. I > have to SIGKILL it. > > > Some key points from the log (which is included in full at the end of this > message): > > 1. <6>r8169 0000:02:00.0 eth0: Link is Up - 1Gbps/Full - flow control off > <30>dhcpcd[82]: eth0: carrier acquired > > This is the Ethernet link to the cable modem reacquiring carrier after > the cable modem reboots. Acknowledge > 2. <31>dhcpcd[82]: eth0: sending DISCOVER (xid 0xbd6b2cf7), next in 4.2 > seconds > <30>dhcpcd[82]: eth0: offered 192.168.100.10 from 192.168.100.1 > <31>dhcpcd[82]: eth0: sending REQUEST (xid 0xbd6b2cf7), next in 4.5 seconds > <31>dhcpcd[82]: eth0: acknowledged 192.168.100.10 from 192.168.100.1 > > This is the DHCP client obtaining a junk lease from the cable modem, > which is has not yet locked onto the DOCSIS network at this point. Ack > 3. <31>dhcpcd[82]: eth0: executing: /etc/dhcpcd.sh BOUND > <13>dhcp: Upstream routers: 192.168.100.1 > <13>dhcp: Domain name: (none) > <13>dhcp: DNS servers: (none) > <28>dnsmasq[14692]: no servers found in /tmp/resolv.conf, will retry > > This is the DHCP client script updating /tmp/resolv.conf (to become > empty) and Dnsmasq noticing the change, re-reading the file, and reporting > that there are no upstream servers. And that "no upstream servers" matches '<13>dhcp: DNS servers: (none)'. > 4. <13>dhcp: NTP servers: (none) > <30>dnsmasq[14692]: 15019 127.0.0.1/50561 query[A] pool.ntp.org from > 127.0.0.1 > > This is the DHCP client script reporting that the DHCP lease has > specified no NTP servers. At this point the script calls ntpclient with a > default NTP server of pool.ntp.org. You can see the DNS query arrive at > Dnsmasq from localhost. > > 5. <75>ntpclient[28632]: Failed resolving address to hostname pool.ntp.org: > No address associated with hostname > <75>ntpclient[28632]: Unable lookup pool.ntp.org > > This is ntpclient complaining that it was unable to resolve pool.ntp.org. > Notably, there was *no* log line from Dnsmasq stating, "config error is > REFUSED (EDE: network error)". Rather, it appears as though ntpclient simply > timed out while waiting for a reply. Yes, still no DNServer available > 6. <31>dhcpcd[82]: eth0: executing: /etc/dhcpcd.sh RENEW > <13>dhcp: NTP servers: (none) > <75>ntpclient[28638]: Failed resolving address to hostname pool.ntp.org: No > address associated with hostname > <75>ntpclient[28638]: Unable lookup pool.ntp.org > > This is the DHCP client renewing the junk lease, and the DHCP client Acknowledge on "junk lease" > script trying again with ntpclient, which still fails. Notably, this time > there is no DNS query logged by Dnsmasq. Indeed, Dnsmasq is out to lunch at > this point and will not be coming back. > > 7. <28>dhcpcd[82]: eth0: failed to renew DHCP, rebinding > <31>dhcpcd[82]: eth0: expire in 3 seconds > <31>dhcpcd[82]: eth0: sending REQUEST (xid 0xdfa6a085), next in 3.2 seconds > <28>dhcpcd[82]: eth0: NAK: from 96.113.192.209 > <31>dhcpcd[82]: eth0: deleting IP address 192.168.100.10/24 > <30>dhcpcd[82]: eth0: deleting route to 192.168.100.0/24 > <30>dhcpcd[82]: eth0: deleting default route via 192.168.100.1 > <31>dhcpcd[82]: eth0: executing: /etc/dhcpcd.sh NAK > <30>dhcpcd[82]: eth0: soliciting a DHCP lease (requesting 73.###.###.56) > <31>dhcpcd[82]: eth0: sending DISCOVER (xid 0x423103f5), next in 3.9 seconds > <30>dhcpcd[82]: eth0: offered 73.###.###.246 from 96.113.192.209 > <31>dhcpcd[82]: eth0: sending REQUEST (xid 0x423103f5), next in 4.0 seconds > <31>dhcpcd[82]: eth0: acknowledged 73.###.###.246 from 96.113.192.209 > > This is the DHCP client failing to renew the junk lease from the cable > modem and instead obtaining a real lease from Comcast. Acknowledge on "the DHCP client failing to renew". > 8. <31>dhcpcd[82]: eth0: executing: /etc/dhcpcd.sh BOUND > <13>dhcp: Upstream routers: 73.###.###.1 > <13>dhcp: Domain name: hsd1.nh.comcast.net. > <13>dhcp: DNS servers: 75.75.75.75 75.75.76.76 > <13>dhcp: NTP servers: (none) > <75>ntpclient[28674]: Failed resolving address to hostname pool.ntp.org: No > address associated with hostname > <75>ntpclient[28674]: Unable lookup pool.ntp.org > > This is the DHCP client script updating /tmp/resolv.conf with the real > domain name and name servers. Notably, Dnsmasq does not notice the change in > /tmp/resolv.conf since it is feverishly engaged in chasing its own tail. Please elaborate "its own tail" > Further, you can see that ntpclient still failed to resolve pool.ntp.org > even though the Internet connection is now up and /tmp/resolv.conf now > specifies working upstream name servers. > > Yes, that matches with what was already reported. Regards Other Observer Who couldn't resist to ignore the original posting -- Silence is hard to parse _______________________________________________ Dnsmasq-discuss mailing list Dnsmasq-discuss@lists.thekelleys.org.uk https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss