Re: [OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hi,

On Tue, Sep 25, 2018 at 11:23 AM Jo-Philipp Wich wrote:
> Maybe netlink congestion or something related to privilege dropping? Can
> you manage to capture an strace log of the running dnsmasq instance
> while the network is getting restarted?

After some discussion on the dnsmasq mailing list, the cause has been found, and your theory about "resubscribe" is correct. When dnsmasq is configured to listen on one particular interface, it binds its socket to that interface (SO_BINDTODEVICE) during startup, provided the interface is available. On my devices, I had configured dnsmasq to listen only on br-lan. When I restart networking, br-lan is re-created. dnsmasq does not detect this event, and the socket silently stops receiving packets. If I configure dnsmasq to listen on two interfaces, SO_BINDTODEVICE is never used and the error disappears.

Why this error has appeared now is still unknown, but it could simply be that no one has gone looking for it.

My current workaround, based on the advice of Simon Kelley, is to have the function that looks up the interface to bind to (whichdevice()) return NULL. Filtering of received packets does not depend on the socket being bound to the device anyway.

BR,
Kristian

___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel
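The binding rule described in this message (bind to the device only when exactly one interface is configured) can be sketched roughly as follows. This is a simplified illustration in Python, not dnsmasq's C code: the names device_to_bind and device_to_bind_workaround and the list-of-names input are hypothetical, and the real whichdevice() may check further conditions.

```python
from typing import Optional, Sequence


def device_to_bind(interfaces: Sequence[str]) -> Optional[str]:
    """Return the interface name to use with SO_BINDTODEVICE, or None.

    Hypothetical helper mirroring the rule described in this thread:
    the socket is bound to a device only when exactly one distinct
    interface is configured.
    """
    distinct = set(interfaces)
    if len(distinct) == 1:
        return next(iter(distinct))
    return None


def device_to_bind_workaround(interfaces: Sequence[str]) -> Optional[str]:
    """The workaround from this thread: never bind to a device."""
    return None
```

With only br-lan configured, the first function returns "br-lan", which is the case where the bound socket goes quiet after br-lan is re-created. With two interfaces, or with the workaround, it returns None and no device binding takes place.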
Re: [OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hi,

On Wed, Sep 26, 2018 at 3:14 PM Kristian Evensen wrote:
> On Tue, Sep 25, 2018 at 11:23 AM Jo-Philipp Wich wrote:
> > does the same happen without "bind-dynamic" ? My hunch is that dnsmasq
> > fails to "resubscribe" to the socket after the ifindex of br-lan changed
> > due to the network restart (which will destroy and recreate br-lan).
>
> bind-dynamic seems indeed to be the trigger. Disabling bind-dynamic
> makes the problem go away.

I have been testing some more, and I can now trigger the problem locally as well. However, I don't quite know how I managed to trigger it. Still, I think I have discovered something. When I restart the network, there are two different outcomes for the dnsmasq process: it is either killed (by SIGTERM; I am still trying to identify the process that sends it) or not. The error always happens when the process is not killed. When dnsmasq dies and is restarted, DHCP continues to work as normal.

BR,
Kristian
Re: [OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hi Kristian,

does the same happen without "bind-dynamic"? My hunch is that dnsmasq fails to "resubscribe" to the socket after the ifindex of br-lan changed due to the network restart (which will destroy and recreate br-lan).

Maybe netlink congestion or something related to privilege dropping? Can you manage to capture an strace log of the running dnsmasq instance while the network is getting restarted?

~ Jo
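One way to check the ifindex hunch is to record the ifindex of br-lan before the network restart and compare it afterwards. A minimal sketch in Python, assuming Python is available on the device or on a test host: current_ifindex, ifindex_changed and the lookup parameter are hypothetical names introduced here, while socket.if_nametoindex is the standard library call.

```python
import socket
from typing import Callable, Optional


def current_ifindex(name: str,
                    lookup: Callable[[str], int] = socket.if_nametoindex
                    ) -> Optional[int]:
    """Return the kernel ifindex for `name`, or None if it does not exist."""
    try:
        return lookup(name)
    except OSError:
        # Raised when no interface with this name currently exists.
        return None


def ifindex_changed(before: Optional[int], after: Optional[int]) -> bool:
    """True if the interface was destroyed and/or re-created in between."""
    return before != after


# Intended use (not executed here):
#   before = current_ifindex("br-lan")
#   ... run /etc/init.d/network restart ...
#   after = current_ifindex("br-lan")
#   print(ifindex_changed(before, after))
```

If the index differs after the restart, any socket still tied to the old br-lan refers to a device that no longer exists.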
Re: [OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hi Jo-Philipp,

On Tue, Sep 25, 2018 at 6:59 AM Jo-Philipp Wich wrote:
> what's the complete dnsmasq cmdline?

This is the command line:

/usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.cfg01411c -k -x /var/run/dnsmasq/dnsmasq.cfg01411c.pid

The configuration looks as follows:

# auto-generated config file from /etc/config/dhcp
conf-file=/etc/dnsmasq.conf
dhcp-authoritative
domain-needed
localise-queries
read-ethers
enable-ubus
expand-hosts
bind-dynamic
local-service
domain=lan
server=/lan/
server=8.8.8.8
server=8.8.4.4
server=208.67.222.222
server=208.67.220.220
interface=br-lan
dhcp-leasefile=/tmp/dhcp.leases
servers-file=/tmp/resolv-files/servers.conf
resolv-file=/tmp/resolv-files/resolv.conf
stop-dns-rebind
rebind-localhost-ok
dhcp-broadcast=tag:needs-broadcast
addn-hosts=/tmp/hosts
conf-dir=/tmp/dnsmasq.d
user=dnsmasq
group=dnsmasq
bogus-priv
conf-file=/usr/share/dnsmasq/rfc6761.conf
dhcp-range=set:lan,192.168.6.2,192.168.6.201,255.255.255.0,12h
no-dhcp-interface=eth0.2

Thanks,
Kristian
Re: [OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hi,

what's the complete dnsmasq cmdline?

~ Jo
[OpenWrt-Devel] dnsmasq stops receiving packets after network restart
Hello,

I have some routers running OpenWrt (latest nightly) that I have to access remotely (using reverse SSH). When I restart networking (/etc/init.d/network restart), clients on the LAN can no longer obtain an IP address using DHCP. If I restart networking locally, DHCP works as expected after the network is back up.

To try to figure out what is going on, I have checked/tried the following:

* I started out by checking whether dnsmasq has been restarted and whether the DHCP socket has been created. I can always see the socket in netstat.
* I then took a look at the firewall. I can see the DHCP packets in the INPUT chain of the filter table, which according to my understanding of Netfilter internals is the last stop before a packet is delivered to a socket.
* I then instrumented dnsmasq and added some logging to dhcp_packet() in dhcp.c. This function is never called, as none of my log messages are written to syslog. I verified that the logging works by checking for my messages when DHCP is working.
* Restarting dnsmasq makes DHCP work again. I can't see any difference in, for example, the netstat output.

Does anyone have any idea what to try or where to look next? After having spent a couple of days on this issue, I am quickly running out of ideas.

Thanks in advance for any help,
Kristian
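The netstat check from the list above can also be done by reading /proc/net/udp directly, which on reasonably recent Linux kernels additionally shows a per-socket drop counter. A minimal sketch in Python, assuming the usual column layout (local address as hex IP:port in the second column, drops as a decimal number in the last column); the function name dhcp_sockets is hypothetical.

```python
from typing import List, Tuple

DHCP_SERVER_PORT = 67


def dhcp_sockets(proc_net_udp: str,
                 port: int = DHCP_SERVER_PORT) -> List[Tuple[str, int]]:
    """Return (local_address_hex, drops) for UDP sockets bound to `port`.

    `proc_net_udp` is the text content of /proc/net/udp.
    """
    found = []
    for line in proc_net_udp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        addr_hex, port_hex = fields[1].split(":")
        if int(port_hex, 16) == port:
            # Assumption: the last column is the drop counter.
            found.append((addr_hex, int(fields[-1])))
    return found


# Intended use on the router (not executed here):
#   with open("/proc/net/udp") as f:
#       print(dhcp_sockets(f.read()))
```

This only confirms that a socket on port 67 exists and whether the kernel counts drops on it. It does not show which device the socket is bound to.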