Re: [Dnsmasq-discuss] DNSMASQ fails to start on boot
Thanks for the reply kwitty42.

Unfortunately you're talking to a Linux novice and I am not sure of all the answers, but... I am running Raspbian Jessie, which does use systemd, I believe. I have also confirmed that dnsmasq runs a startup script in /etc/init.d which says that it requires $NETWORK.

Can you spell out for me how I check the interface status in the script please?

Cheers,
DG

> On 10/19/2016 12:06 AM, David Griffiths wrote:
> > I found a discussion talking about the same problem on Ubuntu but the
> > recommended fix did not work for me :-(
> > https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/1531184
> >
> > It is a case of DNSMASQ starting before the network is ready.
> >
> > Any suggestions please?
>
> you can't have your DNSMASQ start up script check to see if the network is up
> before starting DNSMASQ? systemd isn't involved in your RPi installation, is
> it?
> the older style init.d scripts (sysV??) should be much easier to work with...
> check the interfaces' statuses with the ip or ifconfig command and see if
> they are ready to be used...
>

---
David Griffiths
da...@digitalgraphics.com.au
Digital Graphics P/L
http://www.digitalgraphics.com.au
Sydney, Australia
+61 2 4567 8999
---
Australian agents for:
E-LAB computers - AVR Pascal and In Circuit programmers
Logical Systems - programming adapters
Intronix - LogicPort 34 channel Logic Analyser
___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
Re: [Dnsmasq-discuss] Dnsmasq responses broken for Linux and Mac clients, but working on Windows and Android clients
Hi again,

I have more details to add to my question - the issue just occurred again and I was able to capture a failed DNS query on the router. Full details below the cited original message...

Timo Sigurdsson wrote on 19.10.2016 22:45:
> Hi,
>
> I have a weird issue with Dnsmasq which I think is related to DNSSEC, but I
> don't exactly understand why or what is happening and how to fix it.
>
> I'm currently running Dnsmasq 2.76 on my router powered by a fairly recent
> build of LEDE (r1792, Kernel 4.4.23). DNSSEC validation and
> DNSSEC-check-unsigned are both turned on.
>
> Sometimes, the Linux and Mac clients in my network cannot resolve random
> domain names. But at the same time, resolution of the exact same names works on
> Windows clients as well as my Android devices - and even on the router itself.
> When I restart Dnsmasq everything works again.
>
> For example, just now, my Debian machine could not resolve the domain
> security.debian.org. `nslookup security.debian.org` would show:
> ;; Truncated, retrying in TCP mode.
> Server: 192.168.123.1
> Address: 192.168.123.1#53
>
> ** server can't find security.debian.org: SERVFAIL
>

So, the query for security.debian.org happened to fail again. Apparently Dnsmasq ABANDONS the DNSSEC validation.

I also think that my initial assessment that my Windows clients are still able to resolve the name was wrong, because now a quick test on a Windows machine shows the same error for the same domain - probably the results during my previous tests were still cached on the machine itself.
Anyway, here is the log of the captured DNS query (timestamps removed for better readability - but it all happens within 5 seconds):

dnsmasq[9650]: 23525 192.168.123.75/52394 query[A] security.debian.org from 192.168.123.75
dnsmasq[9650]: 23525 192.168.123.75/52394 forwarded security.debian.org to 2001:4860:4860::8844
dnsmasq[9650]: 23525 192.168.123.75/52394 forwarded security.debian.org to 8.8.8.8
dnsmasq[9650]: 23526 192.168.123.75/52394 query[] security.debian.org from 192.168.123.75
dnsmasq[9650]: 23526 192.168.123.75/52394 forwarded security.debian.org to 2001:4860:4860::8844
dnsmasq[9650]: 23527 192.168.123.75/52395 query[A] ftp.de.debian.org from 192.168.123.75
dnsmasq[9650]: 23527 192.168.123.75/52395 forwarded ftp.de.debian.org to 2001:4860:4860::8844
dnsmasq[9650]: 23528 192.168.123.75/52395 query[] ftp.de.debian.org from 192.168.123.75
dnsmasq[9650]: 23528 192.168.123.75/52395 forwarded ftp.de.debian.org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52394 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52395 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52394 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52395 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52394 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52395 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52394 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[9650]: * 192.168.123.75/52395 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[9650]: 23525 192.168.123.75/52394 reply security.debian.org is 212.211.132.32
dnsmasq[9650]: 23525 192.168.123.75/52394 reply security.debian.org is 195.20.242.89
dnsmasq[9650]: 23525 192.168.123.75/52394 reply security.debian.org is 212.211.132.250
dnsmasq[13030]: 23529 192.168.123.75/57452 query[A] security.debian.org from 192.168.123.75
dnsmasq[9650]: 23528 192.168.123.75/52395 reply ftp.de.debian.org is NODATA-IPv6
dnsmasq[9650]: 23527 192.168.123.75/52395 reply ftp.de.debian.org is 141.76.2.4
dnsmasq[9650]: 23526 192.168.123.75/52394 reply security.debian.org is 2001:a78:5::216:35ff:fe7f:be4f
dnsmasq[9650]: 23526 192.168.123.75/52394 reply security.debian.org is 2001:a78:5:1:216:35ff:fe7f:6ceb
dnsmasq[13031]: 23629 192.168.123.75/57453 query[A] ftp.de.debian.org from 192.168.123.75
dnsmasq[13030]: 23529 192.168.123.75/57452 forwarded security.debian.org to 2001:4860:4860::8844
dnsmasq[13030]: * 192.168.123.75/57452 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[13031]: 23629 192.168.123.75/57453 forwarded ftp.de.debian.org to 2001:4860:4860::8844
dnsmasq[13031]: * 192.168.123.75/57453 dnssec-query[DS] debian.org to 2001:4860:4860::8844
dnsmasq[13030]: * 192.168.123.75/57452 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[13031]: * 192.168.123.75/57453 dnssec-query[DNSKEY] org to 2001:4860:4860::8844
dnsmasq[13030]: * 192.168.123.75/57452 reply org is DNSKEY keytag 17883, algo 7
dnsmasq[13030]: * 192.168.123.75/57452 reply org is DNSKEY keytag 48497, algo 7
dnsmasq[13030]: * 192.168.123.75/57452 reply org is DNSKEY keytag 9795, algo 7
dnsmasq[13030]: * 192.168.1
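For a log like the one above, it may help to list which client queries never received a reply line. The following is only a rough sketch in Python, not part of dnsmasq: the function name `unanswered_queries` is hypothetical, the regular expression is inferred from the excerpt alone, and pairing a query with its reply by (process id, serial number) is an assumption. Only `reply` lines are counted as answers; internal lines whose serial is `*` are skipped.

```python
import re

# Inferred from the excerpt: "dnsmasq[PID]: SERIAL CLIENT/PORT ACTION NAME ..."
LINE_RE = re.compile(
    r"dnsmasq\[(?P<pid>\d+)\]: (?P<serial>\d+|\*) (?P<client>\S+) "
    r"(?P<action>\S+) (?P<name>\S+)"
)


def unanswered_queries(log_lines):
    """Return (pid, serial, name) for each query with no matching 'reply' line.

    Assumption: a query and its reply share the same pid and serial number.
    """
    asked = {}        # (pid, serial) -> queried name, in order of appearance
    answered = set()  # (pid, serial) pairs that have at least one reply line
    for line in log_lines:
        m = LINE_RE.search(line)
        if m is None or m["serial"] == "*":
            continue  # not a query-log line, or an internal DNSSEC lookup
        key = (m["pid"], m["serial"])
        if m["action"].startswith("query["):
            asked[key] = m["name"]
        elif m["action"] == "reply":
            answered.add(key)
    return [
        (pid, serial, name)
        for (pid, serial), name in asked.items()
        if (pid, serial) not in answered
    ]
```

Fed the lines shown above, a helper like this would simply report the queries for which the excerpt contains no reply; since the excerpt is cut off, that is a hint about where to look, not proof of a failure.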
[Dnsmasq-discuss] Dnsmasq responses broken for Linux and Mac clients, but working on Windows and Android clients
Hi,

I have a weird issue with Dnsmasq which I think is related to DNSSEC, but I don't exactly understand why or what is happening and how to fix it.

I'm currently running Dnsmasq 2.76 on my router powered by a fairly recent build of LEDE (r1792, Kernel 4.4.23). DNSSEC validation and DNSSEC-check-unsigned are both turned on.

Sometimes, the Linux and Mac clients in my network cannot resolve random domain names. But at the same time, resolution of the exact same names works on Windows clients as well as my Android devices - and even on the router itself. When I restart Dnsmasq everything works again.

For example, just now, my Debian machine could not resolve the domain security.debian.org. `nslookup security.debian.org` would show:

;; Truncated, retrying in TCP mode.
Server: 192.168.123.1
Address: 192.168.123.1#53

** server can't find security.debian.org: SERVFAIL

Similarly, `dig +dnssec security.debian.org` would show:

;; Truncated, retrying in TCP mode.

; <<>> DiG 9.9.5-9+deb8u7-Debian <<>> +dnssec security.debian.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 37369
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;security.debian.org. IN A

;; Query time: 1167 msec
;; SERVER: 192.168.123.1#53(192.168.123.1)
;; WHEN: Wed Oct 19 21:23:49 CEST 2016
;; MSG SIZE rcvd: 48

However, on Windows everything would work fine. And even *on the router itself*, a lookup of the domain gives a valid result:

Name: security.debian.org
Address 1: 2001:a78:5::216:35ff:fe7f:be4f villa.debian.org
Address 2: 2001:a78:5:1:216:35ff:fe7f:6ceb lobos.debian.org
Address 3: 195.20.242.89 wieck.debian.org
Address 4: 212.211.132.250 lobos.debian.org
Address 5: 212.211.132.32 villa.debian.org

I also checked whether the time on the router is correct - and it is (constantly synced via ntp).

What's more, as soon as I restart Dnsmasq (just the service, not the router itself), everything works again. But then some other domain name might fail. Two days ago I noticed that my Linux clients couldn't resolve www.kernel.org - after that I turned off dnssec-check-unsigned (but left DNSSEC enabled), which seems to have made the problem occur less often, but it still occurs from time to time and always with different domain names. It's also not related to that specific Debian client, but occurs as well on an Ubuntu 16.04 laptop and on a MacBook running OS X 10.11.

Is it possible that the DNSSEC validation somehow causes the response by Dnsmasq to be too long for Linux and Mac clients?

Some more background: On the very same router, I've been running OpenWrt 15.05 for more than a year. I had DNSSEC and dnssec-check-unsigned enabled for all that time without any issues. About 2-3 weeks ago, I upgraded the router to LEDE, which obviously brings a newer Dnsmasq version and base system. Last week, I started noticing DNS issues - so that change might be related.

The issue also seems to be a bit hard to debug, because it occurs so randomly. I tried to enable logging of DNS queries just now to see if I can find the problem on the router side. But changing that setting led Dnsmasq to restart, and then the resolution of the names that didn't work before worked again. So, I have to wait until it occurs the next time with some other domain name.

Does anybody have an idea what could be going on here?

Thank you very much!

Regards,

Timo
Re: [Dnsmasq-discuss] Dnsmasq not resolving addresses for an hour
Hi Albert,

My comments inline.

John

> Hi All,
> The main while(1) loop uses select() to determine if it has work to
> do. In most cases, it appears to use timeout of 0, which I believe
> means just wait indefinitely for work on the file descriptors. Other
> times, it appears that the timeout is set to a quarter second when
> doing a tftp transfer or polling the dbus.
>
> Now what concerns me is that when a "retry later" condition occurs, we
> may get stuck on the select() for a long period of time. Alas, I do
> not know how frequent one might expect to see work arrive on the file
> descriptors that select is watching, so I don't really know if this is
> a long time or not. It seems though that in this failure scenario,
> the poll_resolv() function does NOT get called very often at all.

Albert: Actually, if dnsmasq does not receive any request from clients, it does not need to poll servers, so I would ask: does the select() include descriptors for client requests (either UDP datagrams received, or TCP connections opened)? If so, I think it will exit just when necessary and no timeout is needed; otherwise, you are right that a timeout is required.

Albert: Also, it may be improbable that select() does not return for a whole hour; but then, is every return from select() followed by a resolv file poll, or can select() return and then be entered again without polling the resolv files? I am thinking, for instance, about cached answers which do not need servers if their TTL is long enough.

John: I have made a simple change that provides a one second timeout for select. I have found that dnsmasq is much more responsive now to changes made to /etc/resolv.conf. The code that calls poll_resolv rate limits the calls to once every two seconds, which I believe is fine and responsive enough.

John: Given I am testing this in a lab situation, with just me on the console and one idle PC connected to the router, there is little use of DNS. In my experience since the initial failure, I believe I did see poll_resolv polled in one case at an interval of about 20 minutes. I don't think this poll interval should be driven by how active the users are and how much they use DNS; just my personal feeling about that.

John: It should be noted that if I had been doing a tftp transfer, the code would set the select timeout to 250ms. I am not sure why the tftp transfer being active would warrant the much quicker timeout. Anyhow, what I did was add an else statement... if tftp transfer, set timeout to 250ms, else set timeout to 1 second.

John: I don't know dnsmasq well enough to answer your other questions about select and what all of the file descriptors are associated with. Perhaps someone more knowledgeable can chime in. My change was made in response to the situation where a "retry later" condition was pending and poll_resolv was not getting called again in a reasonable time period to do the retry.

John: I believe on our router, dhcp entries have an hour TTL and we do use dnsmasq for dhcp. On an idle PC, would it have any reason to initiate a dnsmasq query? Occasionally, if the browser is up and running, I do see the browser query the address of its update server, but generally speaking I haven't had my browser running on the PC while doing my dnsmasq testing. So it seems to me that the two possible sources of dnsmasq activity (i.e. browser and dhcp) may be idle for at least an hour... so it seems possible that poll_resolv() is not getting called for a long time in this scenario.

> My gut feeling is that there always needs to be a timeout on the
> select call as the poll_resolv() should be called fairly frequently.
> The code that exists today where poll_resolv() normally is called from
> this loop suggests a poll rate of about once a second. This
> definitely does not happen today. By just adding a my_syslog()
> message to the top of poll_resolv(), it is very clear from the logfile
> that it is not called often, and way too infrequently to resolve the
> "retry later" condition in a timely manner.

Albert: Can you compare when poll_resolv() is called wrt when the select() is exited -- and for what reason?

John: What I did to see relative times between select and calls to poll_resolv was to add calls to my_syslog() before the select and at the top of poll_resolv(). The timestamp in the dnsmasq logfile was used to see how much time passed between calls. I don't know what the reason for exiting select is... indeed, for what I was doing, I really didn't care... I just needed to know when poll_resolv() was getting called and how often.

> Going forward, as the next thing for me to try, I am going to add a
> timeout for the select... perhaps a modest once a second or two.

Albert: I would personally investigate further on a gut feeling without changing the code behavior, because my changes might have unwanted effects which can actually hide the
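The change John describes (a 250ms select timeout while a tftp transfer is active, a one second timeout otherwise, and resolv-file polling limited to once every two seconds) can be illustrated with a small sketch. This is Python rather than dnsmasq's C, and `choose_timeout`, `ResolvPoller` and `run_loop` are hypothetical names invented for the illustration; only the three time values come from the message. One side note that may be relevant to the discussion: in Python's `select.select()` a timeout of `None` blocks indefinitely while a timeout of 0 returns immediately, and POSIX select() behaves the same way with a NULL timeout pointer versus a zeroed timeval.

```python
import select
import socket
import time

TFTP_TIMEOUT = 0.25   # seconds; value quoted for the tftp case
IDLE_TIMEOUT = 1.0    # seconds; the new fallback timeout described above
POLL_INTERVAL = 2.0   # seconds; minimum gap between resolv-file polls


def choose_timeout(tftp_active):
    """Pick the select() timeout: short during tftp, one second otherwise."""
    return TFTP_TIMEOUT if tftp_active else IDLE_TIMEOUT


class ResolvPoller:
    """Hypothetical stand-in for poll_resolv(), rate limited to POLL_INTERVAL."""

    def __init__(self):
        self.last_poll = None
        self.polls = 0

    def maybe_poll(self, now):
        """Poll if enough time has passed; return True when a poll happened."""
        if self.last_poll is None or now - self.last_poll >= POLL_INTERVAL:
            self.last_poll = now
            self.polls += 1   # a real implementation would stat the resolv files here
            return True
        return False


def run_loop(sock, poller, iterations, tftp_active=False):
    """Run a bounded event loop; the timeout guarantees the poller is reached."""
    for _ in range(iterations):
        readable, _w, _x = select.select([sock], [], [], choose_timeout(tftp_active))
        for s in readable:
            s.recv(4096)  # placeholder for handling a client request
        # Reached after every select() return, whether due to traffic or timeout.
        poller.maybe_poll(time.monotonic())
```

With a timeout always set, an idle network no longer prevents the periodic check from running; the rate limit keeps a busy network from triggering the check on every packet.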
Re: [Dnsmasq-discuss] DNSMASQ fails to start on boot
On 10/19/2016 12:06 AM, David Griffiths wrote:
> I found a discussion talking about the same problem on Ubuntu but the
> recommended fix did not work for me :-(
> https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/1531184
>
> It is a case of DNSMASQ starting before the network is ready.
>
> Any suggestions please?

you can't have your DNSMASQ start up script check to see if the network is up before starting DNSMASQ? systemd isn't involved in your RPi installation, is it?

the older style init.d scripts (sysV??) should be much easier to work with... check the interfaces' statuses with the ip or ifconfig command and see if they are ready to be used...

--
NOTE: No off-list assistance is given without prior approval.
*Please keep mailing list traffic on the list* unless private contact is specifically requested and granted.
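One way to act on the suggestion above is to wait until the interface reports that it is up before dnsmasq is started. The following is a minimal sketch in Python, not a tested init script: the function names are hypothetical, `eth0` is only an example interface name, and it relies on the Linux sysfs file `/sys/class/net/<interface>/operstate`, which reads `up` once the link is operational. The `sysfs_root` parameter exists only so the check can be exercised against a test directory.

```python
import os
import time


def interface_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports the interface's operstate as 'up'."""
    path = os.path.join(sysfs_root, iface, "operstate")
    try:
        with open(path) as f:
            return f.read().strip() == "up"
    except OSError:
        return False  # interface does not exist (yet) or cannot be read


def wait_for_interface(iface, timeout=30.0, interval=1.0,
                       sysfs_root="/sys/class/net"):
    """Poll until the interface is up; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if interface_is_up(iface, sysfs_root):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Example: exit status 0 if eth0 (an assumed name) comes up within 30 seconds.
    raise SystemExit(0 if wait_for_interface("eth0") else 1)
```

A start-up script could run such a check and only start dnsmasq when it exits with status 0. Note that an operstate of `up` only says the link is operational; it does not guarantee that an address has been assigned, so a stricter check may be needed if dnsmasq is bound to a specific address.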