Re: [Dnsmasq-discuss] occasional REFUSED response after successful query

Simon Kelley Sat, 19 Nov 2005 17:36:21 +0000

Holger Schletz wrote:

Hi,
I get occasional REFUSED responses from dnsmasq on a specific network, though the query is actually successful. I am able to reproduce the error with dnsmasq 2.22 on Debian "Sarge" and 2.23 on Debian "Etch" in this network. However, I could not reproduce it with 2.23 on a different network. My Etch home box is also OK.
The problem occurs
- with the first query after dnsmasq starts up (this would not be a serious problem). Subsequent queries are successful.
- with queries issued by nightly cron jobs (which fail completely in consequence - very bad!)
The network uses a dial-on-demand DSL connection, which gets triggered by the query from the cron jobs. However, the dialup is performed on an external router and should be completely transparent to the network (except for a short delay). BIND9 never had this problem.
Moreover, in the first case the error can be reproduced even if the DSL link is already up.
This is how I tested it:

1. restart dnsmasq
2. run "host <some.domain.name>"
3. host responds: "Host <some.domain.name> not found: 5(REFUSED)"
4. But the query log shows:
Nov 17 11:54:48 zfg15 dnsmasq[8039]: query[A] <some.domain.name> from 127.0.0.1
Nov 17 11:54:48 zfg15 dnsmasq[8039]: forwarded <some.domain.name> to <first.dns.server>
Nov 17 11:54:48 zfg15 dnsmasq[8039]: forwarded <some.domain.name> to <second.dns.server>
Nov 17 11:54:48 zfg15 dnsmasq[8039]: reply <some.domain.name> is <some.ip.address>
What's going on here? What is so special about my network? And most important: how do I fix it? :-)
Thanks,
Holger

Dnsmasq only generates REFUSED return codes itself if there are no suitable upstream DNS servers to forward a query to, or if all the attempts to forward fail at transmission time (typically with "No route to host"). That's clearly not what is happening here, so the REFUSED return code must be coming from one of the upstream servers.


What is happening is this:

* This is the first query to dnsmasq, so it doesn't know which of the upstream servers are working. In this situation, it sends the query to all the servers (two, in this case). That's the first three lines of the log.

* One of the upstream servers returns REFUSED, which gets sent back to the original requestor; that's your problem. This is not logged by dnsmasq.

* The other upstream server returns a good answer, which is also sent to the original requestor, but too late. That's the last line in the log. The upstream server which returns a good answer is marked as "good" and any subsequent requests are sent there, so the problem doesn't recur.

The reason why it happens like this is partly just history and inertia, partly because I didn't want to risk the original requestor getting no response at all (and suffering a long timeout) when upstream servers are returning error codes. However, this isn't the first time this has been reported as a bug (see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=330422), and from the next release the behaviour will change. Now, if a query gets sent to n servers, the first n-1 error replies will be dropped, and only the last one returned to the original requestor. That means that if some upstream servers are erroring, but some are working, then the query will still succeed.
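
The difference between the two policies can be sketched in a few lines of Python. This is only an illustrative model of the rule described above, not dnsmasq's actual code: the function names and the server names "server-a" and "server-b" are made up, and treating every non-zero return code as an "error reply" is an assumption made for simplicity.

```python
# Illustrative model only; this is not dnsmasq's implementation.
NOERROR, SERVFAIL, REFUSED = 0, 2, 5  # DNS RCODE values from RFC 1035


def answer_seen_old(replies):
    """Old policy: every upstream reply is passed on, so the requestor
    acts on whichever reply arrives first, error or not."""
    return replies[0] if replies else None


def answer_seen_new(replies, n_servers):
    """New policy: drop the first n-1 error replies; pass on the first
    good reply, or the n-th error if every server failed."""
    errors_seen = 0
    for server, rcode in replies:
        if rcode == NOERROR:
            return (server, rcode)   # a good answer always goes through
        errors_seen += 1             # assumption: any non-zero RCODE is an error
        if errors_seen == n_servers:
            return (server, rcode)   # all servers failed: report the last error
    return None                      # still waiting for more replies


# Replies in arrival order: the refusing server happens to answer first.
mixed = [("server-a", REFUSED), ("server-b", NOERROR)]
all_bad = [("server-a", REFUSED), ("server-b", SERVFAIL)]
```

With the `mixed` list, the old policy hands the requestor the REFUSED reply from "server-a", while the new policy hands it the good answer from "server-b". With `all_bad`, the new policy still returns an error, so the requestor is not left waiting for a timeout.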

I plan to release version 2.24, which has this change, fairly soon and I'm happy to make the current development snapshot available to anyone who wants to try it.

Holger, to fix your problem I suggest either weeding out the broken nameserver (though experience shows that by now, it's probably working again!), or risking the 2.24 beta.
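
One way to weed out a refusing nameserver is to query each upstream server directly and look at the return code of its reply. The following is a minimal sketch using only the Python standard library; the helper names `build_query`, `rcode_of` and `probe` are made up for this example, it handles IPv4 and UDP only, and it does not check that the reply ID matches the query.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_of(response):
    """Extract the 4-bit RCODE from the flags word of a DNS response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def probe(server, name, timeout=3.0):
    """Send one UDP query to `server`; return its RCODE, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return None
    return rcode_of(data)
```

Calling `probe()` once for each upstream server address with a name that ought to resolve shows which server is at fault: a return value of 0 means no error and 5 means REFUSED. On a dial-on-demand link the first probe may time out while the line comes up, so it is worth repeating the test.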


Cheers,

Simon.
