I have a library which I think has a bug, but this bug is affecting DNS
queries, and bringing out some odd behaviour in dnsmasq...

The program makes a query to resolve an address (foo.bar.com).
A normal query results in a CNAME (foo.bar.com.edgekey.net), which results
in another CNAME (e1234.a.akamaiedge.net), which has an A record.

However, every so often dnsmasq returns just the first CNAME.
Note that I haven't yet caught it in the act of sending that first
truncated response.
The only thing that makes sense to me is if the edgekey.net name servers
didn't respond in good time... but....
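
To make the difference between the two responses concrete, here is a rough
Python sketch of what I mean by a "complete" answer. The (owner, type, value)
tuples are just a hypothetical way of writing the records down, not anything
dnsmasq or the library actually uses, and the A record address is a made-up
documentation address.

```python
# Hypothetical representation: each answer record is (owner, rtype, value).
# The names are the placeholder names from this mail, not a real capture.

def chain_is_complete(qname, answers):
    """Follow CNAMEs from qname; True only if the chain ends in an A record."""
    by_owner = {}
    for owner, rtype, value in answers:
        by_owner.setdefault(owner.lower().rstrip("."), []).append((rtype, value))

    name = qname.lower().rstrip(".")
    seen = set()
    while name not in seen:
        seen.add(name)
        records = by_owner.get(name, [])
        if any(rtype == "A" for rtype, _ in records):
            return True
        cnames = [value for rtype, value in records if rtype == "CNAME"]
        if not cnames:
            return False  # chain stops without reaching an address
        name = cnames[0].lower().rstrip(".")
    return False  # CNAME loop


normal = [
    ("foo.bar.com", "CNAME", "foo.bar.com.edgekey.net"),
    ("foo.bar.com.edgekey.net", "CNAME", "e1234.a.akamaiedge.net"),
    ("e1234.a.akamaiedge.net", "A", "192.0.2.10"),  # made-up address
]
truncated = normal[:1]  # just the first CNAME, as in the odd responses

print(chain_is_complete("foo.bar.com", normal))
print(chain_is_complete("foo.bar.com", truncated))
```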

However, the bug in the library then means it asks again, instantly, and
again... and again....
It manages over 100MB/minute of DNS requests, with dnsmasq answering them
all from the cache (I see *no* external requests for that address).
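
For a sense of scale, a back-of-the-envelope conversion of that volume into
a query rate. The 100 bytes per query-plus-response is purely a guess on my
part (I have not measured it), and I am reading MB as 10^6 bytes, so treat
the result as an order of magnitude only.

```python
def queries_per_second(bytes_per_minute, bytes_per_exchange):
    """Convert an observed traffic volume into an approximate query rate."""
    return bytes_per_minute / 60 / bytes_per_exchange


# 100 MB/minute is the figure from this mail, read here as 100 * 10**6 bytes.
# 100 bytes per query-plus-response is an assumption, not a measured value.
rate = queries_per_second(100 * 10**6, 100)
print(f"roughly {rate:,.0f} queries per second")
```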

When I restart the program, the very first query (identical to the ones
before) gets a complete answer from dnsmasq.

What I can't understand is how that restart makes any difference to dnsmasq.
Does dnsmasq have some sort of 'Oh hell the query load is insane I'm just
extending the cache a bit to help' mode which it then escapes from as the
program restarts?
There are no external queries for this name during the period of insanity,
but the first request after the restart does get sent to the external name
servers.

I'm running an 'external interface only' packet capture to try to catch the
initial error condition (which I very much doubt is a problem in dnsmasq),
to see if that can shed some light on the issue.
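
Alongside the capture, something like the following could poll dnsmasq
directly and note when the answer gets shorter. This is only a sketch using
the Python standard library: it assumes dnsmasq is listening on 127.0.0.1
port 53 (adjust as needed), it only reads the answer count from the DNS
header rather than parsing the records, and it does not check the response
ID. A full answer as described above should carry at least three answer
records (two CNAMEs plus the A record); the short one would carry one.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def answer_count(response):
    """Return ANCOUNT from the 12-byte DNS header of a response."""
    _id, _flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return ancount


def probe(name, server="127.0.0.1", port=53, timeout=2.0):
    """Send one UDP query to the server and return the number of answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        response, _ = sock.recvfrom(4096)
    return answer_count(response)


# Example use (the server address is an assumption):
#   print(probe("foo.bar.com"))
```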

Thoughts? Debug hints? Laughter?



John Robson
