> If, on the other hand, you're trying to answer the question "why do I
> get a SERVFAIL, some of the time, for some names, seemingly at random?",
> then I don't know that a targeted tcpdump is going to help. You might
> have to capture *everything*, detect the error, and then wade through
> the data later.

This is exactly the problem. Is there some way to get named to dump its internal state, so you can see why it's returning a (cached?) SERVFAIL for what should be a working query? When I run rndc dumpdb and look at the output, I can't find anything about the query it's returning SERVFAIL for. Is there some extra cache, or an rndc dumpdb flag, that I need to use?

This problem usually happens only on the busier of the two named instances, and only after a week or so of uptime.

If no one else knows of this problem, I guess I need to start collecting more data.

Thanks for your reply,
ds
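
P.S. For the "detect the error, then look at the data" approach, something like the rough sketch below might be a starting point. It is only an illustration: it assumes dig and rndc are on the PATH and that rndc is already configured to talk to the affected named instance. The function names, the polling interval, and the server address are made up for the example. It polls a list of names with dig, reads the status field from dig's header line, and triggers "rndc dumpdb -all" as soon as a SERVFAIL shows up, so the dump is taken while the failure is still fresh.

```python
import re
import subprocess
import time
from typing import Iterable, Optional

# dig prints a header line such as:
# ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 12345
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_status(dig_output: str) -> Optional[str]:
    """Return the response status (e.g. 'NOERROR', 'SERVFAIL') from dig output."""
    for line in dig_output.splitlines():
        if "->>HEADER<<-" in line:
            match = STATUS_RE.search(line)
            return match.group(1) if match else None
    return None  # no header line, e.g. the query timed out


def check(name: str, server: str) -> Optional[str]:
    """Query one name against one server with dig and return its status."""
    result = subprocess.run(
        ["dig", f"@{server}", name, "A", "+tries=1", "+time=3"],
        capture_output=True, text=True, check=False,
    )
    return parse_status(result.stdout)


def watch(names: Iterable[str], server: str, interval: int = 60) -> None:
    """Poll the names forever; dump named's databases when a SERVFAIL is seen."""
    names = list(names)
    while True:
        for name in names:
            if check(name, server) == "SERVFAIL":
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                print(f"{stamp} SERVFAIL for {name}, running rndc dumpdb -all")
                # The dump file location depends on the dump-file setting
                # in named.conf; it is not chosen by this script.
                subprocess.run(["rndc", "dumpdb", "-all"], check=False)
        time.sleep(interval)
```

Calling watch(["www.example.com"], "127.0.0.1") would start the loop; both arguments are placeholders. Each dump overwrites the previous one unless the file is copied away between runs, so a real version would probably want to rename it with a timestamp.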