Mark Andrews wrote:
In message <4a452428.9020...@provident-solutions.com>, "Vernon A. Fort" writes:
I've run into a problem with named and timeouts, primarily with MX lookups. When an MX query fails the first time, I have to restart the named process before it will return a successful query. Again, it's mainly with MX lookups, but it also happens with A records. The problem subsides for 1-2 hours and then starts happening again; basically, I look in the mailq for deferred messages with MX lookup failures.

This box is a Gentoo install running a medium-volume (500K per day) mail server, with lots of DNS queries due to RBLs, SpamAssassin, etc. This problem started showing up around mid-May. Since then, I have re-installed bind and bind-tools several times, updated the kernel and Linux headers to 2.6.29, recompiled glibc, etc.

I just updated to 9.6.0-P1 from 9.4.3-P2, and the same problem exists. When doing a manual MX lookup (dig MX isc.org), it takes around 45 seconds on the first attempt. If it fails the first time, it will never return a positive answer, just "connection timed out; no servers could be reached", until I restart named. I can't say for sure, but bind was updated around the time I noticed this problem. All versions of bind I have tried (in Gentoo portage) have the same problem.

Can anyone help me find where this problem might be? I've googled until my eyes are red and throbbing.

Thanks

Vernon
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

I suggest that you fix your firewalls to allow 4096 byte EDNS
responses though.  Both ORG and ISC.ORG are signed zones so their
responses are larger than with unsigned zones.  Named is having to
retry with different options to get a response through your firewall
and this takes time.
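
As a rough illustration of what "4096 byte EDNS" means on the wire, the sketch below builds an MX query carrying an EDNS0 OPT pseudo-record that advertises a 4096-byte UDP buffer. The function names, the fixed query ID, and the choice to set the DNSSEC OK bit are assumptions made for the example; this is not named's actual code path.

```python
import struct

MX = 15    # QTYPE for MX records
OPT = 41   # pseudo-RR type that carries EDNS0 parameters


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_edns_query(name, qtype=MX, udp_payload=4096,
                     dnssec_ok=True, query_id=0x1234):
    """Build a DNS query whose additional section holds one OPT record.

    query_id is fixed only to keep the example deterministic.
    """
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT RR: root name, TYPE=41, CLASS field reused as the UDP payload size,
    # TTL field reused as ext-rcode/version/flags (0x8000 is the DO bit),
    # RDLENGTH=0 because no options are attached.
    ttl = 0x00008000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", OPT, udp_payload, ttl, 0)
    return header + question + opt


packet = build_edns_query("isc.org")
print(len(packet), packet[-11:].hex())
```

The advertised size only tells the remote server how large a UDP reply it may send; a firewall in the path that enforces a smaller limit can still drop that reply.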

An EDNS/UDP MX response is 1999 bytes for isc.org.

;; Query time: 872 msec
;; SERVER: 2001:4f8:0:2::19#53(2001:4f8:0:2::19)
;; WHEN: Sat Jun 27 09:39:34 2009
;; MSG SIZE  rcvd: 1999
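
To make the size arithmetic concrete, here is a simplified model of what can happen to a 1999-byte response over UDP. The function name and the 512-byte firewall limit are assumptions for illustration (512 bytes is the classic pre-EDNS UDP maximum); the limit actually enforced by the firewall in this thread is not known, and fragment handling is ignored.

```python
LEGACY_UDP_LIMIT = 512  # maximum UDP DNS message size without EDNS


def udp_delivery(response_size, advertised_size, firewall_limit):
    """Classify the fate of a UDP response in this simplified model."""
    if response_size > advertised_size:
        # The server truncates and sets TC; the resolver retries over TCP.
        return "truncated"
    if response_size > firewall_limit:
        # The server sends the full reply, but the firewall discards it,
        # so the resolver sees only a timeout and must retry differently.
        return "dropped"
    return "delivered"


print(udp_delivery(1999, 4096, LEGACY_UDP_LIMIT))  # dropped
print(udp_delivery(1999, LEGACY_UDP_LIMIT, LEGACY_UDP_LIMIT))  # truncated
print(udp_delivery(1999, 4096, 4096))  # delivered
```

The "dropped" case is the slow one: nothing comes back, so the resolver has to wait out a timeout before trying other options.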
I now have two servers running behind Check Point firewalls which are failing to resolve MX records. One of the IT guys called Check Point, and support suggested they disable the SmartDefense DNS UDP check. This did correct the problem, but queries are still sluggish from time to time.

I have three questions related to this:

1. On both servers, the DNS version (and glibc) were updated in mid-January, from bind-9.4.1 to 9.4.3. The SmartDefense DNS check had been enabled on both firewalls long before the last updates were applied. Why did the issues only start showing up now (late May to early June)?

2. When an email is deferred in the mailq, it stays deferred until named is restarted. I just tested this on a mail message that sat in the queue for about three days. I kept trying to dig MX domain.com during this period and NOTHING would resolve (including any A records) until I restarted named. Why?

3. In both network environments, I switched resolution to internal Windows 2003 DNS servers. NO problems occurred during the week we used the Windows DNS servers. Why would SmartDefense not have the same effect on Windows-based name servers?

Updating to bind-9.6.1 and updating the root.zone file made little if any difference. Basically, it appears that SOMETHING has changed somewhere, because we have only just now had to alter the Cisco PIX rules to increase the UDP packet size due to timeouts in these environments. I have seen posts related to my problems from as far back as 2-3 years ago. So again, I'm scratching my head wondering what the heck I missed: why did these problems only start showing up now?

Any pointers or additional reading would be greatly appreciated. I'm just trying to understand from a 1000-foot view, but whatever view anyone suggests is fine.

Vernon
