On 2009-11-17 at 19:15 +1100, Ted Cooper wrote:
> Well it is an issue, but again I think you've missed the key point here:
> Exim doesn't use threads, thus it does not access _res in an unsafe
> manner.

I see libthr coming in via libdb (BDB 4.4) and OpenSSL. Doesn't mean
I'm using threads.

> It's entirely fork() based so each new process will have its own copy of
> _res which it can do with as it pleases.
>
> Even so, if Exim has access to _res implemented incorrectly, we should
> probably look at fixing that. Does anyone know of a good reference
> program for whatever resolver library we're talking about?

It's the standard Unix resolver library. resolver(3) on FreeBSD (among
others) goes into more detail on _res.

Exim's only usage is to initialise _res in dns_init() and to reference
_res.options to construct a cache key. Oh, and some stuff in the test
harness.

Exim sets some items in _res.options (fully documented in the manpage)
and sets _res.retrans and/or _res.retry if the dns_retrans and/or
dns_retry options are set in the main config section. None of the DNS
lookups explicitly reference _res; it's purely an init-time thing and a
read of the options for the cache key.

The proposed replacement functions are peculiar to NetBSD, so this
suggested approach really translates to "implement a custom DNS layer
for NetBSD". That seems to be of little benefit, unless some other OS
is going to adopt these new calls. (But see below)

The current _res usage works across every OS which Exim is built for
(AIX, HP-UX, IRIX, SCO, OSF1, ULTRIX ...) except the new NetBSD
platform, so the question really is "What did NetBSD break and is there
a simple way for the Exim binary to persuade the resolver library that
it's behaving safely?"

The use of _res in Exim's dns_init() looks to be compatible with the
description of _res at:
  http://netbsd.gw.com/cgi-bin/man-cgi?resolver++NetBSD-current
so I suspect the only issue is the later referencing of _res.options
for the cache key.
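
For anyone who wants to see the shape of that usage without opening the
source, here is a rough sketch. It is not Exim's code: the function
names sketch_dns_init() and sketch_cache_key(), their parameters and
the key format are all made up for illustration. Only res_init(), _res
and the RES_* flags come from the resolver library.

```c
#include <stdio.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical init routine (NOT Exim's dns_init()): initialise _res once,
 * then adjust the search options and the optional retrans/retry values. */
static int sketch_dns_init(int qualify_single, int search_parents,
                           int retrans, int retry)
{
    /* res_init() fills in _res from the system resolver configuration. */
    if ((_res.options & RES_INIT) == 0 && res_init() != 0)
        return -1;

    /* Clear both search-related flags, then set the ones asked for. */
    _res.options &= ~(RES_DEFNAMES | RES_DNSRCH);
    if (qualify_single) _res.options |= RES_DEFNAMES;
    if (search_parents) _res.options |= RES_DNSRCH;

    /* Only override the timeout and retry count when a value was given. */
    if (retrans > 0) _res.retrans = retrans;
    if (retry > 0) _res.retry = retry;
    return 0;
}

/* Hypothetical cache key: name, record type and the current resolver
 * options. This read of _res.options is the later reference that the
 * mail suspects NetBSD objects to. */
static int sketch_cache_key(char *buf, size_t len, const char *name, int type)
{
    return snprintf(buf, len, "%.255s-%d-%lx",
                    name, type, (unsigned long)_res.options);
}
```

The point of the sketch is only that _res is written at init time and
read once more when the key is built; the lookups themselves never
touch it directly.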

If there's a lock which can be grabbed to safely access _res, the right
thing might be to use a global variable internal to Exim to hold
_res.options, init that at the same time that res_init() is called (in
dns_init()), double-check that covers all execution paths and then use
NetBSD-specific #ifdef's to handle lock init. But I don't see such a
lock described in resolver(3) at the URL above.

We could set the global variable in the same way that the resolver
options are set, but that wouldn't pick up default options.

Another approach would be to figure out whether or not we really need
the options as part of the cache key and look at just removing that.

A working debug trace of the part which failed for the OP is:
----------------------------8< cut here >8------------------------------
calling dnslookup router
dnslookup router called for [email protected]
  domain = gmx.net
DNS lookup of gmx.net (MX) succeeded
DNS lookup of mx1.gmx.net (AAAA) gave NO_DATA
returning DNS_NODATA
DNS lookup of mx1.gmx.net (A) succeeded
----------------------------8< cut here >8------------------------------

so the point which failed was dns.c line 563:
  dnsa->answerlen = res_search(CS name, C_IN, type, dnsa->answer, MAXPACKET);

In fact, the DNS resolution is sufficiently isolated that we *could*
implement the res_nsend API. I don't have a NetBSD box to test any
changes on though, so I'm unwilling to write the patch.

-Phil

-- 
## List details at http://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
