On 2008-04-22 at 19:12 +0200, Mark Schouten wrote:
> We do not do any recipient checking as these checks are done on the
> frontend-machines, but all of these users exist on the machines where
> delivery takes place.
> The users exists thanks to libnss-ldap, combined with nscd.
>
> As far as I can tell, the only thing that could cause these messages to
> bounce is 'check_local_user'. According to the Exim documentation,
> check_local_user does a getpwnam. Should I assume that, if Exim says
> 'Unrouteable address', the getpwnam-reply was 'User does not exist'? If
> so, the problem would lie in nscd, giving false replies. As far as I can
> see Exim should tempfail if it sees an LDAP error, and not 'Unrouteable
> address'.
>
> If anyone has a cluebat, please hit me with it. :)

nscd flaws are not implausible. Your clue-level seems sufficient. :^)

Take check_local_user out of the picture by using a direct LDAP lookup
in a condition instead and see if that fixes it? Do this on one of the
three boxes, compare error rates.

Make sure that you have sufficient indexing on the LDAP read-only caches
being used for these queries that the servers can handle this without
nscd buffering it, of course.

Check "10.7 Named list caching" which might be useful, depending upon
the queries you end up using, to reduce repeat queries from within one
process.

Also, if data from receive-time will be used again at delivery time and
this can be deferred to a queue-runner, consider using an ACL variable
or two to hold the results of the lookup to avoid repeat queries.

-Phil
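
Separately from the Exim-side test, the "nscd giving false replies" theory can be checked outside Exim. What follows is only a rough sketch, not anything from the Exim docs: it repeats the same getpwnam call that check_local_user relies on and counts how often the answer is "no such user". The function name probe_user and its parameters are made up for illustration, and "someuser" is a placeholder for an account that is known to exist in LDAP.

```python
import pwd
import time


def probe_user(name, attempts=100, delay=0.0, lookup=pwd.getpwnam):
    """Call lookup(name) repeatedly; return how many calls said 'no such user'.

    pwd.getpwnam raises KeyError when no passwd entry is found, which is
    the answer that would make check_local_user decline the address.
    The lookup parameter is a hypothetical hook so the loop can be
    exercised without a real NSS/nscd setup.
    """
    misses = 0
    for _ in range(attempts):
        try:
            lookup(name)
        except KeyError:
            misses += 1
        if delay:
            # Space the calls out so they do not all hit one cached answer.
            time.sleep(delay)
    return misses


if __name__ == "__main__":
    # "someuser" is a placeholder; use an account that exists only in LDAP.
    total = 1000
    missed = probe_user("someuser", attempts=total, delay=0.01)
    print(f"{missed} of {total} lookups reported no such user")
```

Any non-zero count for a user who definitely exists would point at the NSS/nscd layer rather than at Exim. Running it with nscd stopped and again with it running, on the same box, gives a comparison similar to the one suggested for the routers.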
