Have a look at the retry and wait entries in the spool hints databases. If a host has been down long enough, Exim won’t try it again, even for a new message, until its retry time is reached.
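As a rough illustration of what "down long enough" means here, the sketch below takes the first-failure and last-attempt timestamps from one of the retry records quoted further down and compares the gap with the final 4d cutoff of the quoted retry rule. It is not Exim's own logic: the helper names are hypothetical, and it assumes the first of the three timestamps in a record is the first recorded failure and the second is the latest attempt.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the quoted retry records.
STAMP = "%d-%b-%Y %H:%M:%S"

# Final cutoff of the quoted retry rule "F,2h,15m; G,16h,1h,1.5; F,4d,6h".
FINAL_CUTOFF = timedelta(days=4)


def failing_period(first_failed: str, last_try: str) -> timedelta:
    """Time between the first recorded failure and the latest attempt."""
    return datetime.strptime(last_try, STAMP) - datetime.strptime(first_failed, STAMP)


def past_cutoff(first_failed: str, last_try: str,
                cutoff: timedelta = FINAL_CUTOFF) -> bool:
    """Hypothetical helper: True if the host has been failing longer than the cutoff."""
    return failing_period(first_failed, last_try) > cutoff


if __name__ == "__main__":
    # Values copied from the quoted record for the unreachable IPv6 address.
    first, last = "26-Dec-2013 03:06:32", "27-Jul-2014 11:41:04"
    print(failing_period(first, last), past_cutoff(first, last))
```

With the quoted IPv6 record, the failing period runs to months rather than days, which is why the age of the hints entries is worth checking first.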
man exim_dumpdb and exim_fixdb

You can also delete the retry and wait* databases to reset everything… (I’d probably offline exim while I did that but it’s likely not necessary).

-D

On 28 Jul 2014, at 1:02 am, Russell King <[email protected]> wrote:

> I know that 4.69 is an old version of exim, but... I'm seeing some
> weird behaviour with it.
>
> The machine in question acts as a backup machine for another computer.
> It's setup such that each night, it powers itself on, transfers the
> data, archives it, sends a mail and powers off. Once a week, it
> remains on for a 24 hour period.
>
> The problem is this - exim behaves itself just fine when it can send
> the message immediately. If it can't (because of the DSL at the site
> being down) then exim gives me a hard failure and bounces the message.
>
> This goes totally against what is in the config file for the retry
> rules:
>
> * * F,2h,15m; G,16h,1h,1.5; F,4d,6h
>
> The config file is pretty much standard Fedora 14, but with these as
> the routers (as is the above line being the F14 default):
>
> remote_smtp:
> driver = smtp
> headers_rewrite = *@* [email protected] fs
> return_path = [email protected]
>
>
> So it should take many days before bouncing. However:
>
> 2014-07-26 07:01:42 1XAv38-0000iU-Ln <=
> [email protected] U=backup P=local S=65027
> id=20140726060142.GA2756@shgc-backup
> 2014-07-26 07:01:48 1XAv38-0000iU-Ln => [email protected] R=dnslookup
> T=remote_smtp H=mx0.arm.linux.org.uk [78.32.30.218] X=TLSv1:AES256-SHA:256
>
> that one was fine.
> Then this morning:
>
> 2014-07-27 04:19:35 1XBEzn-0000XA-FM <=
> [email protected] U=root P=local S=3340
> 2014-07-27 04:20:17 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> 2014-07-27 04:21:22 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> routing defer (-51): retry time
> not reached
> 2014-07-27 04:26:21 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> routing defer (-51): retry time
> not reached
> 2014-07-27 04:31:23 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> routing defer (-51): retry time
> not reached
> 2014-07-27 04:36:42 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> ...
> 2014-07-27 05:16:42 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> ...
> 2014-07-27 05:36:42 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> ...
> 2014-07-27 05:56:41 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> ...
> 2014-07-27 06:16:41 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> 2014-07-27 06:36:41 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
>
> 2014-07-27 06:44:43 1XBHGF-0000iX-Rm <=
> [email protected] U=backup P=local S=63350
> id=20140727054423.GA2759@shgc-backup
> 2014-07-27 06:45:24 1XBHGF-0000iX-Rm == [email protected] R=dnslookup
> defer (-1): host lookup did not complete
> 2014-07-27 06:46:19 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> routing defer (-51): retry time
> not reached
> 2014-07-27 06:46:19 1XBHGF-0000iX-Rm == [email protected] routing defer
> (-51): retry time not reached
> ...
> 2014-07-27 07:46:39 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> 2014-07-27 07:46:39 1XBHGF-0000iX-Rm == [email protected] routing defer
> (-51): retry time not reached
> ...
> 2014-07-27 08:46:19 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> routing defer (-51): retry time
> not reached
> 2014-07-27 08:46:19 1XBHGF-0000iX-Rm == [email protected] routing defer
> (-51): retry time not reached
> ...
> 2014-07-27 09:21:40 1XBEzn-0000XA-FM == [email protected]
> <[email protected]> R=dnslookup defer (-1): host
> lookup did not complete
> 2014-07-27 09:21:40 1XBHGF-0000iX-Rm == [email protected] routing defer
> (-51): retry time not reached
> ...
> 2014-07-27 11:40:59 1XBHGF-0000iX-Rm mx0.arm.linux.org.uk
> [2002:4e20:1eda:1:214:fdff:fe10:1be6] Network is unreachable
> 2014-07-27 11:40:59 1XBHGF-0000iX-Rm mx0.arm.linux.org.uk
> [2001:4d48:ad52:3201:214:fdff:fe10:1be6] Network is unreachable
> 2014-07-27 11:41:04 1XBHGF-0000iX-Rm => [email protected] R=dnslookup
> T=remote_smtp H=mx0.arm.linux.org.uk [78.32.30.218] X=TLSv1:AES256-SHA:256
> 2014-07-27 11:41:04 1XBHGF-0000iX-Rm Completed
> 2014-07-27 11:41:19 1XBEzn-0000XA-FM ** [email protected]
> <[email protected]> R=dnslookup T=remote_smtp: retry
> time not reached for any host after a long failure period
> 2014-07-27 11:41:19 1XBLtH-0000qh-O9 <= <> R=1XBEzn-0000XA-FM U=exim P=local
> S=4383
> 2014-07-27 11:41:19 1XBEzn-0000XA-FM Completed
> 2014-07-27 11:41:21 1XBLtH-0000qh-O9 => [email protected]
> <[email protected]> R=dnslookup T=remote_smtp
> H=mx0.arm.linux.org.uk [78.32.30.218] X=TLSv1:AES256-SHA:256
>
> So, at 11:41:04, exim found that the destination was now able to be
> delivered to. However, it decided to time out the 1XBEzn-0000XA-FM
> message _before_ the retry rules stated that it should time out, and
> sent a non-delivery report... which it also successfully delivered to
> the same destination!
>
> The wait-remote_smtp database is empty.
>
> The two most recent retry database entries are:
>
> 26-Dec-2013 03:06:32 27-Jul-2014 11:41:04 27-Jul-2014 17:41:04 *
> T:mx0.arm.linux.org.uk:2002:4e20:1eda:1:214:fdff:fe10:1be6 101 77 Network is
> unreachable
> 24-Dec-2013 03:06:23 27-Jul-2014 11:41:04 27-Jul-2014 17:41:04 *
> T:pandora.arm.linux.org.uk:2002:4e20:1eda:1:214:fdff:fe10:1be6 101 77
> Network is unreachable
>
> which are expected as the site running this exim has no IPv6 connectivity
> to be able to use the IPv6 addresses I have here. The only entry for the
> IPv4 address is an old one which should have expired long ago (and the
> DNS changed since then):
>
> 13-Feb-2014 05:26:39 13-Feb-2014 05:26:39 13-Feb-2014 05:41:39
> T:caramon.arm.linux.org.uk:78.32.30.218 110 333 Connection timed out
>
> Indeed, having tidied the retry database, the only two entries which
> remain are the two above.
>
> The DNS for the machine is configured to use google's DNS servers
> (iow, 8.8.8.8 and 8.8.4.4) as I've had problems with the ISPs DNS
> servers - so DNS would have been unavailable during the loss of
> connectivity too.
>
> So, the question is whether there's something screwed with the config
> file, or whether it's just this old exim version misbehaving (which I
> suspect is the real problem here.) What I don't understand is why the
> successful delivery of 1XBHGF-0000iX-Rm seemed to cause 1XBEzn-0000XA-FM
> to be immediately bounced.
>
> This probably isn't an issue that I can reproduce at will; I've seen it
> a number of times, and it's always triggered by the loss of connectivity
> at the site.
>
> --
> Russell King
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
