Hello list,

After an upgrade from 4.94.2 to 4.95 on one of our FreeBSD mxout hosts, we encounter lots of stuck exim processes trying to deliver messages:
mailnull 55171 0.0 0.0 26112 15628 - I Fri13 0:00.03 /usr/local/sbin/exim -Mc 1nIVPr-000ELk-EB
mailnull 55305 0.0 0.0 26036 15564 - I 16:04 0:00.03 /usr/local/sbin/exim -Mc 1nItwa-000ENz-8e
mailnull 55439 0.0 0.0 26132 15632 - I 13:42 0:00.03 /usr/local/sbin/exim -Mc 1nIrjE-000EQ9-Ki
mailnull 55722 0.0 0.0 26132 15632 - I 08:17 0:00.03 /usr/local/sbin/exim -Mc 1nImfM-000EUi-Bf
mailnull 57242 0.0 0.0 25964 15504 - I Fri08 0:00.03 /usr/local/sbin/exim -Mc 1nIQR1-000EtE-2k
mailnull 57528 0.0 0.0 26000 15528 - I Fri11 0:00.03 /usr/local/sbin/exim -Mc 1nIT1d-000Exq-Qj
Running one of these manually always shows a similar behaviour. The connection is not correctly closed after the smtp transaction:
[root@mxout013:~] # exim -v -Mc 1nJZii-0005m9-Ms
LOG: MAIN
  Warning: purging the environment.
  Suggested action: use keep_environment.
delivering 1nJZii-0005m9-Ms
Connecting to relay03.remote.net [2001:abcd::157]:25 ... TFO mode connection attempt to 2001:abcd::157, 0 data
connected
  SMTP<< 220 relay03.remote.net ESMTP Postfix (Debian/GNU)
  SMTP>> EHLO mxout013.local.net
  SMTP<< 250-relay03.remote.net
         250-PIPELINING
  ---8<---
  SMTP>> STARTTLS
  SMTP<< 220 2.0.0 Ready to start TLS
  SMTP>> EHLO mxout013.local.net
  SMTP<< 250-relay03.remote.net
         250-PIPELINING
  ---8<---
  SMTP|> MAIL FROM:<xxxx> SIZE=30441
  SMTP|> RCPT TO:<xxxx>
         will write message using CHUNKING
  SMTP+> BDAT 3224
  SMTP<< 250 2.1.0 Ok
  SMTP<< 550 5.1.1 <xxxx>: Recipient address rejected: User unknown in relay recipient table
  SMTP<< 554 5.5.1 Error: no valid recipients
  SMTP+> QUIT
  SMTP(TLS shutdown)>>
  SMTP(shutdown)>>
  SMTP<< 221 2.0.0 Bye
--> Nothing happens any more but process keeps hanging
^C

The corresponding TCP connection can be found in netstat's output in state "FIN_WAIT_2". In fact, there is an unusually high number of connections in this state on this host, and attaching `truss` to a stuck process showed no output. Killing the TCP connection with `tcpdrop` causes the stuck process to resume and finish.
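For anyone wanting to check whether their own host is affected, here is a minimal Python sketch that counts FIN_WAIT_2 connections per remote SMTP peer. It assumes the FreeBSD `netstat -an -p tcp` column layout (proto, recv-q, send-q, local, foreign, state) with the port appended to the address after a final dot. The function name `fin_wait_2_peers` and the sample addresses are made up for illustration and are not taken from the affected host.

```python
from collections import Counter


def fin_wait_2_peers(netstat_output: str, remote_port: str = "25") -> Counter:
    """Count TCP connections in FIN_WAIT_2 per remote address.

    Assumes FreeBSD-style `netstat -an -p tcp` lines, where addresses are
    written as <address>.<port> (hypothetical helper, for illustration).
    """
    peers = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state.
        # Header lines and anything else with a different shape are skipped.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        if state != "FIN_WAIT_2":
            continue
        # The port is whatever follows the last dot of the foreign address.
        host, _, port = foreign.rpartition(".")
        if port == remote_port:
            peers[host] += 1
    return peers


# Illustrative input only (documentation address ranges).
SAMPLE = """\
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp6       0      0 2001:db8::10.40112     2001:db8::157.25       FIN_WAIT_2
tcp4       0      0 192.0.2.10.51234       198.51.100.7.25        FIN_WAIT_2
tcp4       0      0 192.0.2.10.51300       198.51.100.7.25        ESTABLISHED
tcp6       0      0 2001:db8::10.40200     2001:db8::157.25       FIN_WAIT_2
"""

counts = fin_wait_2_peers(SAMPLE)
```

On a live host the input would be the captured output of `netstat -an -p tcp`. A steadily growing count towards port 25 peers would match the symptom described above.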
The problem appears with different remote MX hosts as well as with IPv4 and IPv6 and is immediately resolved by downgrading back to 4.94.2.
Maybe this issue is related to the previous thread on this list.

Regards
Patrik
