Error: timeout exceeded (in reply to end of DATA command)
Hi All, I manage a local intranet mail server which collects mails from our local users and sends them all via our public mail hub server. Everything was fine until few weeks ago. Some small part of our emails (about 10%) hangs in mail's server queue with error: timeout exceeded (in reply to end of DATA command)). Some emails stay in queue untill timeout expires (3 days) and returns to a sender with error. There are emails which stay in queue for a few hours (with the same error) but after serveral retries finally leave our server and reach recepients. Noticed following facts: - every timeout error is: (in reply to end of DATA command), there are no other errors - problem does not depend on email size (there were delayed emails with size range from 1kb to 5mb) Turned off: tcp window scalling, tcp ecn, smtp_connection_cache_on_demand - without success. In my opinion problem is on the mail hub server but I have no proofs. Talking with mail hub server's administrator but he said, that no others have problems with mails and that must be a problem with my email server. Now I stuck with my undelivered emails. Best regards. Tom. Logs System: Debian Wheezy postfix ver 2.8.3-1 -Queue ID- --Size-- Arrival Time -Sender/Recipient--- 4A9409365 4130 Tue Jun 14 09:59:13 a...@aaa.aaa.aa (host mailhub.aaa.aaa[NN.NN.N.NN] said: 421 4.4.2 mailhub.aaa.aaa Error: timeout exceeded (in reply to end of DATA command)) postconf -n alias_database = hash:/etc/aliases alias_maps = hash:/etc/aliases append_dot_mydomain = no biff = no bounce_queue_lifetime = 2d config_directory = /etc/postfix content_filter = vscan:[127.0.0.1]:10024 delay_warning_time = 24h disable_dns_lookups = yes header_checks = regexp:/etc/postfix/header_checks home_mailbox = Maildir/ inet_interfaces = merlin, 127.0.0.1 mailbox_command = /usr/bin/procmail -a $EXTENSION DEFAULT=$HOME/Maildir/ MAILDIR=$HOME/Maildir mailbox_size_limit = 51200 maximal_queue_lifetime = 3d message_size_limit = 10240 mydestination = $myhostname, $myhostname.$mydomain, localhost, AAA mydomain = AAA.AAA.AAA myhostname = mynetworks = 127.0.0.0/8, NN.NN.NN.NN/24 myorigin = /etc/mailname notify_classes = bounce, 2bounce, policy, protocol, resource, software recipient_delimiter = + relayhost = smtp_host_lookup = native smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU) transport_maps = hash:/etc/postfix/transport, regexp:/etc/sympa/transport
Re: Error: timeout exceeded (in reply to end of DATA command)
Tomasz Iwanowski: Hi All, I manage a local intranet mail server which collects mails from our local users and sends them all via our public mail hub server. Everything was fine until few weeks ago. Some small part of our emails (about 10%) hangs in mail's server queue with error: timeout exceeded (in reply to end of DATA command)). Some emails stay in queue untill timeout expires (3 days) and returns to a sender with error. There are emails which stay in queue for a few hours (with the same error) but after serveral retries finally leave our server and reach recepients. Noticed following facts: - every timeout error is: (in reply to end of DATA command), there are no other errors - problem does not depend on email size (there were delayed emails with size range from 1kb to 5mb) Turned off: tcp window scalling, tcp ecn, smtp_connection_cache_on_demand - without success. In my opinion problem is on the mail hub server but I have no proofs. Talking with mail hub server's administrator but he said, that no others have problems with mails and that must be a problem with my email server. Now I stuck with my undelivered emails. Try collecting a tcpdump recording. http://www.postfix.org/DEBUG_README.html#sniffer Command: # tcpdump -s 0 -w /file/name host server-ip-address and port 25 After some time, kill -INT the tcpdump process. Look in the logfile for a session that breaks, and find that session in the tcpdump recording. # tcpdump -nr /file/name | less Note the client tcp port, then extract that session: # tcpdump -nr /file/name -w file/name2 port xxx Then, contact Victor or me off-list. Wietse
Re: Error: timeout exceeded (in reply to end of DATA command)
On Wed, Jun 15, 2011 at 10:38:44AM -0400, Wietse Venema wrote: Command: # tcpdump -s 0 -w /file/name host server-ip-address and port 25 After some time, kill -INT the tcpdump process. Look in the logfile for a session that breaks, and find that session in the tcpdump recording. # tcpdump -nr /file/name | less Note the client tcp port, then extract that session: # tcpdump -nr /file/name -w file/name2 port xxx The second tcpdump may also need the -s 0 option to retain the full TCP payload. -- Viktor.
Re: Error: timeout exceeded (in reply to end of DATA command)
It's nice for a change to work with someone who provides actual information, instead of that dork from last week who was just venting his opinions and not giving people a chance to help. In this case it looks like a Solaris TCP bug when sending data over a connection that suffers from heavy packet loss. Apparently, Solaris 10 stops retransmitting, and therefore the other side times out. I'm not wrapping the packets this time, so you will have to widen your window to read this message comfortably. Wietse The three-way TCP handshake show no wscale options, so we don't have to worry about borked firewalls mis-handling this: 09:16:40.786945 IP 213.207.90.2.59301 145.222.14.10.25: S 2466228028:2466228028(0) win 49640 mss 1460,nop,nop,sackOK 09:16:40.789806 IP 145.222.14.10.25 213.207.90.2.59301: S 3886146351:3886146351(0) ack 2466228029 win 5840 mss 1380 09:16:40.789829 IP 213.207.90.2.59301 145.222.14.10.25: . ack 1 win 49680 The handshake shows a round-trip time of 2.9ms. The receiver's MTU is smaller than 1460, which suggests that his packets are encapsulated in some other protocol. It does not matter for the problem at hand. Next is a segment from the middle of transmission. The sender has sent a full window up to byte 22233, but apparently there was major packet loss after byte 5673. 09:16:51.869847 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869861 IP 213.207.90.2.59301 145.222.14.10.25: P 20853:22233(1380) ack 137 win 49680 09:16:51.869874 IP 213.207.90.2.59301 145.222.14.10.25: . 5673:7053(1380) ack 137 win 49680 09:16:51.869976 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869977 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869978 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870110 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870111 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870376 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:56.216513 IP 213.207.90.2.59301 145.222.14.10.25: . 5673:7053(1380) ack 137 win 49680 09:16:56.260383 IP 145.222.14.10.25 213.207.90.2.59301: . ack 7053 win 19320 09:16:56.260414 IP 213.207.90.2.59301 145.222.14.10.25: . 7053:8433(1380) ack 137 win 49680 09:16:56.260421 IP 213.207.90.2.59301 145.222.14.10.25: . 8433:9813(1380) ack 137 win 49680 09:16:56.263577 IP 145.222.14.10.25 213.207.90.2.59301: . ack 8433 win 22080 09:16:56.263588 IP 213.207.90.2.59301 145.222.14.10.25: . 9813:11193(1380) ack 137 win 49680 09:16:56.263610 IP 213.207.90.2.59301 145.222.14.10.25: . 11193:12573(1380) ack 137 win 49680 09:16:56.263844 IP 145.222.14.10.25 213.207.90.2.59301: . ack 9813 win 24840 09:16:56.263855 IP 213.207.90.2.59301 145.222.14.10.25: . 12573:13953(1380) ack 137 win 49680 09:16:56.263865 IP 213.207.90.2.59301 145.222.14.10.25: . 13953:15333(1380) ack 137 win 49680 09:16:56.266641 IP 145.222.14.10.25 213.207.90.2.59301: P ack 9813 win 24840 09:16:56.266776 IP 145.222.14.10.25 213.207.90.2.59301: . ack 11193 win 28980 09:16:56.266777 IP 145.222.14.10.25 213.207.90.2.59301: P ack 11193 win 28980 09:16:56.266800 IP 213.207.90.2.59301 145.222.14.10.25: . 15333:16713(1380) ack 137 win 49680 Sequences like this repeat through the entire session. This network connection is so bad that your machine can send only 107656 bytes in 85 seconds or 1.27 kbyte/s. I notice there are many sequences like this: 09:16:51.869844 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869845 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869846 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869847 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 and this: 09:16:51.869976 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869977 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869978 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 and this: 09:17:20.597737 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 09:17:20.597738 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 09:17:20.597739 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 And many more. This is consistent with network congestion. These identical ACKs would have been sent several ms separated from each other (remember, the round-trip time is about 3ms). Apparently the ACKs were queued up behind other data that was clogging some pipe somewere, and so they ended up traveling together. Network congestion is also consistent with increased packet loss. And now for the surprise: after 09:18:16.515592 the sender stops retransmitting. That would be a sender TCP bug. The first thing that happens next is 30s later, at 09:18:46.512567, when the server reports a 421 timeout error and closes the connection. 09:18:16.512400 IP 213.207.90.2.59301
Re: Error: timeout exceeded (in reply to end of DATA command)
On Thu, 21 Jan 2010 10:30:54 -0500 (EST), wie...@porcupine.org (Wietse Venema) wrote: It's nice for a change to work with someone who provides actual information, instead of that dork from last week who was just venting his opinions and not giving people a chance to help. I don't know how one can help me if I don't supply info ;) In this case it looks like a Solaris TCP bug when sending data over a connection that suffers from heavy packet loss. Apparently, Solaris 10 stops retransmitting, and therefore the other side times out. Ok, this is useful help. I did reroute the message via a linux postfix server and now it is successfully delivered. I try to find out if I can solve the Solaris bug but at least our customer is happy now! I'm not wrapping the packets this time, so you will have to widen your window to read this message comfortably. With your explanation I'm able to read the tcp dump, one day I will fully understand it... Wietse The three-way TCP handshake show no wscale options, so we don't have to worry about borked firewalls mis-handling this: 09:16:40.786945 IP 213.207.90.2.59301 145.222.14.10.25: S 2466228028:2466228028(0) win 49640 mss 1460,nop,nop,sackOK 09:16:40.789806 IP 145.222.14.10.25 213.207.90.2.59301: S 3886146351:3886146351(0) ack 2466228029 win 5840 mss 1380 09:16:40.789829 IP 213.207.90.2.59301 145.222.14.10.25: . ack 1 win 49680 The handshake shows a round-trip time of 2.9ms. The receiver's MTU is smaller than 1460, which suggests that his packets are encapsulated in some other protocol. It does not matter for the problem at hand. Next is a segment from the middle of transmission. The sender has sent a full window up to byte 22233, but apparently there was major packet loss after byte 5673. 09:16:51.869847 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869861 IP 213.207.90.2.59301 145.222.14.10.25: P 20853:22233(1380) ack 137 win 49680 09:16:51.869874 IP 213.207.90.2.59301 145.222.14.10.25: . 5673:7053(1380) ack 137 win 49680 09:16:51.869976 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869977 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869978 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870110 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870111 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.870376 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:56.216513 IP 213.207.90.2.59301 145.222.14.10.25: . 5673:7053(1380) ack 137 win 49680 09:16:56.260383 IP 145.222.14.10.25 213.207.90.2.59301: . ack 7053 win 19320 09:16:56.260414 IP 213.207.90.2.59301 145.222.14.10.25: . 7053:8433(1380) ack 137 win 49680 09:16:56.260421 IP 213.207.90.2.59301 145.222.14.10.25: . 8433:9813(1380) ack 137 win 49680 09:16:56.263577 IP 145.222.14.10.25 213.207.90.2.59301: . ack 8433 win 22080 09:16:56.263588 IP 213.207.90.2.59301 145.222.14.10.25: . 9813:11193(1380) ack 137 win 49680 09:16:56.263610 IP 213.207.90.2.59301 145.222.14.10.25: . 11193:12573(1380) ack 137 win 49680 09:16:56.263844 IP 145.222.14.10.25 213.207.90.2.59301: . ack 9813 win 24840 09:16:56.263855 IP 213.207.90.2.59301 145.222.14.10.25: . 12573:13953(1380) ack 137 win 49680 09:16:56.263865 IP 213.207.90.2.59301 145.222.14.10.25: . 13953:15333(1380) ack 137 win 49680 09:16:56.266641 IP 145.222.14.10.25 213.207.90.2.59301: P ack 9813 win 24840 09:16:56.266776 IP 145.222.14.10.25 213.207.90.2.59301: . ack 11193 win 28980 09:16:56.266777 IP 145.222.14.10.25 213.207.90.2.59301: P ack 11193 win 28980 09:16:56.266800 IP 213.207.90.2.59301 145.222.14.10.25: . 15333:16713(1380) ack 137 win 49680 Sequences like this repeat through the entire session. This network connection is so bad that your machine can send only 107656 bytes in 85 seconds or 1.27 kbyte/s. I notice there are many sequences like this: 09:16:51.869844 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869845 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869846 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869847 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 and this: 09:16:51.869976 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869977 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 09:16:51.869978 IP 145.222.14.10.25 213.207.90.2.59301: P ack 5673 win 16560 and this: 09:17:20.597737 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 09:17:20.597738 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 09:17:20.597739 IP 145.222.14.10.25 213.207.90.2.59301: P ack 40173 win 32767 And many more. This is consistent with network congestion. These identical ACKs would have been sent several ms separated from each other
Re: Error: timeout exceeded (in reply to end of DATA command)
Zitat von Wietse Venema wie...@porcupine.org: It's nice for a change to work with someone who provides actual information, instead of that dork from last week who was just venting his opinions and not giving people a chance to help. In this case it looks like a Solaris TCP bug when sending data over a connection that suffers from heavy packet loss. Apparently, Solaris 10 stops retransmitting, and therefore the other side times out. Without following the thread closely and without examining the dump data, it reminds me of an outstanding problem we have with one site showing the same timeout after data for mail exceeding around 150kb. As far as i have found out it seems to be a expensive firewall running on solaris with extended protection on, munging with SACK and window-scaling at some point. The problem manifests in the dump with many retransmits because of missing segments (the remote has 10x bandwidth than we have) which were reordered by SACK and the remote end totaly ditching one missing segment at some point and sending over and over again some later data until transfer totaly stops and the timeout jump in. I can provide raw dump data in private for such a case. Unfortunately the remote side was not that interested to solve the problem so i give up on this. Regards Andreas smime.p7s Description: S/MIME Signatur
Error: timeout exceeded (in reply to end of DATA command)
Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? Thanks, Martijn -- YoungGuns Kasteleinenkampweg 7b 5222 AX 's-Hertogenbosch T. 073 623 56 40 F. 073 623 56 39 www.youngguns.nl KvK 18076568
Re: Error: timeout exceeded (in reply to end of DATA command)
On Wed, 20 Jan 2010 10:56:39 +0100, Martijn de Munnik mart...@youngguns.nl wrote: Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? I tried the http://ftp.nluug.nl/mail/postfix/faq.html#timeouts Cisco PIX workaround but that doesn't make any difference? I think the remote site is running PIX: Connected to mx2.amsterdam.nl (145.222.14.10). Escape character is '^]'. 220 ** postconf -n address_verify_map = btree:${data_directory}/verify alias_maps = hash:/opt/csw/etc/postfix/maps/aliases body_checks = regexp:/opt/csw/etc/postfix/maps/body_checks broken_sasl_auth_clients = yes command_directory = /opt/csw/sbin config_directory = /etc/postfix content_filter = amavisfeed:localhost:10024 daemon_directory = /opt/csw/libexec/postfix data_directory = /opt/csw/var/lib/postfix default_database_type = hash delay_warning_time = 4h disable_vrfy_command = yes header_checks = regexp:/opt/csw/etc/postfix/maps/header_checks home_mailbox = Maildir/ html_directory = /opt/csw/share/doc/postfix/html inet_interfaces = all mailbox_command = /opt/csw/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME mailbox_size_limit = 0 mailq_path = /opt/csw/bin/mailq manpage_directory = /opt/csw/share/man maximal_backoff_time = 8000s maximal_queue_lifetime = 7d message_size_limit = 52428800 mime_header_checks = regexp:/opt/csw/etc/postfix/maps/mime_header_checks minimal_backoff_time = 1000s mydestination = $myhostname, localhost.$mydomain myhostname = stevie.youngguns.nl mynetworks_style = host myorigin = $myhostname newaliases_path = /opt/csw/bin/newaliases readme_directory = /opt/csw/share/doc/postfix/README_FILES receive_override_options = no_address_mappings recipient_delimiter = + relay_domains = slagenlandwonen.nl, wfcommunicatie.nl, gooischebrink.com, interjute.nl, melamo.nl, fair-play.nl, loopbaankamer.nl, ospl.nl, ospl.de, printcontrol.nl, dankers-schilderwerken.nl, promonta.nl, interim-denbosch.nl, vanherpt.biz, rodersana.nl relayhost = sample_directory = /opt/csw/share/doc/postfix/samples sendmail_path = /opt/csw/sbin/sendmail smtp_bind_address = 213.207.90.2 smtp_helo_timeout = 60s smtp_pix_workaround_delay_time = 10s smtp_send_xforward_command = yes smtp_skip_quit_response = yes smtp_tls_loglevel = 1 smtp_tls_security_level = may smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache smtpd_banner = $myhostname ESMTP smtpd_client_connection_count_limit = 10 smtpd_data_restrictions = reject_unauth_pipelining smtpd_delay_reject = yes smtpd_hard_error_limit = 12 smtpd_helo_required = yes smtpd_recipient_limit = 100 smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, check_client_access hash:/opt/csw/etc/postfix/maps/relay_access, reject_non_fqdn_recipient, reject_non_fqdn_sender, reject_unknown_sender_domain, reject_unauth_destination, reject_non_fqdn_helo_hostname, reject_rbl_client virbl.dnsbl.bit.nl, check_policy_service inet:127.0.0.1:12525, check_sender_mx_access cidr:/opt/csw/etc/postfix/maps/bogon_networks, check_client_access cidr:/opt/csw/etc/postfix/maps/drop, check_sender_mx_access cidr:/opt/csw/etc/postfix/maps/drop, check_sender_ns_access cidr:/opt/csw/etc/postfix/maps/drop, reject_unverified_recipient, check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_header, check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_permit, check_policy_service inet:127.0.0.1:10023 smtpd_sasl_auth_enable = yes smtpd_sasl_authenticated_header = yes smtpd_sasl_local_domain = $myhostname smtpd_sasl_path = private/auth smtpd_sasl_security_options = noanonymous smtpd_sasl_type = dovecot smtpd_soft_error_limit = 3 smtpd_tls_cert_file = /home/yghosting/ssl/secure-youngguns-nl.pem smtpd_tls_key_file = /home/yghosting/ssl/secure-youngguns-nl.key smtpd_tls_loglevel = 1 smtpd_tls_security_level = may smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache soft_bounce = no strict_rfc821_envelopes = yes tls_random_source = dev:/dev/urandom transport_maps = hash:/opt/csw/etc/postfix/maps/transport unknown_address_reject_code = 550 unknown_hostname_reject_code = 550 unknown_local_recipient_reject_code = 550 unverified_recipient_reject_code = 550 virtual_alias_maps = hash:/opt/csw/etc/postfix/maps/virtual Thanks, Martijn -- YoungGuns Kasteleinenkampweg 7b 5222 AX 's-Hertogenbosch T. 073 623 56 40 F. 073 623 56 39 www.youngguns.nl KvK 18076568
Re: Error: timeout exceeded (in reply to end of DATA command)
On Wed, 20 Jan 2010 11:10:50 +0100, Martijn de Munnik mart...@youngguns.nl wrote: On Wed, 20 Jan 2010 10:56:39 +0100, Martijn de Munnik mart...@youngguns.nl wrote: Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) I also found a mail in the queue to mx-cluster1.one.com with exactly the same problem, this mail is also ~600kb. This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? I tried the http://ftp.nluug.nl/mail/postfix/faq.html#timeouts Cisco PIX workaround but that doesn't make any difference? I think the remote site is running PIX: Connected to mx2.amsterdam.nl (145.222.14.10). Escape character is '^]'. 220 ** postconf -n address_verify_map = btree:${data_directory}/verify alias_maps = hash:/opt/csw/etc/postfix/maps/aliases body_checks = regexp:/opt/csw/etc/postfix/maps/body_checks broken_sasl_auth_clients = yes command_directory = /opt/csw/sbin config_directory = /etc/postfix content_filter = amavisfeed:localhost:10024 daemon_directory = /opt/csw/libexec/postfix data_directory = /opt/csw/var/lib/postfix default_database_type = hash delay_warning_time = 4h disable_vrfy_command = yes header_checks = regexp:/opt/csw/etc/postfix/maps/header_checks home_mailbox = Maildir/ html_directory = /opt/csw/share/doc/postfix/html inet_interfaces = all mailbox_command = /opt/csw/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME mailbox_size_limit = 0 mailq_path = /opt/csw/bin/mailq manpage_directory = /opt/csw/share/man maximal_backoff_time = 8000s maximal_queue_lifetime = 7d message_size_limit = 52428800 mime_header_checks = regexp:/opt/csw/etc/postfix/maps/mime_header_checks minimal_backoff_time = 1000s mydestination = $myhostname, localhost.$mydomain myhostname = stevie.youngguns.nl mynetworks_style = host myorigin = $myhostname newaliases_path = /opt/csw/bin/newaliases readme_directory = /opt/csw/share/doc/postfix/README_FILES receive_override_options = no_address_mappings recipient_delimiter = + relay_domains = slagenlandwonen.nl, wfcommunicatie.nl, gooischebrink.com, interjute.nl, melamo.nl, fair-play.nl, loopbaankamer.nl, ospl.nl, ospl.de, printcontrol.nl, dankers-schilderwerken.nl, promonta.nl, interim-denbosch.nl, vanherpt.biz, rodersana.nl relayhost = sample_directory = /opt/csw/share/doc/postfix/samples sendmail_path = /opt/csw/sbin/sendmail smtp_bind_address = 213.207.90.2 smtp_helo_timeout = 60s smtp_pix_workaround_delay_time = 10s smtp_send_xforward_command = yes smtp_skip_quit_response = yes smtp_tls_loglevel = 1 smtp_tls_security_level = may smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache smtpd_banner = $myhostname ESMTP smtpd_client_connection_count_limit = 10 smtpd_data_restrictions = reject_unauth_pipelining smtpd_delay_reject = yes smtpd_hard_error_limit = 12 smtpd_helo_required = yes smtpd_recipient_limit = 100 smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, check_client_access hash:/opt/csw/etc/postfix/maps/relay_access, reject_non_fqdn_recipient, reject_non_fqdn_sender, reject_unknown_sender_domain, reject_unauth_destination, reject_non_fqdn_helo_hostname, reject_rbl_client virbl.dnsbl.bit.nl, check_policy_service inet:127.0.0.1:12525, check_sender_mx_access cidr:/opt/csw/etc/postfix/maps/bogon_networks, check_client_access cidr:/opt/csw/etc/postfix/maps/drop, check_sender_mx_access cidr:/opt/csw/etc/postfix/maps/drop, check_sender_ns_access cidr:/opt/csw/etc/postfix/maps/drop, reject_unverified_recipient, check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_header, check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_permit, check_policy_service inet:127.0.0.1:10023 smtpd_sasl_auth_enable = yes smtpd_sasl_authenticated_header = yes smtpd_sasl_local_domain = $myhostname smtpd_sasl_path = private/auth smtpd_sasl_security_options = noanonymous smtpd_sasl_type = dovecot smtpd_soft_error_limit = 3 smtpd_tls_cert_file = /home/yghosting/ssl/secure-youngguns-nl.pem smtpd_tls_key_file = /home/yghosting/ssl/secure-youngguns-nl.key smtpd_tls_loglevel = 1 smtpd_tls_security_level = may smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache soft_bounce = no strict_rfc821_envelopes = yes tls_random_source = dev:/dev/urandom transport_maps = hash:/opt/csw/etc/postfix/maps/transport unknown_address_reject_code = 550 unknown_hostname_reject_code = 550 unknown_local_recipient_reject_code = 550 unverified_recipient_reject_code = 550 virtual_alias_maps = hash:/opt/csw/etc/postfix/maps/virtual
Re: Error: timeout exceeded (in reply to end of DATA command)
On 2010-01-20 Martijn de Munnik wrote: I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? AFAICS: slow connections, high connection concurrency (leaving little bandwidth per connection), proxy filter or policy service taking too long for checking the mail, etc. However, since the timeout occurs on the remote side, you should contact the administrator of that server about the issue. Regards Ansgar Wiechers -- Abstractions save us time working, but they don't save us time learning. --Joel Spolsky
Re: Error: timeout exceeded (in reply to end of DATA command)
Martijn de Munnik: Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? Record a tcpdump trace. The way the session fails will indicate the kind of problem (MTU, Window scaling, and so on). http://www.postfix.org/DEBUG_README.html Wietse
Re: Error: timeout exceeded (in reply to end of DATA command)
On Wed, 20 Jan 2010 07:20:01 -0500 (EST), wie...@porcupine.org (Wietse Venema) wrote: Martijn de Munnik: Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? Record a tcpdump trace. The way the session fails will indicate the kind of problem (MTU, Window scaling, and so on). http://www.postfix.org/DEBUG_README.html Wietse Ok, I tried that and I'm not really sure where to look for. I opened the tcpdump file in wireshark and there are a lot of warnings and notes in the file. -- Notes: Duplicate ACK(#1) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#2) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#3) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#4) [145.222.14.10 - 213.207.90.2] . . . Duplicate ACK(#44) [145.222.14.10 - 213.207.90.2] Retransmission (suspected) [213.207.90.2 - 145.222.14.10] Warnings: Fast retransmission (suspected) [213.207.90.2 - 145.222.14.10] Out-Of-Order segment [213.207.90.2 - 145.222.14.10] -- This is abracadabra for me ;) Martijn -- YoungGuns Kasteleinenkampweg 7b 5222 AX 's-Hertogenbosch T. 073 623 56 40 F. 073 623 56 39 www.youngguns.nl KvK 18076568
Re: Error: timeout exceeded (in reply to end of DATA command)
Martijn de Munnik: On Wed, 20 Jan 2010 07:20:01 -0500 (EST), wie...@porcupine.org (Wietse Venema) wrote: Martijn de Munnik: Hi list, I have a problem with delivering mail to a host and get this error: host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl Error: timeout exceeded (in reply to end of DATA command) This error only seems to occur with 'large' mails. Currently I have a mail of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix issue on our site but an issue with the mailserver on the other site. What can cause such issues? Record a tcpdump trace. The way the session fails will indicate the kind of problem (MTU, Window scaling, and so on). http://www.postfix.org/DEBUG_README.html Wietse Ok, I tried that and I'm not really sure where to look for. I opened the tcpdump file in wireshark and there are a lot of warnings and notes in the file. -- Notes: Duplicate ACK(#1) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#2) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#3) [145.222.14.10 - 213.207.90.2] Duplicate ACK(#4) [145.222.14.10 - 213.207.90.2] . . . Duplicate ACK(#44) [145.222.14.10 - 213.207.90.2] Retransmission (suspected) [213.207.90.2 - 145.222.14.10] Warnings: Fast retransmission (suspected) [213.207.90.2 - 145.222.14.10] Out-Of-Order segment [213.207.90.2 - 145.222.14.10] -- This is abracadabra for me ;) If you can make the tcpdump -nr /file/name output available then people who understand TCP/IP can look at it. Wietse
Re: Error: timeout exceeded (in reply to end of DATA command)
Here's the TCP initial handshake: 17:30:44.951789 IP 213.207.90.2.48147 145.222.14.10.25: S 50514820:50514820(0) win 49640 mss 1460,nop,wscale 0,nop,nop,sackOK 17:30:44.954496 IP 145.222.14.10.25 213.207.90.2.48147: S 4148480248:4148480248(0) ack 50514821 win 5840 mss 1380,nop,wscale 2 17:30:44.954519 IP 213.207.90.2.48147 145.222.14.10.25: . ack 1 win 49680 Later, as the receiver processes the network packets, it acknowledges the data received, sends its receive window size (how much more it is willing to receive). Above, with wscale 2 the server at 145.222.14.10 announces that its TCP receive window value needs to be multiplied by a factor of 4 (binary number shifted left by 2). But, there is a broken router in the path that does not understand window scaling. Here is an example of what gets f-ed up: 17:30:45.41 IP 213.207.90.2.48147 145.222.14.10.25: . 20853:22233(1380) ack 137 win 49680 17:30:45.412230 IP 213.207.90.2.48147 145.222.14.10.25: . 22233:23613(1380) ack 137 win 49680 17:30:45.412249 IP 213.207.90.2.48147 145.222.14.10.25: P 23613:24993(1380) ack 137 win 49680 17:30:45.412747 IP 145.222.14.10.25 213.207.90.2.48147: P ack 8433 win 5800 17:30:45.412748 IP 145.222.14.10.25 213.207.90.2.48147: P ack 8433 win 5800 17:30:45.412749 IP 145.222.14.10.25 213.207.90.2.48147: P ack 8433 win 5800 The receiver says they can receive bytes 8433-31633, but the broken router does not know that 5800 needs to be multiplied by 4, and it thinks the receiver can receive only bytes 8433-14233. The broken router then throws away the bytes with higher sequence numbers than 14233. Workaround: turn off window scaling support on the sender's kernel. Wietse
Re: Error: timeout exceeded (in reply to end of DATA command)
On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote: The broken router then throws away the bytes with higher sequence numbers than 14233. Workaround: turn off window scaling support on the sender's kernel. This problem is sufficiently common, that on Linux MTAs I always add: net.ipv4.tcp_window_scaling = 0 to sysctl.conf. Adjust for other systems as necessary. This hurts long-haul throughput, but email tolerates latency, provided most of your outbound traffic is not a high-bandwidth channel to Mars (but then you would not be using TCP anyway...) -- Viktor. Disclaimer: off-list followups get on-list replies or get ignored. Please do not ignore the Reply-To header. To unsubscribe from the postfix-users list, visit http://www.postfix.org/lists.html or click the link below: mailto:majord...@postfix.org?body=unsubscribe%20postfix-users If my response solves your problem, the best way to thank me is to not send an it worked, thanks follow-up. If you must respond, please put It worked, thanks in the Subject so I can delete these quickly.
Re: Error: timeout exceeded (in reply to end of DATA command)
On Jan 20, 2010, at 9:28 PM, Victor Duchovni wrote: On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote: The broken router then throws away the bytes with higher sequence numbers than 14233. Workaround: turn off window scaling support on the sender's kernel. This problem is sufficiently common, that on Linux MTAs I always add: net.ipv4.tcp_window_scaling = 0 I'm running Solaris 10 x86 and I did sudo ndd -set /dev/tcp tcp_wscale_always 0 before I did this the value was 1. After I did this I flushed the queue but the messages stay stuck in the queue with the same problem. I'm not sure this is the right kernel parameter for Solaris? /etc/system has no specific setting for tcp, so everything is default Solaris 10. Wietse, the broken router you mentioned, could that be a Cisco PIX on the receivers site? Jan 20 22:58:43 stevie.youngguns.nl postfix/smtp[18765]: [ID 197553 mail.info] 8A5553BA0C: enabling PIX workarounds: disable_esmtp delay_dotcrlf for mx2.amsterdam.nl[145.222.14.10]:25 Thanks, Martijn to sysctl.conf. Adjust for other systems as necessary. This hurts long-haul throughput, but email tolerates latency, provided most of your outbound traffic is not a high-bandwidth channel to Mars (but then you would not be using TCP anyway...) -- Viktor. Disclaimer: off-list followups get on-list replies or get ignored. Please do not ignore the Reply-To header. To unsubscribe from the postfix-users list, visit http://www.postfix.org/lists.html or click the link below: mailto:majord...@postfix.org?body=unsubscribe%20postfix-users If my response solves your problem, the best way to thank me is to not send an it worked, thanks follow-up. If you must respond, please put It worked, thanks in the Subject so I can delete these quickly.
Re: Error: timeout exceeded (in reply to end of DATA command)
Martijn de Munnik: On Jan 20, 2010, at 9:28 PM, Victor Duchovni wrote: On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote: The broken router then throws away the bytes with higher sequence numbers than 14233. Workaround: turn off window scaling support on the sender's kernel. This problem is sufficiently common, that on Linux MTAs I always add: net.ipv4.tcp_window_scaling = 0 I'm running Solaris 10 x86 and I did sudo ndd -set /dev/tcp tcp_wscale_always 0 before I did this the value was 1. After I did this I flushed the queue but the messages stay stuck in the queue with the same problem. I'm not sure this is the right kernel parameter for Solaris? You can do ndd /dev/tcp \? to find out what parameters are supported. On my Solaris9 and Solaris10 test boxes it is called tcp_wscale_always. According to Solaris10 documentation: When this parameter is enabled, which is the default setting [since Solaris10], TCP always sends a SYN segment with the window scale option, even if the window scale option value is 0. So, please make another recording and report if setting this to 0 causes Solaris to stop sending wscale 0 options. If the sender does not send a wscale option, then the receiver is not supposed to send a wscale option, either. Your tcpdump recording should confirm that. /etc/system has no specific setting for tcp, so everything is default Solaris 10. Wietse, the broken router you mentioned, could that be a Cisco PIX on the receivers site? I maintain no list of CISCO PIX firewall bugs. It can also be a router at your end. Never exclude that possibility. Jan 20 22:58:43 stevie.youngguns.nl postfix/smtp[18765]: [ID 197553 mail.info] 8A5553BA0C: enabling PIX workarounds: disable_esmtp delay_dotcrlf for mx2.amsterdam.nl[145.222.14.10]:25 That's a PIX bug workaround for the fact that their engone parses only one packet at a time and therefore misses SMTP commands and CRLF.CRLF when these sit across packet boundaries. Wietse
Re: Error: timeout exceeded (in reply to end of DATA command)
Wietse Venema: You can do ndd /dev/tcp \? to find out what parameters are supported. On my Solaris9 and Solaris10 test boxes it is called tcp_wscale_always. According to Solaris10 documentation: When this parameter is enabled, which is the default setting [since Solaris10], TCP always sends a SYN segment with the window scale option, even if the window scale option value is 0. With the default tcp_wscale_always setting, making a connection from a Solaris 10 box to FreeBSD 8.0: 20:13:59.808828 IP 168.100.189.17.32799 168.100.189.10.25: Flags [S], seq 11835, win 49640, options [mss 1460,nop,wscale 0,nop,nop,sackOK], length 0 20:13:59.808892 IP 168.100.189.10.25 168.100.189.17.32799: Flags [S.], seq 538094055, ack 11836, win 65535, options [mss 1460,nop,wscale 3,sackOK,eol], length 0 20:13:59.809327 IP 168.100.189.17.32799 168.100.189.10.25: Flags [.], ack 1, win 49640, length 0 Same system with tcp_wscale_always set to zero: 20:14:52.736959 IP 168.100.189.17.32800 168.100.189.10.25: Flags [S], seq 131413865, win 49640, options [mss 1460,nop,nop,sackOK], length 0 20:14:52.737016 IP 168.100.189.10.25 168.100.189.17.32800: Flags [S.], seq 3072042607, ack 131413866, win 65535, options [mss 1460,sackOK,eol], length 0 20:14:52.737581 IP 168.100.189.17.32800 168.100.189.10.25: Flags [.], ack 1, win 49640, length 0 Thus, Solaris 10 does not send wscale, and neither should the remote server. If this does not make your mail move, then you need to collect another tcpdump recording. In that case mail was not moving because of multiple problems. Wietse