Error: timeout exceeded (in reply to end of DATA command)

2011-06-15 Thread Tomasz Iwanowski

Hi All,

I manage a local intranet mail server which collects mails from our local users
and sends them all via our public mail hub server.

Everything was fine until few weeks ago. Some small part of our emails (about 
10%) hangs
in mail's server queue with error: timeout exceeded (in reply to end of DATA 
command)).
Some emails stay in queue untill timeout expires (3 days) and returns to a 
sender with error.
There are emails which stay in queue for a few hours (with the same error)
but after serveral retries finally leave our server and reach recepients.

Noticed following facts:
- every timeout error is: (in reply to end of DATA command), there are no other 
errors
- problem does not depend on email size (there were delayed emails with size 
range from 1kb to 5mb)

Turned off: tcp window scalling, tcp ecn, smtp_connection_cache_on_demand - 
without success.

In my opinion problem is on the mail hub server but I have no proofs.
Talking with mail hub server's administrator but he said, that no others have 
problems with mails
and that must be a problem with my email server.
Now I stuck with my undelivered emails.


Best regards.
Tom.



Logs

System: Debian Wheezy
postfix ver 2.8.3-1



-Queue ID- --Size-- Arrival Time -Sender/Recipient---
4A9409365 4130 Tue Jun 14 09:59:13  a...@aaa.aaa.aa
(host mailhub.aaa.aaa[NN.NN.N.NN] said: 421 4.4.2 mailhub.aaa.aaa Error: timeout exceeded (in reply to end of 
DATA command))



postconf -n
alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
bounce_queue_lifetime = 2d
config_directory = /etc/postfix
content_filter = vscan:[127.0.0.1]:10024
delay_warning_time = 24h
disable_dns_lookups = yes
header_checks = regexp:/etc/postfix/header_checks
home_mailbox = Maildir/
inet_interfaces = merlin, 127.0.0.1
mailbox_command = /usr/bin/procmail -a $EXTENSION DEFAULT=$HOME/Maildir/ 
MAILDIR=$HOME/Maildir
mailbox_size_limit = 51200
maximal_queue_lifetime = 3d
message_size_limit = 10240
mydestination = $myhostname, $myhostname.$mydomain, localhost, AAA
mydomain = AAA.AAA.AAA
myhostname = 
mynetworks = 127.0.0.0/8, NN.NN.NN.NN/24
myorigin = /etc/mailname
notify_classes = bounce, 2bounce, policy, protocol, resource, software
recipient_delimiter = +
relayhost =
smtp_host_lookup = native
smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
transport_maps = hash:/etc/postfix/transport, regexp:/etc/sympa/transport



Re: Error: timeout exceeded (in reply to end of DATA command)

2011-06-15 Thread Wietse Venema
Tomasz Iwanowski:
 Hi All,
 
 I manage a local intranet mail server which collects mails from our local 
 users
 and sends them all via our public mail hub server.
 
 Everything was fine until few weeks ago. Some small part of our emails (about 
 10%) hangs
 in mail's server queue with error: timeout exceeded (in reply to end of DATA 
 command)).
 Some emails stay in queue untill timeout expires (3 days) and returns to a 
 sender with error.
 There are emails which stay in queue for a few hours (with the same error)
 but after serveral retries finally leave our server and reach recepients.
 
 Noticed following facts:
 - every timeout error is: (in reply to end of DATA command), there are no 
 other errors
 - problem does not depend on email size (there were delayed emails with size 
 range from 1kb to 5mb)
 
 Turned off: tcp window scalling, tcp ecn, smtp_connection_cache_on_demand - 
 without success.
 
 In my opinion problem is on the mail hub server but I have no proofs.
 Talking with mail hub server's administrator but he said, that no others have 
 problems with mails
 and that must be a problem with my email server.
 Now I stuck with my undelivered emails.

Try collecting a tcpdump recording.

http://www.postfix.org/DEBUG_README.html#sniffer

Command:
# tcpdump -s 0 -w /file/name host server-ip-address and port 25

After some time, kill -INT the tcpdump process.

Look in the logfile for a session that breaks, and find that session
in the tcpdump recording.

# tcpdump -nr /file/name | less

Note the client tcp port, then extract that session:

# tcpdump -nr /file/name -w file/name2 port xxx

Then, contact Victor or me off-list.

Wietse


Re: Error: timeout exceeded (in reply to end of DATA command)

2011-06-15 Thread Victor Duchovni
On Wed, Jun 15, 2011 at 10:38:44AM -0400, Wietse Venema wrote:

 Command:
 # tcpdump -s 0 -w /file/name host server-ip-address and port 25
 
 After some time, kill -INT the tcpdump process.
 
 Look in the logfile for a session that breaks, and find that session
 in the tcpdump recording.
 
 # tcpdump -nr /file/name | less
 
 Note the client tcp port, then extract that session:
 
 # tcpdump -nr /file/name -w file/name2 port xxx

The second tcpdump may also need the -s 0 option to retain the
full TCP payload.

-- 
Viktor.


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-21 Thread Wietse Venema
It's nice for a change to work with someone who provides actual
information, instead of that dork from last week who was just
venting his opinions and not giving people a chance to help.

In this case it looks like a Solaris TCP bug when sending data over
a connection that suffers from heavy packet loss.  Apparently,
Solaris 10 stops retransmitting, and therefore the other side times
out.

I'm not wrapping the packets this time, so you will have to
widen your window to read this message comfortably.

Wietse

The three-way TCP handshake show no wscale options, so we don't
have to worry about borked firewalls mis-handling this:

09:16:40.786945 IP 213.207.90.2.59301  145.222.14.10.25: S 
2466228028:2466228028(0) win 49640 mss 1460,nop,nop,sackOK
09:16:40.789806 IP 145.222.14.10.25  213.207.90.2.59301: S 
3886146351:3886146351(0) ack 2466228029 win 5840 mss 1380
09:16:40.789829 IP 213.207.90.2.59301  145.222.14.10.25: . ack 1 win 49680

The handshake shows a round-trip time of 2.9ms. The receiver's MTU
is smaller than 1460, which suggests that his packets are encapsulated
in some other protocol. It does not matter for the problem at hand.

Next is a segment from the middle of transmission. The sender has
sent a full window up to byte 22233, but apparently there was major
packet loss after byte 5673.

09:16:51.869847 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869861 IP 213.207.90.2.59301  145.222.14.10.25: P 20853:22233(1380) 
ack 137 win 49680
09:16:51.869874 IP 213.207.90.2.59301  145.222.14.10.25: . 5673:7053(1380) ack 
137 win 49680
09:16:51.869976 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869977 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869978 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.870110 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.870111 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.870376 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:56.216513 IP 213.207.90.2.59301  145.222.14.10.25: . 5673:7053(1380) ack 
137 win 49680
09:16:56.260383 IP 145.222.14.10.25  213.207.90.2.59301: . ack 7053 win 19320
09:16:56.260414 IP 213.207.90.2.59301  145.222.14.10.25: . 7053:8433(1380) ack 
137 win 49680
09:16:56.260421 IP 213.207.90.2.59301  145.222.14.10.25: . 8433:9813(1380) ack 
137 win 49680
09:16:56.263577 IP 145.222.14.10.25  213.207.90.2.59301: . ack 8433 win 22080
09:16:56.263588 IP 213.207.90.2.59301  145.222.14.10.25: . 9813:11193(1380) 
ack 137 win 49680
09:16:56.263610 IP 213.207.90.2.59301  145.222.14.10.25: . 11193:12573(1380) 
ack 137 win 49680
09:16:56.263844 IP 145.222.14.10.25  213.207.90.2.59301: . ack 9813 win 24840
09:16:56.263855 IP 213.207.90.2.59301  145.222.14.10.25: . 12573:13953(1380) 
ack 137 win 49680
09:16:56.263865 IP 213.207.90.2.59301  145.222.14.10.25: . 13953:15333(1380) 
ack 137 win 49680
09:16:56.266641 IP 145.222.14.10.25  213.207.90.2.59301: P ack 9813 win 24840
09:16:56.266776 IP 145.222.14.10.25  213.207.90.2.59301: . ack 11193 win 28980
09:16:56.266777 IP 145.222.14.10.25  213.207.90.2.59301: P ack 11193 win 28980
09:16:56.266800 IP 213.207.90.2.59301  145.222.14.10.25: . 15333:16713(1380) 
ack 137 win 49680

Sequences like this repeat through the entire session. This network
connection is so bad that your machine can send only 107656 bytes
in 85 seconds or 1.27 kbyte/s.

I notice there are many sequences like this:

09:16:51.869844 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869845 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869846 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869847 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560

and this:

09:16:51.869976 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869977 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560
09:16:51.869978 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win 16560

and this:

09:17:20.597737 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173 win 32767
09:17:20.597738 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173 win 32767
09:17:20.597739 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173 win 32767

And many more.  This is consistent with network congestion. These
identical ACKs would have been sent several ms separated from each
other (remember, the round-trip time is about 3ms). Apparently the
ACKs were queued up behind other data that was clogging some pipe
somewere, and so they ended up traveling together.

Network congestion is also consistent with increased packet loss.

And now for the surprise: after 09:18:16.515592 the sender stops
retransmitting. That would be a sender TCP bug.

The first thing that happens next is 30s later, at 09:18:46.512567,
when the server reports a 421 timeout error and closes the connection.

09:18:16.512400 IP 213.207.90.2.59301  

Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-21 Thread Martijn de Munnik
On Thu, 21 Jan 2010 10:30:54 -0500 (EST), wie...@porcupine.org (Wietse
Venema) wrote:
 It's nice for a change to work with someone who provides actual
 information, instead of that dork from last week who was just
 venting his opinions and not giving people a chance to help.

I don't know how one can help me if I don't supply info ;)
 
 In this case it looks like a Solaris TCP bug when sending data over
 a connection that suffers from heavy packet loss.  Apparently,
 Solaris 10 stops retransmitting, and therefore the other side times
 out.

Ok, this is useful help. I did reroute the message via a linux postfix
server and now it is successfully delivered. I try to find out if I can
solve the Solaris bug but at least our customer is happy now!
 
 I'm not wrapping the packets this time, so you will have to
 widen your window to read this message comfortably.

With your explanation I'm able to read the tcp dump, one day I will fully
understand it...
 
   Wietse
 
 The three-way TCP handshake show no wscale options, so we don't
 have to worry about borked firewalls mis-handling this:
 
 09:16:40.786945 IP 213.207.90.2.59301  145.222.14.10.25: S
 2466228028:2466228028(0) win 49640 mss 1460,nop,nop,sackOK
 09:16:40.789806 IP 145.222.14.10.25  213.207.90.2.59301: S
 3886146351:3886146351(0) ack 2466228029 win 5840 mss 1380
 09:16:40.789829 IP 213.207.90.2.59301  145.222.14.10.25: . ack 1 win
49680
 
 The handshake shows a round-trip time of 2.9ms. The receiver's MTU
 is smaller than 1460, which suggests that his packets are encapsulated
 in some other protocol. It does not matter for the problem at hand.
 
 Next is a segment from the middle of transmission. The sender has
 sent a full window up to byte 22233, but apparently there was major
 packet loss after byte 5673.
 
 09:16:51.869847 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869861 IP 213.207.90.2.59301  145.222.14.10.25: P
 20853:22233(1380) ack 137 win 49680
 09:16:51.869874 IP 213.207.90.2.59301  145.222.14.10.25: .
 5673:7053(1380) ack 137 win 49680
 09:16:51.869976 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869977 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869978 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.870110 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.870111 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.870376 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:56.216513 IP 213.207.90.2.59301  145.222.14.10.25: .
 5673:7053(1380) ack 137 win 49680
 09:16:56.260383 IP 145.222.14.10.25  213.207.90.2.59301: . ack 7053 win
 19320
 09:16:56.260414 IP 213.207.90.2.59301  145.222.14.10.25: .
 7053:8433(1380) ack 137 win 49680
 09:16:56.260421 IP 213.207.90.2.59301  145.222.14.10.25: .
 8433:9813(1380) ack 137 win 49680
 09:16:56.263577 IP 145.222.14.10.25  213.207.90.2.59301: . ack 8433 win
 22080
 09:16:56.263588 IP 213.207.90.2.59301  145.222.14.10.25: .
 9813:11193(1380) ack 137 win 49680
 09:16:56.263610 IP 213.207.90.2.59301  145.222.14.10.25: .
 11193:12573(1380) ack 137 win 49680
 09:16:56.263844 IP 145.222.14.10.25  213.207.90.2.59301: . ack 9813 win
 24840
 09:16:56.263855 IP 213.207.90.2.59301  145.222.14.10.25: .
 12573:13953(1380) ack 137 win 49680
 09:16:56.263865 IP 213.207.90.2.59301  145.222.14.10.25: .
 13953:15333(1380) ack 137 win 49680
 09:16:56.266641 IP 145.222.14.10.25  213.207.90.2.59301: P ack 9813 win
 24840
 09:16:56.266776 IP 145.222.14.10.25  213.207.90.2.59301: . ack 11193
win
 28980
 09:16:56.266777 IP 145.222.14.10.25  213.207.90.2.59301: P ack 11193
win
 28980
 09:16:56.266800 IP 213.207.90.2.59301  145.222.14.10.25: .
 15333:16713(1380) ack 137 win 49680
 
 Sequences like this repeat through the entire session. This network
 connection is so bad that your machine can send only 107656 bytes
 in 85 seconds or 1.27 kbyte/s.
 
 I notice there are many sequences like this:
 
 09:16:51.869844 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869845 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869846 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869847 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 
 and this:
 
 09:16:51.869976 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869977 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 09:16:51.869978 IP 145.222.14.10.25  213.207.90.2.59301: P ack 5673 win
 16560
 
 and this:
 
 09:17:20.597737 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173
win
 32767
 09:17:20.597738 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173
win
 32767
 09:17:20.597739 IP 145.222.14.10.25  213.207.90.2.59301: P ack 40173
win
 32767
 
 And many more.  This is consistent with network congestion. These
 identical ACKs would have been sent several ms separated from each
 other 

Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-21 Thread lst_hoe02

Zitat von Wietse Venema wie...@porcupine.org:


It's nice for a change to work with someone who provides actual
information, instead of that dork from last week who was just
venting his opinions and not giving people a chance to help.

In this case it looks like a Solaris TCP bug when sending data over
a connection that suffers from heavy packet loss.  Apparently,
Solaris 10 stops retransmitting, and therefore the other side times
out.


Without following the thread closely and without examining the dump  
data, it reminds me of an outstanding problem we have with one site  
showing the same timeout after data for mail exceeding around 150kb.  
As far as i have found out it seems to be a expensive firewall running  
on solaris with extended protection on, munging with SACK and  
window-scaling at some point. The problem manifests in the dump with  
many retransmits because of missing segments (the remote has 10x  
bandwidth than we have) which were reordered by SACK and the remote  
end totaly ditching one missing segment at some point and sending over  
and over again some later data until transfer totaly stops and the  
timeout jump in. I can provide raw dump data in private for such a  
case. Unfortunately the remote side was not that interested to solve  
the problem so i give up on this.


Regards

Andreas






smime.p7s
Description: S/MIME Signatur


Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Martijn de Munnik
Hi list,

I have a problem with delivering mail to a host and get this error:

host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
Error: timeout exceeded (in reply to end of DATA command)

This error only seems to occur with 'large' mails. Currently I have a mail
of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
issue on our site but an issue with the mailserver on the other site. What
can cause such issues?

Thanks,
Martijn

-- 
YoungGuns
Kasteleinenkampweg 7b
5222 AX 's-Hertogenbosch
T. 073 623 56 40
F. 073 623 56 39
www.youngguns.nl
KvK 18076568


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Martijn de Munnik
On Wed, 20 Jan 2010 10:56:39 +0100, Martijn de Munnik
mart...@youngguns.nl wrote:
 Hi list,
 
 I have a problem with delivering mail to a host and get this error:
 
 host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
 Error: timeout exceeded (in reply to end of DATA command)
 
 This error only seems to occur with 'large' mails. Currently I have a
mail
 of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
 issue on our site but an issue with the mailserver on the other site.
What
 can cause such issues?

I tried the http://ftp.nluug.nl/mail/postfix/faq.html#timeouts Cisco PIX
workaround but that doesn't make any difference? I think the remote site is
running PIX:

Connected to mx2.amsterdam.nl (145.222.14.10).
Escape character is '^]'.
220 **

postconf -n

address_verify_map = btree:${data_directory}/verify
alias_maps = hash:/opt/csw/etc/postfix/maps/aliases
body_checks = regexp:/opt/csw/etc/postfix/maps/body_checks
broken_sasl_auth_clients = yes
command_directory = /opt/csw/sbin
config_directory = /etc/postfix
content_filter = amavisfeed:localhost:10024
daemon_directory = /opt/csw/libexec/postfix
data_directory = /opt/csw/var/lib/postfix
default_database_type = hash
delay_warning_time = 4h
disable_vrfy_command = yes
header_checks = regexp:/opt/csw/etc/postfix/maps/header_checks
home_mailbox = Maildir/
html_directory = /opt/csw/share/doc/postfix/html
inet_interfaces = all
mailbox_command = /opt/csw/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME
mailbox_size_limit = 0
mailq_path = /opt/csw/bin/mailq
manpage_directory = /opt/csw/share/man
maximal_backoff_time = 8000s
maximal_queue_lifetime = 7d
message_size_limit = 52428800
mime_header_checks = regexp:/opt/csw/etc/postfix/maps/mime_header_checks
minimal_backoff_time = 1000s
mydestination = $myhostname, localhost.$mydomain
myhostname = stevie.youngguns.nl
mynetworks_style = host
myorigin = $myhostname
newaliases_path = /opt/csw/bin/newaliases
readme_directory = /opt/csw/share/doc/postfix/README_FILES
receive_override_options = no_address_mappings
recipient_delimiter = +
relay_domains = slagenlandwonen.nl, wfcommunicatie.nl, gooischebrink.com,
interjute.nl, melamo.nl, fair-play.nl, loopbaankamer.nl, ospl.nl, ospl.de,
printcontrol.nl, dankers-schilderwerken.nl, promonta.nl,
interim-denbosch.nl, vanherpt.biz, rodersana.nl
relayhost =
sample_directory = /opt/csw/share/doc/postfix/samples
sendmail_path = /opt/csw/sbin/sendmail
smtp_bind_address = 213.207.90.2
smtp_helo_timeout = 60s
smtp_pix_workaround_delay_time = 10s
smtp_send_xforward_command = yes
smtp_skip_quit_response = yes
smtp_tls_loglevel = 1
smtp_tls_security_level = may
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_banner = $myhostname ESMTP
smtpd_client_connection_count_limit = 10
smtpd_data_restrictions = reject_unauth_pipelining
smtpd_delay_reject = yes
smtpd_hard_error_limit = 12
smtpd_helo_required = yes
smtpd_recipient_limit = 100
smtpd_recipient_restrictions = permit_mynetworks,  
permit_sasl_authenticated,   check_client_access
hash:/opt/csw/etc/postfix/maps/relay_access,   reject_non_fqdn_recipient,  
reject_non_fqdn_sender,   reject_unknown_sender_domain,  
reject_unauth_destination,   reject_non_fqdn_helo_hostname,  
reject_rbl_client virbl.dnsbl.bit.nl,   check_policy_service
inet:127.0.0.1:12525,   check_sender_mx_access
cidr:/opt/csw/etc/postfix/maps/bogon_networks,   check_client_access   
cidr:/opt/csw/etc/postfix/maps/drop,   check_sender_mx_access
cidr:/opt/csw/etc/postfix/maps/drop,   check_sender_ns_access
cidr:/opt/csw/etc/postfix/maps/drop,   reject_unverified_recipient,  
check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_header,  
check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_permit,  
check_policy_service inet:127.0.0.1:10023
smtpd_sasl_auth_enable = yes
smtpd_sasl_authenticated_header = yes
smtpd_sasl_local_domain = $myhostname
smtpd_sasl_path = private/auth
smtpd_sasl_security_options = noanonymous
smtpd_sasl_type = dovecot
smtpd_soft_error_limit = 3
smtpd_tls_cert_file = /home/yghosting/ssl/secure-youngguns-nl.pem
smtpd_tls_key_file = /home/yghosting/ssl/secure-youngguns-nl.key
smtpd_tls_loglevel = 1
smtpd_tls_security_level = may
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
soft_bounce = no
strict_rfc821_envelopes = yes
tls_random_source = dev:/dev/urandom
transport_maps = hash:/opt/csw/etc/postfix/maps/transport
unknown_address_reject_code = 550
unknown_hostname_reject_code = 550
unknown_local_recipient_reject_code = 550
unverified_recipient_reject_code = 550
virtual_alias_maps = hash:/opt/csw/etc/postfix/maps/virtual


 
 Thanks,
 Martijn

-- 
YoungGuns
Kasteleinenkampweg 7b
5222 AX 's-Hertogenbosch
T. 073 623 56 40
F. 073 623 56 39
www.youngguns.nl
KvK 18076568


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Martijn de Munnik
On Wed, 20 Jan 2010 11:10:50 +0100, Martijn de Munnik
mart...@youngguns.nl wrote:
 On Wed, 20 Jan 2010 10:56:39 +0100, Martijn de Munnik
 mart...@youngguns.nl wrote:
 Hi list,
 
 I have a problem with delivering mail to a host and get this error:
 
 host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
 Error: timeout exceeded (in reply to end of DATA command)

I also found a mail in the queue to mx-cluster1.one.com with exactly the
same problem, this mail is also ~600kb.

 
 This error only seems to occur with 'large' mails. Currently I have a
 mail
 of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
 issue on our site but an issue with the mailserver on the other site.
 What
 can cause such issues?
 
 I tried the http://ftp.nluug.nl/mail/postfix/faq.html#timeouts Cisco PIX
 workaround but that doesn't make any difference? I think the remote site
is
 running PIX:
 
 Connected to mx2.amsterdam.nl (145.222.14.10).
 Escape character is '^]'.
 220 **
 
 postconf -n
 
 address_verify_map = btree:${data_directory}/verify
 alias_maps = hash:/opt/csw/etc/postfix/maps/aliases
 body_checks = regexp:/opt/csw/etc/postfix/maps/body_checks
 broken_sasl_auth_clients = yes
 command_directory = /opt/csw/sbin
 config_directory = /etc/postfix
 content_filter = amavisfeed:localhost:10024
 daemon_directory = /opt/csw/libexec/postfix
 data_directory = /opt/csw/var/lib/postfix
 default_database_type = hash
 delay_warning_time = 4h
 disable_vrfy_command = yes
 header_checks = regexp:/opt/csw/etc/postfix/maps/header_checks
 home_mailbox = Maildir/
 html_directory = /opt/csw/share/doc/postfix/html
 inet_interfaces = all
 mailbox_command = /opt/csw/bin/procmail-wrapper -o -a $DOMAIN -d
$LOGNAME
 mailbox_size_limit = 0
 mailq_path = /opt/csw/bin/mailq
 manpage_directory = /opt/csw/share/man
 maximal_backoff_time = 8000s
 maximal_queue_lifetime = 7d
 message_size_limit = 52428800
 mime_header_checks = regexp:/opt/csw/etc/postfix/maps/mime_header_checks
 minimal_backoff_time = 1000s
 mydestination = $myhostname, localhost.$mydomain
 myhostname = stevie.youngguns.nl
 mynetworks_style = host
 myorigin = $myhostname
 newaliases_path = /opt/csw/bin/newaliases
 readme_directory = /opt/csw/share/doc/postfix/README_FILES
 receive_override_options = no_address_mappings
 recipient_delimiter = +
 relay_domains = slagenlandwonen.nl, wfcommunicatie.nl,
gooischebrink.com,
 interjute.nl, melamo.nl, fair-play.nl, loopbaankamer.nl, ospl.nl,
ospl.de,
 printcontrol.nl, dankers-schilderwerken.nl, promonta.nl,
 interim-denbosch.nl, vanherpt.biz, rodersana.nl
 relayhost =
 sample_directory = /opt/csw/share/doc/postfix/samples
 sendmail_path = /opt/csw/sbin/sendmail
 smtp_bind_address = 213.207.90.2
 smtp_helo_timeout = 60s
 smtp_pix_workaround_delay_time = 10s
 smtp_send_xforward_command = yes
 smtp_skip_quit_response = yes
 smtp_tls_loglevel = 1
 smtp_tls_security_level = may
 smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
 smtpd_banner = $myhostname ESMTP
 smtpd_client_connection_count_limit = 10
 smtpd_data_restrictions = reject_unauth_pipelining
 smtpd_delay_reject = yes
 smtpd_hard_error_limit = 12
 smtpd_helo_required = yes
 smtpd_recipient_limit = 100
 smtpd_recipient_restrictions = permit_mynetworks,  
 permit_sasl_authenticated,   check_client_access
 hash:/opt/csw/etc/postfix/maps/relay_access,  
reject_non_fqdn_recipient,  
 reject_non_fqdn_sender,   reject_unknown_sender_domain,  
 reject_unauth_destination,   reject_non_fqdn_helo_hostname,  
 reject_rbl_client virbl.dnsbl.bit.nl,   check_policy_service
 inet:127.0.0.1:12525,   check_sender_mx_access
 cidr:/opt/csw/etc/postfix/maps/bogon_networks,   check_client_access   
 cidr:/opt/csw/etc/postfix/maps/drop,   check_sender_mx_access
 cidr:/opt/csw/etc/postfix/maps/drop,   check_sender_ns_access
 cidr:/opt/csw/etc/postfix/maps/drop,   reject_unverified_recipient,  
 check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_header,  
 check_client_accesscidr:/opt/csw/etc/postfix/maps/dnswl_permit,  
 check_policy_service inet:127.0.0.1:10023
 smtpd_sasl_auth_enable = yes
 smtpd_sasl_authenticated_header = yes
 smtpd_sasl_local_domain = $myhostname
 smtpd_sasl_path = private/auth
 smtpd_sasl_security_options = noanonymous
 smtpd_sasl_type = dovecot
 smtpd_soft_error_limit = 3
 smtpd_tls_cert_file = /home/yghosting/ssl/secure-youngguns-nl.pem
 smtpd_tls_key_file = /home/yghosting/ssl/secure-youngguns-nl.key
 smtpd_tls_loglevel = 1
 smtpd_tls_security_level = may
 smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
 soft_bounce = no
 strict_rfc821_envelopes = yes
 tls_random_source = dev:/dev/urandom
 transport_maps = hash:/opt/csw/etc/postfix/maps/transport
 unknown_address_reject_code = 550
 unknown_hostname_reject_code = 550
 unknown_local_recipient_reject_code = 550
 unverified_recipient_reject_code = 550
 virtual_alias_maps = hash:/opt/csw/etc/postfix/maps/virtual
 
 
 
 

Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Ansgar Wiechers
On 2010-01-20 Martijn de Munnik wrote:
 I have a problem with delivering mail to a host and get this error:
 
 host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
 Error: timeout exceeded (in reply to end of DATA command)
 
 This error only seems to occur with 'large' mails. Currently I have a
 mail of ~600KB and ~8MB stuck in the queue. I don't think this is a
 postfix issue on our site but an issue with the mailserver on the
 other site. What can cause such issues?

AFAICS: slow connections, high connection concurrency (leaving little
bandwidth per connection), proxy filter or policy service taking too
long for checking the mail, etc.

However, since the timeout occurs on the remote side, you should contact
the administrator of that server about the issue.

Regards
Ansgar Wiechers
-- 
Abstractions save us time working, but they don't save us time learning.
--Joel Spolsky


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Wietse Venema
Martijn de Munnik:
 Hi list,
 
 I have a problem with delivering mail to a host and get this error:
 
 host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
 Error: timeout exceeded (in reply to end of DATA command)
 
 This error only seems to occur with 'large' mails. Currently I have a mail
 of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
 issue on our site but an issue with the mailserver on the other site. What
 can cause such issues?

Record a tcpdump trace. The way the session fails will indicate
the kind of problem (MTU, Window scaling, and so on).

http://www.postfix.org/DEBUG_README.html

Wietse


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Martijn de Munnik
On Wed, 20 Jan 2010 07:20:01 -0500 (EST), wie...@porcupine.org (Wietse
Venema) wrote:
 Martijn de Munnik:
 Hi list,
 
 I have a problem with delivering mail to a host and get this error:
 
 host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
 Error: timeout exceeded (in reply to end of DATA command)
 
 This error only seems to occur with 'large' mails. Currently I have a
 mail
 of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
 issue on our site but an issue with the mailserver on the other site.
 What
 can cause such issues?
 
 Record a tcpdump trace. The way the session fails will indicate
 the kind of problem (MTU, Window scaling, and so on).
 
 http://www.postfix.org/DEBUG_README.html
 
   Wietse

Ok, I tried that and I'm not really sure where to look for. I opened the
tcpdump file in wireshark and there are a lot of warnings and notes in the
file.

--
Notes:
Duplicate ACK(#1) [145.222.14.10 - 213.207.90.2]
Duplicate ACK(#2) [145.222.14.10 - 213.207.90.2]
Duplicate ACK(#3) [145.222.14.10 - 213.207.90.2]
Duplicate ACK(#4) [145.222.14.10 - 213.207.90.2]
.
.
.
Duplicate ACK(#44) [145.222.14.10 - 213.207.90.2]
Retransmission (suspected) [213.207.90.2 - 145.222.14.10]

Warnings:
Fast retransmission (suspected) [213.207.90.2 - 145.222.14.10]
Out-Of-Order segment [213.207.90.2 - 145.222.14.10]
--

This is abracadabra for me ;)

Martijn
-- 
YoungGuns
Kasteleinenkampweg 7b
5222 AX 's-Hertogenbosch
T. 073 623 56 40
F. 073 623 56 39
www.youngguns.nl
KvK 18076568


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Wietse Venema
Martijn de Munnik:
 On Wed, 20 Jan 2010 07:20:01 -0500 (EST), wie...@porcupine.org (Wietse
 Venema) wrote:
  Martijn de Munnik:
  Hi list,
  
  I have a problem with delivering mail to a host and get this error:
  
  host mx2.amsterdam.nl[145.222.14.10] said: 421 enepmx02.amsterdam.nl
  Error: timeout exceeded (in reply to end of DATA command)
  
  This error only seems to occur with 'large' mails. Currently I have a
  mail
  of ~600KB and ~8MB stuck in the queue. I don't think this is a postfix
  issue on our site but an issue with the mailserver on the other site.
  What
  can cause such issues?
  
  Record a tcpdump trace. The way the session fails will indicate
  the kind of problem (MTU, Window scaling, and so on).
  
  http://www.postfix.org/DEBUG_README.html
  
  Wietse
 
 Ok, I tried that and I'm not really sure where to look for. I opened the
 tcpdump file in wireshark and there are a lot of warnings and notes in the
 file.
 
 --
 Notes:
 Duplicate ACK(#1) [145.222.14.10 - 213.207.90.2]
 Duplicate ACK(#2) [145.222.14.10 - 213.207.90.2]
 Duplicate ACK(#3) [145.222.14.10 - 213.207.90.2]
 Duplicate ACK(#4) [145.222.14.10 - 213.207.90.2]
 .
 .
 .
 Duplicate ACK(#44) [145.222.14.10 - 213.207.90.2]
 Retransmission (suspected) [213.207.90.2 - 145.222.14.10]
 
 Warnings:
 Fast retransmission (suspected) [213.207.90.2 - 145.222.14.10]
 Out-Of-Order segment [213.207.90.2 - 145.222.14.10]
 --
 
 This is abracadabra for me ;)

If you can make the tcpdump -nr /file/name output available then
people who understand TCP/IP can look at it.

Wietse


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Wietse Venema
Here's the TCP initial handshake:

17:30:44.951789 IP 213.207.90.2.48147  145.222.14.10.25: S 
50514820:50514820(0) win 49640 mss 1460,nop,wscale 0,nop,nop,sackOK
17:30:44.954496 IP 145.222.14.10.25  213.207.90.2.48147: S 
4148480248:4148480248(0) ack 50514821 win 5840 mss 1380,nop,wscale 2
17:30:44.954519 IP 213.207.90.2.48147  145.222.14.10.25: . ack 1 win 49680

Later, as the receiver processes the network packets, it acknowledges
the data received, sends its receive window size (how much more it
is willing to receive).

Above, with wscale 2 the server at 145.222.14.10 announces that
its TCP receive window value needs to be multiplied by a factor of
4 (binary number shifted left by 2).

But, there is a broken router in the path that does not understand
window scaling.

Here is an example of what gets f-ed up:

17:30:45.41 IP 213.207.90.2.48147  145.222.14.10.25: . 20853:22233(1380) 
ack 137 win 49680
17:30:45.412230 IP 213.207.90.2.48147  145.222.14.10.25: . 22233:23613(1380) 
ack 137 win 49680
17:30:45.412249 IP 213.207.90.2.48147  145.222.14.10.25: P 23613:24993(1380) 
ack 137 win 49680
17:30:45.412747 IP 145.222.14.10.25  213.207.90.2.48147: P ack 8433 win 5800
17:30:45.412748 IP 145.222.14.10.25  213.207.90.2.48147: P ack 8433 win 5800
17:30:45.412749 IP 145.222.14.10.25  213.207.90.2.48147: P ack 8433 win 5800

The receiver says they can receive bytes 8433-31633, but the broken
router does not know that 5800 needs to be multiplied by 4, and it
thinks the receiver can receive only bytes 8433-14233.

The broken router then throws away the bytes with higher sequence
numbers than 14233.

Workaround: turn off window scaling support on the sender's kernel.

Wietse


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Victor Duchovni
On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote:

 The broken router then throws away the bytes with higher sequence
 numbers than 14233.
 
 Workaround: turn off window scaling support on the sender's kernel.

This problem is sufficiently common, that on Linux MTAs I always add:

net.ipv4.tcp_window_scaling = 0

to sysctl.conf. Adjust for other systems as necessary. This hurts
long-haul throughput, but email tolerates latency, provided most of your
outbound traffic is not a high-bandwidth channel to Mars (but then you
would not be using TCP anyway...)

-- 
Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the Reply-To header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
mailto:majord...@postfix.org?body=unsubscribe%20postfix-users

If my response solves your problem, the best way to thank me is to not
send an it worked, thanks follow-up. If you must respond, please put
It worked, thanks in the Subject so I can delete these quickly.


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Martijn de Munnik

On Jan 20, 2010, at 9:28 PM, Victor Duchovni wrote:

 On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote:
 
 The broken router then throws away the bytes with higher sequence
 numbers than 14233.
 
 Workaround: turn off window scaling support on the sender's kernel.
 
 This problem is sufficiently common, that on Linux MTAs I always add:
 
net.ipv4.tcp_window_scaling = 0
I'm running Solaris 10 x86 and I did

sudo ndd -set /dev/tcp tcp_wscale_always 0

before I did this the value was 1. After I did this I flushed the queue but the 
messages stay stuck in the queue with the same problem. I'm not sure this is 
the right kernel parameter for Solaris?

/etc/system has no specific setting for tcp, so everything is default Solaris 
10.

Wietse, the broken router you mentioned, could that be a Cisco PIX on the 
receivers site?

Jan 20 22:58:43 stevie.youngguns.nl postfix/smtp[18765]: [ID 197553 mail.info] 
8A5553BA0C: enabling PIX workarounds: disable_esmtp delay_dotcrlf for 
mx2.amsterdam.nl[145.222.14.10]:25

Thanks,
Martijn
 
 to sysctl.conf. Adjust for other systems as necessary. This hurts
 long-haul throughput, but email tolerates latency, provided most of your
 outbound traffic is not a high-bandwidth channel to Mars (but then you
 would not be using TCP anyway...)
 
 -- 
   Viktor.
 
 Disclaimer: off-list followups get on-list replies or get ignored.
 Please do not ignore the Reply-To header.
 
 To unsubscribe from the postfix-users list, visit
 http://www.postfix.org/lists.html or click the link below:
 mailto:majord...@postfix.org?body=unsubscribe%20postfix-users
 
 If my response solves your problem, the best way to thank me is to not
 send an it worked, thanks follow-up. If you must respond, please put
 It worked, thanks in the Subject so I can delete these quickly.
 



Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Wietse Venema
Martijn de Munnik:
 
 On Jan 20, 2010, at 9:28 PM, Victor Duchovni wrote:
 
  On Wed, Jan 20, 2010 at 03:22:56PM -0500, Wietse Venema wrote:
  
  The broken router then throws away the bytes with higher sequence
  numbers than 14233.
  
  Workaround: turn off window scaling support on the sender's kernel.
  
  This problem is sufficiently common, that on Linux MTAs I always add:
  
 net.ipv4.tcp_window_scaling = 0
 I'm running Solaris 10 x86 and I did
 
 sudo ndd -set /dev/tcp tcp_wscale_always 0
 
 before I did this the value was 1. After I did this I flushed the
 queue but the messages stay stuck in the queue with the same
 problem. I'm not sure this is the right kernel parameter for
 Solaris?

You can do

ndd /dev/tcp \?

to find out what parameters are supported. On my Solaris9 and
Solaris10 test boxes it is called tcp_wscale_always.

According to Solaris10 documentation:

When this parameter is enabled, which is the default setting
[since Solaris10], TCP always sends a SYN segment with the
window scale option, even if the window scale option value is
0.

So, please make another recording and report if setting this to 0
causes Solaris to stop sending wscale 0 options.

If the sender does not send a wscale option, then the receiver
is not supposed to send a wscale option, either. Your tcpdump
recording should confirm that.

 /etc/system has no specific setting for tcp, so everything is
 default Solaris 10.
 
 Wietse, the broken router you mentioned, could that be a Cisco
 PIX on the receivers site?

I maintain no list of CISCO PIX firewall bugs.

It can also be a router at your end. Never exclude that possibility.

 Jan 20 22:58:43 stevie.youngguns.nl postfix/smtp[18765]: [ID 197553
 mail.info] 8A5553BA0C: enabling PIX workarounds: disable_esmtp
 delay_dotcrlf for mx2.amsterdam.nl[145.222.14.10]:25

That's a PIX bug workaround for the fact that their engone parses
only one packet at a time and therefore misses SMTP commands
and CRLF.CRLF when these sit across packet boundaries.

Wietse


Re: Error: timeout exceeded (in reply to end of DATA command)

2010-01-20 Thread Wietse Venema
Wietse Venema:
 You can do
 
 ndd /dev/tcp \?
 
 to find out what parameters are supported. On my Solaris9 and
 Solaris10 test boxes it is called tcp_wscale_always.
 
 According to Solaris10 documentation:
 
 When this parameter is enabled, which is the default setting
 [since Solaris10], TCP always sends a SYN segment with the
 window scale option, even if the window scale option value is
 0.

With the default tcp_wscale_always setting, making a connection
from a Solaris 10 box to FreeBSD 8.0:

20:13:59.808828 IP 168.100.189.17.32799  168.100.189.10.25: Flags
[S], seq 11835, win 49640, options [mss 1460,nop,wscale
0,nop,nop,sackOK], length 0
20:13:59.808892 IP 168.100.189.10.25  168.100.189.17.32799: Flags
[S.], seq 538094055, ack 11836, win 65535, options [mss
1460,nop,wscale 3,sackOK,eol], length 0
20:13:59.809327 IP 168.100.189.17.32799  168.100.189.10.25: Flags
[.], ack 1, win 49640, length 0

Same system with tcp_wscale_always set to zero:

20:14:52.736959 IP 168.100.189.17.32800  168.100.189.10.25: Flags
[S], seq 131413865, win 49640, options [mss 1460,nop,nop,sackOK],
length 0
20:14:52.737016 IP 168.100.189.10.25  168.100.189.17.32800: Flags
[S.], seq 3072042607, ack 131413866, win 65535, options [mss
1460,sackOK,eol], length 0
20:14:52.737581 IP 168.100.189.17.32800  168.100.189.10.25: Flags
[.], ack 1, win 49640, length 0

Thus, Solaris 10 does not send wscale, and neither should the
remote server.

If this does not make your mail move, then you need to collect
another tcpdump recording.

In that case mail was not moving because of multiple problems.

Wietse