[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
This problem was resolved off-list. Greg Sims: > Wietse & Viktor, > > All is not lost. Restarting BIND on Ray08 solved the problem of > c=30!! I am sorry that I did not review/restart this service earlier. > Your comments related to the 5 second intervals and DNS timeouts > caused me to look there. Unfortunately we went down some dead ends because I missed that the Postfix SMTP client will make DNS MX and address queries even when reusing connections. I am updating Postfix documentation. With the crippled DNS resolver, the Postfix 'type c' delays were dominated by the number of DNS lookups. This resulted in larger delays for domains with many MX records such as gmail.com, compared to domains with fewer MX records such as yahoo.com or outlook. These differences in delays were not the result of receiver-side throttling. Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Thu, May 23, 2024 at 05:48:29PM -0400, Wietse Venema via Postfix-users wrote: > Greg Sims via Postfix-users: > > We see conn_use about 24% of the time: > > But none of the sessions shown in your message have that. > > Do they also have multiple-of-5-second type 'c' delays? Indeed those multiples of 5s sure feel like DNS lookup delays. Unclear why this would correlated with enabling TLS connection reuse, I'd conjecture that's not the case, and that similar delays would also be seen for non-TLS connections to the same destination. It is also unclear whether the delays are the OP's systems looking up Google's addresses, or Google's systems looking up the OP's reverse and forward DNS, though when I query Quad8 DNS for the OP's MTA, the response is prompt. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Greg Sims via Postfix-users: > We see conn_use about 24% of the time: But none of the sessions shown in your message have that. Do they also have multiple-of-5-second type 'c' delays? Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Thu, May 23, 2024 at 7:07 AM Greg Sims wrote: > > Thank you Viktor. All recommended changes have been made. I hope to > collect useful "collate" data with our next distribution at Noon today > pacific. > Still having problems with the inbound smtpd from our private network flooding "collate". This could be caused by the process number always being the same. May 23 11:30:07 mail01 postfix/smtpd[12433]: E050630F1BE6: client=web01-p[192.168.122.11] May 23 11:30:08 mail01 postfix/smtpd[12433]: 2245330F1BEA: client=web01-p[192.168.122.11] May 23 11:30:08 mail01 postfix/smtpd[12433]: 5711630F1BEC: client=web01-p[192.168.122.11] root@mail01 0523zh]# ll total 99M -rw-r--r-- 1 root root 2.6M May 23 12:45 0523zh.log -rw-r--r-- 1 root root 94M May 23 12:46 0523zh.log.collate -rw-r--r-- 1 root root 1.9M May 23 12:51 0523zh.log.collate.no_priv I removed the smtpd records with grep -v to create the last file above with no private ip addresses (no_priv). One thing I noticed changing with the updates to main.cf and master.cf. I now see only "Untrusted TLS connection established". Last night (before the changes) we had 40K Trusted and TLS_Anonymous: 35, TLS_Untrusted: 78. Could this have something to do with deleting the default Certs associated with TLS? And now some "collate" data. The run starts at 11:30 -- we already have a long delay of 10 seconds: May 23 11:30:07 mail01 postfix/smtpd[12438]: discarding EHLO keywords: CHUNKING May 23 11:30:11 mail01 postfix/cleanup[12440]: F3B3630F1C5C: message-id=<62994d8ff3eb719d8ddc674c2cae6b38@swift.generated> May 23 11:30:11 mail01 postfix/qmgr[11016]: F3B3630F1C5C: from==hotmail@devotion.raystedman.org>, size=50829, nrcpt=1 (queue active) May 23 11:30:21 mail01 postfix/t124/smtp[12472]: Untrusted TLS connection established to hotmail-com.olc.protection.outlook.com[52.101.42.8]:25: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (2048 bits) server-digest SHA256 May 23 11:30:21 mail01 postfix/t124/smtp[12472]: F3B3630F1C5C: to=<@hotmail.com>, relay=hotmail-com.olc.protection.outlook.com[52.101.42.8]:25, delay=11, delays=0.01/0.01/10/0.38, dsn=2.6.0, status=sent (250 2.6.0 <62994d8ff3eb719d8ddc674c2cae6b38@swift.generated> [InternalId=21569325775951, Hostname=CO3P220MB0782.NAMP220.PROD.OUTLOOK.COM] 60415 bytes in 0.104, 566.038 KB/sec Queued mail for delivery -> 250 2.1.5) May 23 11:30:21 mail01 postfix/qmgr[11016]: F3B3630F1C5C: removed Now 20 seconds: May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO keywords: CHUNKING May 23 11:30:09 mail01 postfix/cleanup[12441]: 3634330F1BF4: message-id=<826855425363d60cc81d5b8f49e83579@swift.generated> May 23 11:30:09 mail01 postfix/qmgr[11016]: 3634330F1BF4: from==yahoo@devotion.raystedman.org>, size=50821, nrcpt=1 (queue active) May 23 11:30:29 mail01 postfix/t123/smtp[12455]: Untrusted TLS connection established to mta6.am0.yahoodns.net[98.136.96.75]:25: TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256 May 23 11:30:30 mail01 postfix/t123/smtp[12455]: 3634330F1BF4: to=<@yahoo.com>, relay=mta6.am0.yahoodns.net[98.136.96.75]:25, delay=21, delays=0.01/0.01/20/0.54, dsn=2.0.0, status=sent (250 ok dirdel) May 23 11:30:30 mail01 postfix/qmgr[11016]: 3634330F1BF4: removed Towards the end of the run -- 30 seconds: May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO keywords: CHUNKING May 23 11:33:49 mail01 postfix/cleanup[12690]: BFB3B30F279B: message-id=<6e7c80dbfbec093a18061cdca1ae4b9c@swift.generated> May 23 11:33:49 mail01 postfix/qmgr[11016]: BFB3B30F279B: from==gmail@devotion.raystedman.org>, size=50821, nrcpt=1 (queue active). May 23 11:35:28 mail01 postfix/t121/smtp[12668]: Untrusted TLS connection established to gmail-smtp-in.l.google.com[142.251.2.27]:25: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-256) server-digest SHA256 May 23 11:35:29 mail01 postfix/t121/smtp[12668]: BFB3B30F279B: to=<@gmail.com>, relay=gmail-smtp-in.l.google.com[142.251.2.27]:25, delay=100, delays=0.01/69/30/0.8, dsn=2.0.0, status=sent (250 2.0.0 OK 1716489329 d2e1a72fcca58-6f4d2b41c18si755622b3a.334 - gsmtp) May 23 11:35:29 mail01 postfix/qmgr[11016]: BFB3B30F279B: removed Max connections from hotmail: May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO keywords: CHUNKING May 23 11:32:45 mail01 postfix/cleanup[12681]: 9433130F50E9: message-id=<6ace7587b9302b23a29b052a5503afb9@swift.generated> May 23 11:32:45 mail01 postfix/qmgr[11016]: 9433130F50E9: from==hotmail@devotion.raystedman.org>, size=50833, nrcpt=1 (queue active) May 23 11:32:45 mail01 postfix/t121/smtp[12559]: Untrusted TLS connection established to hotmail-com.olc.protection.outlook.com[52.101.11.16]:25: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Thank you Viktor. All recommended changes have been made. I hope to collect useful "collate" data with our next distribution at Noon today pacific. I hope you have a great day! Greg > [root@mail01 postfix]# postconf -nf > > [root@mail01 postfix]# postconf -Mf ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Wed, May 22, 2024 at 12:19:03PM -0500, Greg Sims wrote: > [root@mail01 postfix]# postconf -nf > maximal_backoff_time = 16m > minimal_backoff_time = 2m > queue_run_delay = 2m FWIW (not related to your immediate issue) I would not recommend such a short maximal backoff, you're potentially filling the active queue with messages that will again tempfail , and perhaps take a long time to do so. I typically go with: minimal_backoff_time = 225s maximal_backoff_time = 7200s queue_run_delay = 225s > sender_dependent_default_transport_maps = > regexp:/etc/postfix/sender_relay.regexp, randmap:{t121,t122,t123,t124} See below... > smtp_tls_security_level = may > smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt > smtp_tls_CApath = /etc/pki/tls/certs The CAfile and CApath are not needed (just waste memory) until you set something other than "may" as the security level for at least some destinations. > smtpd_tls_cert_file = /etc/pki/tls/certs/postfix.pem > smtpd_tls_key_file = /etc/pki/tls/private/postfix.key > smtpd_tls_security_level = none The certificates are not needed while the security level is "none". > [root@mail01 postfix]# postconf -Mf > ... > t121 unix - - n - 64 smtp > -o smtp_bind_address=209.73.152.121 > -o smtp_helo_name=mail01-t121.raystedman.org > -o syslog_name=t121 > t122 unix - - n - 64 smtp > -o smtp_bind_address=209.73.152.122 > -o smtp_helo_name=mail01-t122.raystedman.org > -o syslog_name=t122 > t123 unix - - n - 64 smtp > -o smtp_bind_address=209.73.152.123 > -o smtp_helo_name=mail01-t123.raystedman.org > -o syslog_name=t123 > t124 unix - - n - 64 smtp > -o smtp_bind_address=209.73.152.124 > -o smtp_helo_name=mail01-t124.raystedman.org > -o syslog_name=t124 I think the reason that "collate" is not working is your chose of "syslog_name", it does not match any of the expected forms: postfix postfix-instance postfix/sometag postfix-instance/sometag The Perl RE in question is: my $instre = qr{(?x) \A # Absolute line start (?:\S+ \s+){3} # Timestamp, adjust for other time formats \S+ \s+ # Hostname (postfix(?:-[^/\s]+)?) # Capture instance name stopping before first '/' (?:/\S+)* # Optional non-captured '/'-delimited qualifiers / # Final '/' before the daemon program name }; On Wed, May 22, 2024 at 02:10:48PM -0500, Greg Sims wrote: > Postfix is now configured for logging at /var/log/postfix. > * added to main.cf: > maillog_file = /var/log/postfix/0522zh.log > * added to master.cf >postlog unix-dgram n - n - 1 postlogd Good. > I then: > > [root@mail01 0522zh]# perl ../collate.pl 0522zh.log >0522zh.log.collate > [root@mail01 0522zh]# ll > total 95M > -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log > -rw-r--r-- 1 root root 92M May 22 11:55 0522zh.log.collate > > The output of collate does not appear to be useful -- it is the same > as when we are logging to memory. It fails to matcht he delivery agent log entries. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Greg Sims via Postfix-users: > > It is assumed that you're not a victim of systemd-journald log mangling. > > It may be dropping some messages, and recording others out of order, > > breaking "collate". On Linux systems where systemd is doing the > > logging, you'll want to have Postfix writing its own log files directly, > > bypassing syslog. I have: I think that the nost useful info is logged by the Postfix SMTP client, especially including the low-level connection info that may not have a queue ID. Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Alexander Leidinger via Postfix-users wrote in : |Am 2024-05-22 01:22, schrieb Greg Sims via Postfix-users: |> TLS connection reuse is being used. About 10% of the connections are |> reused for large volume ISPs. Small volume ISPs do not see connection |> reuse. I believe this is as expected. |> |> I did some testing of our DNS setup. A DNS query using dig is less |> than 20 msec for both our primary and secondary dns servers in |> /etc/resolv.conf -- see below. | |If all else fails: |The truth can often be seen on the wire. Make a packet trace which |covers the time from "here is the mail you have to send to google" to a |successful delivery and inspect it in wireshark. For TLS traffic you |need the cert/key in wireshark. Do not only trace the smtp traffic, but |all traffic. Inspect what the system is doing (e.g. DNS lookups) and |correlate that to the traffic you see (you can change how timestamps are |displayed in wireshark). This may indicate where those 25 seconds are |spend. | |This is a steep learning curve if you are not familiar already with |interpreting network packets, the smtp protocol, DNS, and wireshark. If |those skills are already available, it may lead to detecting the cause |of what you see faster than the back and forth with guesses here on the |mailinglist. I have not closely followed the thread, but if it is about "hangs" or "delays" on startup, and here "startup" is a wide field given that postfix is biologically recreating itself anew out of its own "all the time", it might be worth .. that: [.]calls the function uname(2)∞ in order to query the nodename of the box (sometimes the real one is needed instead of the one represented by the internal variable hostname†). One may have varying success by ensuring that the real host‐ name and ‘localhost’ have entries in /etc/hosts, or, more generally, that the name service is properly setup – and does hostname(1)∞ return the ex‐ pected value? Does this local hostname have a domain suffix? RFC 6762 standardized the link-local top-level domain ‘.local’, try again after adding an (additional) entry with this extension. (postfix uses gethostname, but it could still end up uname.) I myself have further problems with ssh(1) and my VMs, because that is all (boxed and) DHCP via dnsmasq, and that does not realize that despite the IPv4 it gives a host authoritatively the same hostname should not cause IPv6 queries over the network (ie i occasionally forget to do use the -4 flag, but i reported that years ago as i think it should not be needed since we have authoritatively served a name, no, and despite that say nothing, i have "Host 203.0.113.*" in ssh config, but somehow it skips that for some, uses the stuff from "Host *", whatever *that* is.) Just an idea, Ciao, --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
> It is assumed that you're not a victim of systemd-journald log mangling. > It may be dropping some messages, and recording others out of order, > breaking "collate". On Linux systems where systemd is doing the > logging, you'll want to have Postfix writing its own log files directly, > bypassing syslog. I have: > > main.cf: > maillog_file = /var/log/postfix/log > > $ ls -ld /var/log/postfix{,/*} > drwxrwxr-x. 2 root postdrop 47 May 22 23:40 /var/log/postfix > -rw--- 1 root root 226 May 22 23:40 /var/log/postfix/log > -rw--- 1 root postdrop 25747944 May 22 23:40 > /var/log/postfix/log.20240522-234048.gz > > See https://www.postfix.org/MAILLOG_README.html and note that rotation > does not (yet) delete old files, that's up to your cron job. Postfix is now configured for logging at /var/log/postfix. * added to main.cf: maillog_file = /var/log/postfix/0522zh.log * added to master.cf postlog unix-dgram n - n - 1 postlogd I then: [root@mail01 0522zh]# perl ../collate.pl 0522zh.log >0522zh.log.collate [root@mail01 0522zh]# ll total 95M -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log -rw-r--r-- 1 root root 92M May 22 11:55 0522zh.log.collate The output of collate does not appear to be useful -- it is the same as when we are logging to memory. I then ran: [root@mail01 0522zh]# cat 0522zh.log.collate | grep -v '192.168.122.11' >0522zh.log.collate.no_priv [root@mail01 0522zh]# ll total 96M -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log -rw-r--r-- 1 root root 92M May 22 11:55 0522zh.log.collate # huge file size once again -rw-r--r-- 1 root root 865K May 22 12:03 0522zh.log.collate.no_priv. # reasonable file size after last grep I scanned the last file. All I see is sequences like this: May 22 11:30:03 mail01 postfix/smtpd[3177]: discarding EHLO keywords: CHUNKING May 22 11:31:22 mail01 postfix/cleanup[3357]: 5592A30F1CB9: message-id=<9d907dff3c240b43a5d92c8b41fcebfa@swift.generated> May 22 11:31:22 mail01 postfix/qmgr[2563]: 5592A30F1CB9: from==gmail@devotion.raystedman.org>, size=45131, nrcpt=1 (queue active) May 22 11:31:43 mail01 postfix/qmgr[2563]: 5592A30F1CB9: removed May 22 11:30:03 mail01 postfix/smtpd[3177]: discarding EHLO keywords: CHUNKING May 22 11:31:22 mail01 postfix/cleanup[3357]: 93D9D30F1E10: message-id=<2e60eb7be562876e35be104949b788c3@swift.generated> May 22 11:31:22 mail01 postfix/qmgr[2563]: 93D9D30F1E10: from==gmail@devotion.raystedman.org>, size=45151, nrcpt=1 (queue active) May 22 11:31:43 mail01 postfix/qmgr[2563]: 93D9D30F1E10: removed This is not the useful data you were looking for. I can not get this to work. I would be glad to give you the log file offline. Best, Greg ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
> This is perhaps a good time to ask you for your full configuration, > not just cherry-picked individual settings. Please post the outputs of: > > $ postconf -nf > $ postconf -Mf > > with all whitespace (including linebreaks) preserved. [root@mail01 postfix]# postconf -nf alias_database = hash:/etc/aliases alias_maps = hash:/etc/aliases command_directory = /usr/sbin compatibility_level = 2 daemon_directory = /usr/libexec/postfix data_directory = /var/lib/postfix debug_peer_level = 2 debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin ddd $daemon_directory/$process_name $process_id & sleep 5 html_directory = no inet_interfaces = all inet_protocols = ipv4 mail_owner = postfix maillog_file = /var/log/postfix/0522zh.log mailq_path = /usr/bin/mailq.postfix manpage_directory = /usr/share/man maximal_backoff_time = 16m meta_directory = /etc/postfix minimal_backoff_time = 2m mydestination = mynetworks = 192.168.122.0/24, 127.0.0.0/8 myorigin = $mydomain newaliases_path = /usr/bin/newaliases.postfix queue_directory = /var/spool/postfix queue_run_delay = 2m readme_directory = /usr/share/doc/postfix/README_FILES sample_directory = /usr/share/doc/postfix/samples sender_dependent_default_transport_maps = regexp:/etc/postfix/sender_relay.regexp, randmap:{t121,t122,t123,t124} sendmail_path = /usr/sbin/sendmail.postfix setgid_group = postdrop shlib_directory = /usr/lib64/postfix smtp_connection_cache_on_demand = yes smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt smtp_tls_CApath = /etc/pki/tls/certs smtp_tls_connection_reuse = yes smtp_tls_loglevel = 1 smtp_tls_security_level = may smtpd_tls_cert_file = /etc/pki/tls/certs/postfix.pem smtpd_tls_key_file = /etc/pki/tls/private/postfix.key smtpd_tls_security_level = none unknown_local_recipient_reject_code = 550 [root@mail01 postfix]# [root@mail01 postfix]# [root@mail01 postfix]# [root@mail01 postfix]# postconf -Mf postlogunix-dgram n - n - 1 postlogd smtp inet n - n - - smtpd tlsproxy unix - - n - 0 tlsproxy pickup unix n - n 60 1 pickup cleanupunix n - n - 0 cleanup qmgr unix n - n 300 1 qmgr tlsmgr unix - - n 1000? 1 tlsmgr rewriteunix - - n - - trivial-rewrite bounce unix - - n - 0 bounce defer unix - - n - 0 bounce trace unix - - n - 0 bounce verify unix - - n - 1 verify flush unix n - n 1000? 0 flush proxymap unix - - n - - proxymap proxywrite unix - - n - 1 proxymap smtp unix - - n - - smtp relay unix - - n - - smtp -o syslog_name=postfix/$service_name showq unix n - n - - showq error unix - - n - - error retry unix - - n - - error discardunix - - n - - discard local unix - n n - - local virtualunix - n n - - virtual lmtp unix - - n - - lmtp anvil unix - - n - 1 anvil scache unix - - n - 1 scache rsmunix - - n - - smtp -o smtp_bind_address=209.73.152.117 t121 unix - - n - 64 smtp -o smtp_bind_address=209.73.152.121 -o smtp_helo_name=mail01-t121.raystedman.org -o syslog_name=t121 t122 unix - - n - 64 smtp -o smtp_bind_address=209.73.152.122 -o smtp_helo_name=mail01-t122.raystedman.org -o syslog_name=t122 t123 unix - - n - 64 smtp -o smtp_bind_address=209.73.152.123 -o smtp_helo_name=mail01-t123.raystedman.org -o syslog_name=t123 t124 unix - - n - 64 smtp -o smtp_bind_address=209.73.152.124 -o smtp_helo_name=mail01-t124.raystedman.org -o syslog_name=t124 [root@mail01 postfix]# ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
> > If the delay is with sending or receiving RSET, then the SMTP client > log "conversation with XXX timed out". I don't know if that has a > queue ID logged with that, though. Just grep for 'conversation with'. [root@mail01 postfix]# journalctl -u postfix.service | grep 'conversation with' returns null. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Wietse Venema via Postfix-users: > Greg Sims via Postfix-users: > > May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]: > > 604BE30A4ACA: to=<@gmail.com>, > > relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2, > > delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250 > > 2.0.0 OK 1716372802 41be03b00d2f7-6578166a383si2117154a12.283 - > > gsmtp) > > These 25s are unexpected. > > - When Postfix saves/reuses a connection it does not rmember the > time spent setting up the connection. Technically, there is no > msg_stats_print() call in the code path that saves a connection to > the connection cache, and there is no msg_stats_scan() call in the > code path that retrieves a connection from cache. > > That means it took the Postfix SMTP client 25s to: > > - Receive a delivery request from the queue manager, > > - Retrieve a connection from the connection cache, > > - Send the RSET command to the remote SMTP server, > > - Retrieve the RSET server response. > > That is, the clock for 'connection setup' stops when the Postfix > SMTP client builds the MAIL FROM command. This is also when he > clock is started for the time spent delivering the message. If the delay is with snding or receiving RSET, then the SMTP client log "conversation with XXX timed out". I don't know if hat has a queue ID logged with thata, though. Just grep for 'conversation with'. Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
> It is assumed that you're not a victim of systemd-journald log mangling. > It may be dropping some messages, and recording others out of order, > breaking "collate". On Linux systems where systemd is doing the > logging, you'll want to have Postfix writing its own log files directly, > bypassing syslog. I have: > > main.cf: > maillog_file = /var/log/postfix/log > > $ ls -ld /var/log/postfix{,/*} > drwxrwxr-x. 2 root postdrop 47 May 22 23:40 /var/log/postfix > -rw--- 1 root root 226 May 22 23:40 /var/log/postfix/log > -rw--- 1 root postdrop 25747944 May 22 23:40 > /var/log/postfix/log.20240522-234048.gz > We are collecting the log files as input with: journalctl -u postfix.service | grep 'May 22' >0522.log It seems that this may be the issue with "collate". I will add logging in main.cf. We will be sending our daily devotion email in Chinese starting at 12:30 pacific time. I hope to have "collate" data soon afterwards. Best, Greg ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Greg Sims via Postfix-users: > May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]: > 604BE30A4ACA: to=<@gmail.com>, > relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2, > delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250 > 2.0.0 OK 1716372802 41be03b00d2f7-6578166a383si2117154a12.283 - > gsmtp) These 25s are unexpected. - When Postfix saves/reuses a connection it does not rmember the time spent setting up the connection. Technically, there is no msg_stats_print() call in the code path that saves a connection to the connection cache, and there is no msg_stats_scan() call in the code path that retrieves a connection from cache. That means it took the Postfix SMTP client 25s to: - Receive a delivery request from the queue manager, - Retrieve a connection from the connection cache, - Send the RSET command to the remote SMTP server, - Retrieve the RSET server response. That is, the clock for 'connection setup' stops when the Postfix SMTP client builds the MAIL FROM command. This is also when he clock is started for the time spent delivering the message. > Another 25 seconds by google without conn_use: > > May 22 03:12:57 mail01.raystedman.org t121/smtp[47173]: > 4224D30C1817: to=<@gmail.com>, > relay=gmail-smtp-in.l.google.com[142.250.141.27]:25, delay=1543, > delays=0.01/1517/25/0.79, dsn=2.0.0, status=sent (250 2.0.0 OK > 1716372777 41be03b00d2f7-66d4526311dsi4920983a12.891 - gsmtp) Yup. Wiets ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Wed, May 22, 2024 at 08:15:41AM -0500, Greg Sims via Postfix-users wrote: > I am having problems with "collate". I greped a 10 minute portion of > our mail.log which created a 6.8M file. I ran "collate" on this file > and collected the output -- a 796M file. I looked at the file and it > seems to be filled with records like the following: Collate produces blank-line-terminated records, one per input message, showing all the work associated with delivering that message to all its recipients. If there are any partially processed messages in the input, the corresponding "incomplete" records are output last. Here's a typical example: May 22 21:12:44 amnesiac postfix/pickup[3455088]: B46358DF27A: uid=1000 from= May 22 21:12:44 amnesiac postfix/cleanup[3455782]: B46358DF27A: message-id= May 22 21:12:44 amnesiac postfix/qmgr[2546418]: B46358DF27A: from=, size=2652, nrcpt=1 (queue active) May 22 21:12:48 amnesiac postfix/smtp[3455784]: Verified TLS connection established to list.sys4.de[188.68.34.52]:25: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384 May 22 21:12:49 amnesiac postfix/smtp[3455784]: B46358DF27A: to=, relay=list.sys4.de[188.68.34.52]:25, delay=4.9, delays=0.08/0.01/4/0.74, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 4VkpYG01YKzybg) May 22 21:12:49 amnesiac postfix/qmgr[2546418]: B46358DF27A: removed The "log level 1" TLS diagnostic message is included, even though it does not list the queue-id, it was matched by process name and pid. It is assumed that you're not a victim of systemd-journald log mangling. It may be dropping some messages, and recording others out of order, breaking "collate". On Linux systems where systemd is doing the logging, you'll want to have Postfix writing its own log files directly, bypassing syslog. I have: main.cf: maillog_file = /var/log/postfix/log $ ls -ld /var/log/postfix{,/*} drwxrwxr-x. 2 root postdrop 47 May 22 23:40 /var/log/postfix -rw--- 1 root root 226 May 22 23:40 /var/log/postfix/log -rw--- 1 root postdrop 25747944 May 22 23:40 /var/log/postfix/log.20240522-234048.gz See https://www.postfix.org/MAILLOG_README.html and note that rotation does not (yet) delete old files, that's up to your cron job. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
I am having problems with "collate". I greped a 10 minute portion of our mail.log which created a 6.8M file. I ran "collate" on this file and collected the output -- a 796M file. I looked at the file and it seems to be filled with records like the following: May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]: 122FC30C0DBA: client=web01-p[192.168.122.11] May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]: 47ECB30C0DBD: client=web01-p[192.168.122.11] May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]: 7BC8E30C178F: client=web01-p[192.168.122.11] I performed the following on the output file: [root@mail01 postfix]# cat 0522.02\:1x.log.collate | grep 122FC30C0DBA | wc 2816 19716 295764 This looks like 2,816 records of output for one email's smtpd record -- all with the same date/time stamp. I can see a few sequences like the following: May 22 02:10:00 mail01.raystedman.org postfix/qmgr[38428]: A01F430BFE03: from==att@devotion.raystedman.org>, size=34464, nrcpt=1 (queue active) May 22 02:10:03 mail01.raystedman.org postfix/qmgr[38428]: A01F430BFE03: removed Is this what you are looking for? Best, Greg ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
I have data collection homework to do -- and I will be happy to do it! Config data and "collate" is next after morning meetings. Here is some summary data by ISP from the logs: Email Ave Max Conn Relay SentDelayDelay Reused. google.com21,466 6681,5914,335 outlook.com 8,0568 2432,168 yahoodns.net 6,6178 381,135 prodigy.net 1,1267 420 icloud.com 6797 1980 comcast.net531 10 2240 This may supply some input on the amount of Connection Reuse we are seeing. I'm not sure why google.com is not reusing connections a higher percentage of time. We are always trying to send to gmail.com in the presence of the 25 second delay. Here is a 25 second instance by google with conn_use=2: [root@mail01 postfix]# journalctl -u postfix.service | grep 'May 22' | grep 604BE30A4ACA May 22 02:47:06 mail01.raystedman.org postfix/smtpd[45725]: 604BE30A4ACA: client=web01-p[192.168.122.11] May 22 02:47:06 mail01.raystedman.org postfix/cleanup[47234]: 604BE30A4ACA: message-id=. <8796a03d5ff2313c654830c3fb8cee03@swift.generated> May 22 02:47:06 mail01.raystedman.org postfix/qmgr[46605]: 604BE30A4ACA: from==gmail@devotion.raystedman.org>, size=34484, nrcpt=1 (queue active) May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]: 604BE30A4ACA: to=<@gmail.com>, relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2, delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250 2.0.0 OK 1716372802 41be03b00d2f7-6578166a383si2117154a12.283 - gsmtp) May 22 03:13:22 mail01.raystedman.org postfix/qmgr[46605]: 604BE30A4ACA: removed Another 25 seconds by google without conn_use: [root@mail01 postfix]# journalctl -u postfix.service | grep 'May 22' | grep 4224D30C1817 May 22 02:47:14 mail01.raystedman.org postfix/smtpd[45721]: 4224D30C1817: client=web01-p[192.168.122.11] May 22 02:47:14 mail01.raystedman.org postfix/cleanup[47236]: 4224D30C1817: message-id=<3b852728e8ba8e7e4a0e83c4641dd5c5@swift.generated> May 22 02:47:14 mail01.raystedman.org postfix/qmgr[46605]: 4224D30C1817: from==gmail@devotion.raystedman.org>, size=34488, nrcpt=1 (queue active) May 22 03:12:57 mail01.raystedman.org t121/smtp[47173]: 4224D30C1817: to=<@gmail.com>, relay=gmail-smtp-in.l.google.com[142.250.141.27]:25, delay=1543, delays=0.01/1517/25/0.79, dsn=2.0.0, status=sent (250 2.0.0 OK 1716372777 41be03b00d2f7-66d4526311dsi4920983a12.891 - gsmtp) May 22 03:12:57 mail01.raystedman.org postfix/qmgr[46605]: 4224D30C1817: removed ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Le 22/05/2024 à 12:35, Greg Sims via Postfix-users a écrit : Thank you again for your feedback on this issue. I watched the workload in real time this morning and now have more insight into what is happening. It appears the large ISPs are using TLS connection as a way to throttle incoming traffic. I looked at the inbound mail queue and found most of the traffic going to gmail.com. I believe this is because of the 20 & 25 seconds delays google.com is injecting into the TLS connection. I can also see no delays for the smaller ISPs with little traffic. Please see some data below. Best, Greg This is a sample of delays= for google.com -- 20 and 25 second delays: 0.01/11/20/0.73, 0.01/9.5/20/0.77, 0.01/0/25/0.74, 0.01/7.6/25/0.91, 0.01/6.9/25/1.1, 0.01/13/20/4.6, 0.01/14/25/0.56, 0.01/14/25/1.1, 0.01/0/0.22/0.72, 0.01/0/0.39/0.79, Here are delays= from yahoo -- 5.3 second delays: 0.01/0/10/4.1, 0.05/0/5.1/0.95, 0.01/0/5.2/0.68, 0.01/0/5.3/0.57, 0.01/0/0.45/0.42, 0.01/0/5.3/0.42, 0.01/0/5.3/0.34, 0.01/0.39/5.1/0.73, 0.01/0/1.2/2, 0.01/0/5.3/0.34, And from icloud -- little delay: 0.01/0/0.14/2.5, 0.01/0.02/0.15/2.7, 0.01/0/0.34/11, 0.01/0/0.13/4.9, 0.01/0/0.06/4.1, 0.01/0/0.13/1.5, Check all your public DNS records/config which concern you outboud server : PTR, A of your public IP, A and possibly PTR of your EHLO name if different. SPF/DKIM/DMARC record of the outgoing Mail From envelope/From header. Could be caused by bad records or DNS missconfiguration/blackhole or DNS interoperability issues when distant check your configuration. For the TCP session establishment check PTR/A/and A of the returned PTR if different first. But it could be a client issue too. So a complete hosts tcpdump or debug_peer_list session for selected test email (use a user defined specific test transport to enable debug_peer on a specific email address) would produce more interesting timing trace. Emmanuel.___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Greg Sims via Postfix-users: > This is a sample of delays= for google.com -- 20 and 25 second delays: > > 0.01/11/20/0.73, > 0.01/9.5/20/0.77, > 0.01/0/25/0.74, > 0.01/7.6/25/0.91, > 0.01/6.9/25/1.1, > 0.01/13/20/4.6, > 0.01/14/25/0.56, > 0.01/14/25/1.1, > 0.01/0/0.22/0.72, > 0.01/0/0.39/0.79, So there is some reuse. You still need to answer Viktor's request for detail logging. That will reveal (to us, at least) what the those 20-25 seconds are going into. It is DNS, TCP connection setup, TLS, and so on. > Here are delays= from yahoo -- 5.3 second delays: > > 0.01/0/10/4.1, > 0.05/0/5.1/0.95, > 0.01/0/5.2/0.68, > 0.01/0/5.3/0.57, > 0.01/0/0.45/0.42, > 0.01/0/5.3/0.42, > 0.01/0/5.3/0.34, > 0.01/0.39/5.1/0.73, > 0.01/0/1.2/2, > 0.01/0/5.3/0.34, Some reuse, mostlyinsane delays. > And from icloud -- little delay: > > 0.01/0/0.14/2.5, > 0.01/0.02/0.15/2.7, > 0.01/0/0.34/11, > 0.01/0/0.13/4.9, > 0.01/0/0.06/4.1, > 0.01/0/0.13/1.5, I wonder what portion of that are reused connections. Well, you have your data collection home work. Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Wed, May 22, 2024 at 05:35:25AM -0500, Greg Sims wrote: > Thank you again for your feedback on this issue. You're welcome, but I don't see anything in your reply that responds directly to my requests for more detailed configuration and log data. > I watched the workload in real time this morning and now have more > insight into what is happening. It appears the large ISPs are using > TLS connection as a way to throttle incoming traffic. That conclusion seems hasty. The delays could be for any number of reasons other than deliberate tarpitting, and it is not clear why such tarpitting would be TLS-specific. > I looked at the inbound mail queue and found most of the traffic going > to gmail.com. I believe this is because of the 20 & 25 seconds delays > google.com is injecting into the TLS connection. To properly understand what is happening, you should first run "collate" and see if that adds sufficient detail to pin down the delays, and if not, then "smtp -v", and look for possible "RSET" timeouts. Perhaps Gmail drops idle TLS connections faster than Postfix's 2s cache timeout? Did you by any chance tweak the smtp/scache connection cache timeouts? Longer timeouts risk the remote end closing the connection first, which then causes problems trying to reuse it, ... and so raising the idle timeouts can be rather counterproductive... I'd be surprised if Google's front-end load-balancers drop TLS sessions much faster than cleartext connections, ... but first you need to determine what's happening during those 20s, for which more detailed logging is needed, with "collate" being the simplest to try. You may need to tweak the code slightly to match the date format used by your log server. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Thank you again for your feedback on this issue. I watched the workload in real time this morning and now have more insight into what is happening. It appears the large ISPs are using TLS connection as a way to throttle incoming traffic. I looked at the inbound mail queue and found most of the traffic going to gmail.com. I believe this is because of the 20 & 25 seconds delays google.com is injecting into the TLS connection. I can also see no delays for the smaller ISPs with little traffic. Please see some data below. Best, Greg This is a sample of delays= for google.com -- 20 and 25 second delays: 0.01/11/20/0.73, 0.01/9.5/20/0.77, 0.01/0/25/0.74, 0.01/7.6/25/0.91, 0.01/6.9/25/1.1, 0.01/13/20/4.6, 0.01/14/25/0.56, 0.01/14/25/1.1, 0.01/0/0.22/0.72, 0.01/0/0.39/0.79, Here are delays= from yahoo -- 5.3 second delays: 0.01/0/10/4.1, 0.05/0/5.1/0.95, 0.01/0/5.2/0.68, 0.01/0/5.3/0.57, 0.01/0/0.45/0.42, 0.01/0/5.3/0.42, 0.01/0/5.3/0.34, 0.01/0.39/5.1/0.73, 0.01/0/1.2/2, 0.01/0/5.3/0.34, And from icloud -- little delay: 0.01/0/0.14/2.5, 0.01/0.02/0.15/2.7, 0.01/0/0.34/11, 0.01/0/0.13/4.9, 0.01/0/0.06/4.1, 0.01/0/0.13/1.5, ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Tue, May 21, 2024 at 08:31:51AM -0500, Greg Sims wrote: > Changes: > * certs back to defaults > * smtp_tls_loglevel = 1 Better. Now it is time to post a more detailed transcript of a single message (the sender and recipient addresses can be obfuscated if you wish, the recipient domain would ideally stay untouched). See: https://github.com/vdukhovni/postfix/tree/master/postfix/auxiliary/collate > Here is a set of delays from the logs: > > delays=0.01/2639/25/0.41 > delays=0.01/2639/25/0.58 > delays=0.01/2641/25/0.58 > delays=0.01/2644/25/0.69 > delays=0.01/2643/25/0.58 > delays=0.01/2640/25/0.57 As noted by Wietse that 25s "connection setup" time is the source of the problem, but it might represent more than one connection attempt, as it is the time from the beginning of the delivery attempt to the start of the mail transaction that delivered that envelope recipient, and so if there are multiple connections attempted (or DNS lookup issues, ...) this would include the associated latency. A complete message history (for one such message) from "collate" may show some of the associated activity. You should also check for any relevant related logs from "tlsproxy". If that's still not illuminating, it may be helpful to run at least one smtp(8)-based transport in verbose mode ("-v" flag in master.cf), and to see all the activity that went on behind the scenes. Also, do you have a TLS client session cache configured? I have: smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache this can substantially speed up TLS handshakes with the remote servers, and perhaps also avoid running into potential (speculative) TLS session creation rate limits. Have you changed any of the SMTP client timeouts? DNS aside, the only significant timeout that is south of 20s is: smtp_rset_timeout = 20s which is relevant because the first command to sanity-check a cached connection is "RSET", which is issued just before starting the new mail transaction. If for some reason the remote servers are tarpitting "RSET", that could be the source of the problem. This is perhaps a good time to ask you for your full configuration, not just cherry-picked individual settings. Please post the outputs of: $ postconf -nf $ postconf -Mf with all whitespace (including linebreaks) preserved. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Am 2024-05-22 01:22, schrieb Greg Sims via Postfix-users: TLS connection reuse is being used. About 10% of the connections are reused for large volume ISPs. Small volume ISPs do not see connection reuse. I believe this is as expected. I did some testing of our DNS setup. A DNS query using dig is less than 20 msec for both our primary and secondary dns servers in /etc/resolv.conf -- see below. If all else fails: The truth can often be seen on the wire. Make a packet trace which covers the time from "here is the mail you have to send to google" to a successful delivery and inspect it in wireshark. For TLS traffic you need the cert/key in wireshark. Do not only trace the smtp traffic, but all traffic. Inspect what the system is doing (e.g. DNS lookups) and correlate that to the traffic you see (you can change how timestamps are displayed in wireshark). This may indicate where those 25 seconds are spend. This is a steep learning curve if you are not familiar already with interpreting network packets, the smtp protocol, DNS, and wireshark. If those skills are already available, it may lead to detecting the cause of what you see faster than the back and forth with guesses here on the mailinglist. Bye, Alexander. -- http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF http://www.FreeBSD.orgnetch...@freebsd.org : PGP 0x8F31830F9F2772BF signature.asc Description: OpenPGP digital signature ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
TLS connection reuse is being used. About 10% of the connections are reused for large volume ISPs. Small volume ISPs do not see connection reuse. I believe this is as expected. I did some testing of our DNS setup. A DNS query using dig is less than 20 msec for both our primary and secondary dns servers in /etc/resolv.conf -- see below. Thanks, Greg [rsm@mail01 ~]$ cat /etc/resolv.conf # Generated by NetworkManager search raystedman.org nameserver 192.168.122.1 nameserver 8.8.4.4 [rsm@mail01 ~]$ dig @192.168.122.1 google.com ; <<>> DiG 9.11.36-RedHat-9.11.36-11.el8_9.1 <<>> @192.168.122.1 google.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45410 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;google.com. IN A ;; ANSWER SECTION: google.com. 172 IN A 142.250.191.78 ;; Query time: 1 msec <<< ;; SERVER: 192.168.122.1#53(192.168.122.1) ;; WHEN: Tue May 21 15:56:06 PDT 2024 ;; MSG SIZE rcvd: 55 [rsm@mail01 ~]$ dig @8.8.4.4 google.com ; <<>> DiG 9.11.36-RedHat-9.11.36-11.el8_9.1 <<>> @8.8.4.4 google.com ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35863 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 512 ;; QUESTION SECTION: ;google.com. IN A ;; ANSWER SECTION: google.com. 300 IN A 142.250.191.46 ;; Query time: 16 msec ;; SERVER: 8.8.4.4#53(8.8.4.4) ;; WHEN: Tue May 21 15:57:47 PDT 2024 ;; MSG SIZE rcvd: 55 ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Jaroslaw Rafa via Postfix-users: > Dnia 21.05.2024 o godz. 16:38:21 Wietse Venema via Postfix-users pisze: > > > delays=0.01/2639/25/0.41 > > > delays=0.01/2639/25/0.58 > > > delays=0.01/2641/25/0.58 > > > delays=0.01/2644/25/0.69 > > > delays=0.01/2643/25/0.58 > > > delays=0.01/2640/25/0.57 > [...] > > c=25s. It takes a whopping 25 econds for connection setup, including DNS, > > EHLO, and STARTTLS. And if that is not bad enough already, > > your connectuions also are not reysed. Reusing a connection would > > reduce the 25s dramatically. > > I wonder why it is *always* 25 seconds, while the other delays slightly > vary, which would be normally expected. Is it possible that something in > your sending pipeline is introducing an artificial delay when setting up the > connection, for example something tries (unsuccessfully) to connect to > somewhere and only *after* the attempt times out, Postfix continues with > connection setup? If timeout in question is 25 seconds, this would explain > why the value is so constant here... DNS queries time out after 5s, so that could be a reason, oo (bad server(s) in /etc/resolv.conf). Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Dnia 21.05.2024 o godz. 16:38:21 Wietse Venema via Postfix-users pisze: > > delays=0.01/2639/25/0.41 > > delays=0.01/2639/25/0.58 > > delays=0.01/2641/25/0.58 > > delays=0.01/2644/25/0.69 > > delays=0.01/2643/25/0.58 > > delays=0.01/2640/25/0.57 [...] > c=25s. It takes a whopping 25 econds for connection setup, including DNS, > EHLO, and STARTTLS. And if that is not bad enough already, > your connectuions also are not reysed. Reusing a connection would > reduce the 25s dramatically. I wonder why it is *always* 25 seconds, while the other delays slightly vary, which would be normally expected. Is it possible that something in your sending pipeline is introducing an artificial delay when setting up the connection, for example something tries (unsuccessfully) to connect to somewhere and only *after* the attempt times out, Postfix continues with connection setup? If timeout in question is 25 seconds, this would explain why the value is so constant here... -- Regards, Jaroslaw Rafa r...@rafa.eu.org -- "In a million years, when kids go to school, they're gonna know: once there was a Hushpuppy, and she lived with her daddy in the Bathtub." ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Greg Sims via Postfix-users: > Thank you Viktor. > > Answers: > * smtp_connection_cache_on_demand = yes -- this was configured > > Changes: > * certs back to defaults > * smtp_tls_loglevel = 1 > > Before enabling TLS our send rate was about 4K emails per minute -- we > are now seeing 300 to 500 per minute. > > The email creation process is sending new mail via a private network. > We are in the middle of an ip address migration so -- email creation > is currently single threaded with 100 uSec delays between emails. I > can increase the concurrency/speed of the email creation process(es) > -- I fear it would only increase the size of the inbound email queue. > > Here is a set of delays from the logs: > > delays=0.01/2639/25/0.41 > delays=0.01/2639/25/0.58 > delays=0.01/2641/25/0.58 > delays=0.01/2644/25/0.69 > delays=0.01/2643/25/0.58 > delays=0.01/2640/25/0.57 > > I scanned a large section of the logs | grep status=sent. These > delays are consistent throughout the peak demand period. I'm the person who implemented the delays=a/b/c/d breakdown. See https://www.postfix.org/postconf.5.html#delay_logging_resolution_limit In your case: a=0.01s. You have negligible delay between between Postfix reciving a message into the incoming queue, and the queue manager moving it onoto the active queue (i.e. scheduled by the queue manager for delivery). b=2640 Messages are waiting a long timr for their turn to be delivered. This is a lot of time because... c=25s. It takes a whopping 25 econds for connection setup, including DNS, EHLO, and STARTTLS. And if that is not bad enough already, your connectuions also are not reysed. Reusing a connection would reduce the 25s dramatically. d=.05s, the time to actually deliver the message. Wietse ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
Thank you Viktor. Answers: * smtp_connection_cache_on_demand = yes -- this was configured Changes: * certs back to defaults * smtp_tls_loglevel = 1 Before enabling TLS our send rate was about 4K emails per minute -- we are now seeing 300 to 500 per minute. The email creation process is sending new mail via a private network. We are in the middle of an ip address migration so -- email creation is currently single threaded with 100 uSec delays between emails. I can increase the concurrency/speed of the email creation process(es) -- I fear it would only increase the size of the inbound email queue. Here is a set of delays from the logs: delays=0.01/2639/25/0.41 delays=0.01/2639/25/0.58 delays=0.01/2641/25/0.58 delays=0.01/2644/25/0.69 delays=0.01/2643/25/0.58 delays=0.01/2640/25/0.57 I scanned a large section of the logs | grep status=sent. These delays are consistent throughout the peak demand period. Best, Greg On Tue, May 21, 2024 at 7:12 AM Viktor Dukhovni via Postfix-users wrote: > > On Tue, May 21, 2024 at 06:51:08AM -0500, Greg Sims via Postfix-users wrote: > > > Our main.cf contains: > > smtpd_tls_cert_file = > > smtpd_tls_key_file = > > smtpd_tls_security_level = none > > There's no point in configuring SMTP server certificates when TLS is > disabled in the SMTP server. If the idea is to make tlsproxy(8) > "happy", then try: "smtpd_tls_cert_file = none" and leave the key file > setting at its default empty value. > > > smtp_tls_security_level = may > > Fine. > > > We have used connection caching in the past so we added: > > * smtp_tls_connection_reuse=yes > > Presumably, you also have "smtp_connection_cache_on_demand = yes". > > > * uncommented tlsproxy / maxproc = 0 in master.cf > > * commented smtp_tls_loglevel -- no logging as a result > > Level 1 is recommended when TLS is enabled. The additional logging > overhead is quite modest. > > > The outbound message transfer rate of our configuration is less than > > 500 emails per minute. I noted the following: > > > > * we have four entries in master.cf for smtp -- each has a > > unique ip address with maxproc=32; these are used with randmap{} in > > main.cf > > Fine. > > > * queuing of inbound email is high > > * the inbound email queue contains over 3,000 emails > > So that's ~6 minutes of backlog... Plausibly due to downstream rate > and/or concurrency limits > > > * email average delay is over 400 seconds and 1,100 > > seconds for google.com where most of the email is delivered > > If the backlog isn't growing, can you post the averages of the > delays=a/b/c/d components? > > > * htop shows > > * cpu utilization is low with Load average under 0.10 on > > four physical processors and 4GB memory / 500MB used > > CPU was unlikely to be the problem. > > > * only one tlsproxy process running -- the highest cpu > > utilization process most of the time > > Fine. More would be spawned if it got too busy to serve client requests > in a timely manner. > > > * 20 processes exist for each of our four ip address/smtp > > entries in master.cf with maxproc=32 > > > > I am concerned that the queuing of inbound email is caused by there > > only being one tlsproxy process. > > More plausibly the real issue is message delivery latency to the various > destinations. > > > maxproc=0 seems to allow for an unlimited number. We seem to have > > plenty of smtp processes as postfix is not starting more of them to > > reach the maxproc=32. > > You could configure separate tlsproxy(8) services for each of the > smtp(8) transports by overriding "tlsproxy_service_name" in master.cf, > that will give you multiple tlsproxy(8) processes, but I guessing won't > change much, if the issue is downstream delays. > > -- > Viktor. > ___ > Postfix-users mailing list -- postfix-users@postfix.org > To unsubscribe send an email to postfix-users-le...@postfix.org ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org
[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy
On Tue, May 21, 2024 at 06:51:08AM -0500, Greg Sims via Postfix-users wrote: > Our main.cf contains: > smtpd_tls_cert_file = > smtpd_tls_key_file = > smtpd_tls_security_level = none There's no point in configuring SMTP server certificates when TLS is disabled in the SMTP server. If the idea is to make tlsproxy(8) "happy", then try: "smtpd_tls_cert_file = none" and leave the key file setting at its default empty value. > smtp_tls_security_level = may Fine. > We have used connection caching in the past so we added: > * smtp_tls_connection_reuse=yes Presumably, you also have "smtp_connection_cache_on_demand = yes". > * uncommented tlsproxy / maxproc = 0 in master.cf > * commented smtp_tls_loglevel -- no logging as a result Level 1 is recommended when TLS is enabled. The additional logging overhead is quite modest. > The outbound message transfer rate of our configuration is less than > 500 emails per minute. I noted the following: > > * we have four entries in master.cf for smtp -- each has a > unique ip address with maxproc=32; these are used with randmap{} in > main.cf Fine. > * queuing of inbound email is high > * the inbound email queue contains over 3,000 emails So that's ~6 minutes of backlog... Plausibly due to downstream rate and/or concurrency limits > * email average delay is over 400 seconds and 1,100 > seconds for google.com where most of the email is delivered If the backlog isn't growing, can you post the averages of the delays=a/b/c/d components? > * htop shows > * cpu utilization is low with Load average under 0.10 on > four physical processors and 4GB memory / 500MB used CPU was unlikely to be the problem. > * only one tlsproxy process running -- the highest cpu > utilization process most of the time Fine. More would be spawned if it got too busy to serve client requests in a timely manner. > * 20 processes exist for each of our four ip address/smtp > entries in master.cf with maxproc=32 > > I am concerned that the queuing of inbound email is caused by there > only being one tlsproxy process. More plausibly the real issue is message delivery latency to the various destinations. > maxproc=0 seems to allow for an unlimited number. We seem to have > plenty of smtp processes as postfix is not starting more of them to > reach the maxproc=32. You could configure separate tlsproxy(8) services for each of the smtp(8) transports by overriding "tlsproxy_service_name" in master.cf, that will give you multiple tlsproxy(8) processes, but I guessing won't change much, if the issue is downstream delays. -- Viktor. ___ Postfix-users mailing list -- postfix-users@postfix.org To unsubscribe send an email to postfix-users-le...@postfix.org