[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-26 Thread Wietse Venema via Postfix-users
This problem was resolved off-list.

Greg Sims:
> Wietse & Viktor,
> 
> All is not lost.  Restarting BIND on Ray08 solved the problem of
> c=30!!  I am sorry that I did not review/restart this service earlier.
> Your comments related to the 5 second intervals and DNS timeouts
> caused me to look there.

Unfortunately we went down some dead ends because I missed that
the Postfix SMTP client will make DNS MX and address queries even
when reusing connections. I am updating Postfix documentation.

With the crippled DNS resolver, the Postfix 'type c' delays were
dominated by the number of DNS lookups. This resulted in larger
delays for domains with many MX records such as gmail.com, compared
to domains with fewer MX records such as yahoo.com or outlook. These
differences in delays were not the result of receiver-side throttling.

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-23 Thread Viktor Dukhovni via Postfix-users
On Thu, May 23, 2024 at 05:48:29PM -0400, Wietse Venema via Postfix-users wrote:

> Greg Sims via Postfix-users:
> > We see conn_use about 24% of the time:
> 
> But none of the sessions shown in your message have that.
> 
> Do they also have multiple-of-5-second type 'c' delays?

Indeed those multiples of 5s sure feel like DNS lookup delays.  Unclear
why this would correlated with enabling TLS connection reuse, I'd
conjecture that's not the case, and that similar delays would also be
seen for non-TLS connections to the same destination.

It is also unclear whether the delays are the OP's systems looking up
Google's addresses, or Google's systems looking up the OP's reverse and
forward DNS, though when I query Quad8 DNS for the OP's MTA, the
response is prompt.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-23 Thread Wietse Venema via Postfix-users
Greg Sims via Postfix-users:
> We see conn_use about 24% of the time:

But none of the sessions shown in your message have that.

Do they also have multiple-of-5-second type 'c' delays?

Wietse 
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-23 Thread Greg Sims via Postfix-users
On Thu, May 23, 2024 at 7:07 AM Greg Sims  wrote:
>
> Thank you Viktor.  All recommended changes have been made. I hope to
> collect useful "collate" data with our next distribution at Noon today
> pacific.
>

Still having problems with the inbound smtpd from our private network
flooding "collate".  This could be caused by the process number always
being the same.

  May 23 11:30:07 mail01 postfix/smtpd[12433]: E050630F1BE6:
client=web01-p[192.168.122.11]
  May 23 11:30:08 mail01 postfix/smtpd[12433]: 2245330F1BEA:
client=web01-p[192.168.122.11]
  May 23 11:30:08 mail01 postfix/smtpd[12433]: 5711630F1BEC:
client=web01-p[192.168.122.11]

  root@mail01 0523zh]# ll
  total 99M
  -rw-r--r-- 1 root root 2.6M May 23 12:45 0523zh.log
  -rw-r--r-- 1 root root  94M May 23 12:46 0523zh.log.collate
  -rw-r--r-- 1 root root 1.9M May 23 12:51 0523zh.log.collate.no_priv

I removed the smtpd records with grep -v to create the last file above
with no private ip addresses (no_priv).

One thing I noticed changing with the updates to main.cf and
master.cf.  I now see only "Untrusted TLS connection established".
Last night (before the changes) we had 40K Trusted and TLS_Anonymous:
35, TLS_Untrusted: 78.  Could this have something to do with deleting
the default Certs associated with TLS?

And now some "collate" data.

The run starts at 11:30 -- we already have a long delay of 10 seconds:

  May 23 11:30:07 mail01 postfix/smtpd[12438]: discarding EHLO
keywords: CHUNKING
  May 23 11:30:11 mail01 postfix/cleanup[12440]: F3B3630F1C5C:
message-id=<62994d8ff3eb719d8ddc674c2cae6b38@swift.generated>
  May 23 11:30:11 mail01 postfix/qmgr[11016]: F3B3630F1C5C:
from==hotmail@devotion.raystedman.org>,
size=50829, nrcpt=1 (queue active)
  May 23 11:30:21 mail01 postfix/t124/smtp[12472]: Untrusted TLS
connection established to
hotmail-com.olc.protection.outlook.com[52.101.42.8]:25: TLSv1.3 with
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE
(P-384) server-signature RSA-PSS (2048 bits) server-digest SHA256
  May 23 11:30:21 mail01 postfix/t124/smtp[12472]: F3B3630F1C5C:
to=<@hotmail.com>,
relay=hotmail-com.olc.protection.outlook.com[52.101.42.8]:25,
delay=11, delays=0.01/0.01/10/0.38, dsn=2.6.0, status=sent (250 2.6.0
<62994d8ff3eb719d8ddc674c2cae6b38@swift.generated>
[InternalId=21569325775951,
Hostname=CO3P220MB0782.NAMP220.PROD.OUTLOOK.COM] 60415 bytes in 0.104,
566.038 KB/sec Queued mail for delivery -> 250 2.1.5)
  May 23 11:30:21 mail01 postfix/qmgr[11016]: F3B3630F1C5C: removed

Now 20 seconds:

  May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO
keywords: CHUNKING
  May 23 11:30:09 mail01 postfix/cleanup[12441]: 3634330F1BF4:
message-id=<826855425363d60cc81d5b8f49e83579@swift.generated>
  May 23 11:30:09 mail01 postfix/qmgr[11016]: 3634330F1BF4:
from==yahoo@devotion.raystedman.org>,
size=50821, nrcpt=1 (queue active)
  May 23 11:30:29 mail01 postfix/t123/smtp[12455]: Untrusted TLS
connection established to mta6.am0.yahoodns.net[98.136.96.75]:25:
TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange
X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
  May 23 11:30:30 mail01 postfix/t123/smtp[12455]: 3634330F1BF4:
to=<@yahoo.com>, relay=mta6.am0.yahoodns.net[98.136.96.75]:25,
delay=21, delays=0.01/0.01/20/0.54, dsn=2.0.0, status=sent (250 ok
dirdel)
  May 23 11:30:30 mail01 postfix/qmgr[11016]: 3634330F1BF4: removed

Towards the end of the run -- 30 seconds:

  May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO
keywords: CHUNKING
  May 23 11:33:49 mail01 postfix/cleanup[12690]: BFB3B30F279B:
message-id=<6e7c80dbfbec093a18061cdca1ae4b9c@swift.generated>
  May 23 11:33:49 mail01 postfix/qmgr[11016]: BFB3B30F279B:
from==gmail@devotion.raystedman.org>,
size=50821, nrcpt=1 (queue active).
  May 23 11:35:28 mail01 postfix/t121/smtp[12668]: Untrusted TLS
connection established to gmail-smtp-in.l.google.com[142.251.2.27]:25:
TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange
X25519 server-signature ECDSA (P-256) server-digest SHA256
  May 23 11:35:29 mail01 postfix/t121/smtp[12668]: BFB3B30F279B:
to=<@gmail.com>,
relay=gmail-smtp-in.l.google.com[142.251.2.27]:25, delay=100,
delays=0.01/69/30/0.8, dsn=2.0.0, status=sent (250 2.0.0 OK
1716489329 d2e1a72fcca58-6f4d2b41c18si755622b3a.334 - gsmtp)
  May 23 11:35:29 mail01 postfix/qmgr[11016]: BFB3B30F279B: removed

Max connections from hotmail:

  May 23 11:30:07 mail01 postfix/smtpd[12433]: discarding EHLO
keywords: CHUNKING
  May 23 11:32:45 mail01 postfix/cleanup[12681]: 9433130F50E9:
message-id=<6ace7587b9302b23a29b052a5503afb9@swift.generated>
  May 23 11:32:45 mail01 postfix/qmgr[11016]: 9433130F50E9:
from==hotmail@devotion.raystedman.org>,
size=50833, nrcpt=1 (queue active)
  May 23 11:32:45 mail01 postfix/t121/smtp[12559]: Untrusted TLS
connection established to
hotmail-com.olc.protection.outlook.com[52.101.11.16]:25: TLSv1.3 with
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) 

[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-23 Thread Greg Sims via Postfix-users
Thank you Viktor.  All recommended changes have been made. I hope to
collect useful "collate" data with our next distribution at Noon today
pacific.

I hope you have a great day!  Greg

>   [root@mail01 postfix]# postconf -nf
>
>   [root@mail01 postfix]# postconf -Mf
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Viktor Dukhovni via Postfix-users
On Wed, May 22, 2024 at 12:19:03PM -0500, Greg Sims wrote:

>   [root@mail01 postfix]# postconf -nf
>   maximal_backoff_time = 16m
>   minimal_backoff_time = 2m
>   queue_run_delay = 2m

FWIW (not related to your immediate issue) I would not recommend such a
short maximal backoff, you're potentially filling the active queue with
messages that will again tempfail , and perhaps take a long time to do
so.  I typically go with:

minimal_backoff_time = 225s
maximal_backoff_time = 7200s
queue_run_delay = 225s

>   sender_dependent_default_transport_maps =
>   regexp:/etc/postfix/sender_relay.regexp, randmap:{t121,t122,t123,t124}

See below...


>   smtp_tls_security_level = may
>   smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt
>   smtp_tls_CApath = /etc/pki/tls/certs

The CAfile and CApath are not needed (just waste memory) until you set
something other than "may" as the security level for at least some
destinations.

>   smtpd_tls_cert_file = /etc/pki/tls/certs/postfix.pem
>   smtpd_tls_key_file = /etc/pki/tls/private/postfix.key
>   smtpd_tls_security_level = none

The certificates are not needed while the security level is "none".

>   [root@mail01 postfix]# postconf -Mf
> ...
>   t121   unix  -   -   n   -   64  smtp
>   -o smtp_bind_address=209.73.152.121
>   -o smtp_helo_name=mail01-t121.raystedman.org
>   -o syslog_name=t121
>   t122   unix  -   -   n   -   64  smtp
>   -o smtp_bind_address=209.73.152.122
>   -o smtp_helo_name=mail01-t122.raystedman.org
>   -o syslog_name=t122
>   t123   unix  -   -   n   -   64  smtp
>   -o smtp_bind_address=209.73.152.123
>   -o smtp_helo_name=mail01-t123.raystedman.org
>   -o syslog_name=t123
>   t124   unix  -   -   n   -   64  smtp
>   -o smtp_bind_address=209.73.152.124
>   -o smtp_helo_name=mail01-t124.raystedman.org
>   -o syslog_name=t124

I think the reason that "collate" is not working is your chose of
"syslog_name", it does not match any of the expected forms:

postfix
postfix-instance
postfix/sometag
postfix-instance/sometag

The Perl RE in question is:

my $instre = qr{(?x)
\A  # Absolute line start
(?:\S+ \s+){3}  # Timestamp, adjust for other time formats
\S+ \s+ # Hostname
(postfix(?:-[^/\s]+)?)  # Capture instance name stopping before 
first '/'
(?:/\S+)*   # Optional non-captured '/'-delimited 
qualifiers
/   # Final '/' before the daemon program name
};


On Wed, May 22, 2024 at 02:10:48PM -0500, Greg Sims wrote:

> Postfix is now configured for logging at /var/log/postfix.
>   * added to main.cf:
>   maillog_file = /var/log/postfix/0522zh.log
>   * added to master.cf
>postlog   unix-dgram n  -   n   -   1   postlogd

Good.

> I then:
> 
>   [root@mail01 0522zh]# perl ../collate.pl 0522zh.log >0522zh.log.collate
>   [root@mail01 0522zh]# ll
>   total 95M
>   -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log
>   -rw-r--r-- 1 root root  92M May 22 11:55 0522zh.log.collate
> 
> The output of collate does not appear to be useful -- it is the same
> as when we are logging to memory.

It fails to matcht he delivery agent log entries.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Wietse Venema via Postfix-users
Greg Sims via Postfix-users:
> > It is assumed that you're not a victim of systemd-journald log mangling.
> > It may be dropping some messages, and recording others out of order,
> > breaking "collate".  On Linux systems where systemd is doing the
> > logging, you'll want to have Postfix writing its own log files directly,
> > bypassing syslog.  I have:

I think that the nost useful info is logged by the Postfix SMTP
client, especially including the low-level connection info that may
not have a queue ID.

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Steffen Nurpmeso via Postfix-users
Alexander Leidinger via Postfix-users wrote in
 :
 |Am 2024-05-22 01:22, schrieb Greg Sims via Postfix-users:
 |> TLS connection reuse is being used.  About 10% of the connections are
 |> reused for large volume ISPs. Small volume ISPs do not see connection
 |> reuse.  I believe this is as expected.
 |> 
 |> I did some testing of our DNS setup.  A DNS query using dig is less
 |> than 20 msec for both our primary and secondary dns servers in
 |> /etc/resolv.conf -- see below.
 |
 |If all else fails:
 |The truth can often be seen on the wire. Make a packet trace which 
 |covers the time from "here is the mail you have to send to google" to a 
 |successful delivery and inspect it in wireshark. For TLS traffic you 
 |need the cert/key in wireshark. Do not only trace the smtp traffic, but 
 |all traffic. Inspect what the system is doing (e.g. DNS lookups) and 
 |correlate that to the traffic you see (you can change how timestamps are 
 |displayed in wireshark). This may indicate where those 25 seconds are 
 |spend.
 |
 |This is a steep learning curve if you are not familiar already with 
 |interpreting network packets, the smtp protocol, DNS, and wireshark. If 
 |those skills are already available, it may lead to detecting the cause 
 |of what you see faster than the back and forth with guesses here on the 
 |mailinglist.

I have not closely followed the thread, but if it is about "hangs"
or "delays" on startup, and here "startup" is a wide field given
that postfix is biologically recreating itself anew out of its own
"all the time", it might be worth .. that:

  [.]calls the function
   uname(2)∞ in order to query the nodename of the box (sometimes the real
   one is needed instead of the one represented by the internal variable
   hostname†).  One may have varying success by ensuring that the real host‐
   name and ‘localhost’ have entries in /etc/hosts, or, more generally, that
   the name service is properly setup – and does hostname(1)∞ return the ex‐
   pected value?  Does this local hostname have a domain suffix?  RFC 6762
   standardized the link-local top-level domain ‘.local’, try again after
   adding an (additional) entry with this extension.

(postfix uses gethostname, but it could still end up uname.)
I myself have further problems with ssh(1) and my VMs, because
that is all (boxed and) DHCP via dnsmasq, and that does not
realize that despite the IPv4 it gives a host authoritatively the
same hostname should not cause IPv6 queries over the network (ie
i occasionally forget to do use the -4 flag, but i reported that
years ago as i think it should not be needed since we have
authoritatively served a name, no, and despite that say nothing,
i have "Host 203.0.113.*" in ssh config, but somehow it skips
that for some, uses the stuff from "Host *", whatever *that* is.)

Just an idea,
Ciao,

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
> It is assumed that you're not a victim of systemd-journald log mangling.
> It may be dropping some messages, and recording others out of order,
> breaking "collate".  On Linux systems where systemd is doing the
> logging, you'll want to have Postfix writing its own log files directly,
> bypassing syslog.  I have:
>
> main.cf:
> maillog_file = /var/log/postfix/log
>
> $ ls -ld /var/log/postfix{,/*}
> drwxrwxr-x. 2 root postdrop   47 May 22 23:40 /var/log/postfix
> -rw---  1 root root  226 May 22 23:40 /var/log/postfix/log
> -rw---  1 root postdrop 25747944 May 22 23:40 
> /var/log/postfix/log.20240522-234048.gz
>
> See https://www.postfix.org/MAILLOG_README.html and note that rotation
> does not (yet) delete old files, that's up to your cron job.

Postfix is now configured for logging at /var/log/postfix.
  * added to main.cf:
  maillog_file = /var/log/postfix/0522zh.log
  * added to master.cf
   postlog   unix-dgram n  -   n   -   1   postlogd

I then:

  [root@mail01 0522zh]# perl ../collate.pl 0522zh.log >0522zh.log.collate
  [root@mail01 0522zh]# ll
  total 95M
  -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log
  -rw-r--r-- 1 root root  92M May 22 11:55 0522zh.log.collate

The output of collate does not appear to be useful -- it is the same
as when we are logging to memory.

I then ran:

  [root@mail01 0522zh]# cat 0522zh.log.collate | grep -v
'192.168.122.11' >0522zh.log.collate.no_priv
  [root@mail01 0522zh]# ll
  total 96M
  -rw-r--r-- 1 root root 2.5M May 22 11:38 0522zh.log
  -rw-r--r-- 1 root root  92M May 22 11:55 0522zh.log.collate
   # huge file size once again
  -rw-r--r-- 1 root root 865K May 22 12:03 0522zh.log.collate.no_priv.
  # reasonable file size after last grep

I scanned the last file.  All I see is sequences like this:

  May 22 11:30:03 mail01 postfix/smtpd[3177]: discarding EHLO keywords: CHUNKING
  May 22 11:31:22 mail01 postfix/cleanup[3357]: 5592A30F1CB9:
message-id=<9d907dff3c240b43a5d92c8b41fcebfa@swift.generated>
  May 22 11:31:22 mail01 postfix/qmgr[2563]: 5592A30F1CB9:
from==gmail@devotion.raystedman.org>,
size=45131, nrcpt=1 (queue active)
  May 22 11:31:43 mail01 postfix/qmgr[2563]: 5592A30F1CB9: removed

  May 22 11:30:03 mail01 postfix/smtpd[3177]: discarding EHLO keywords: CHUNKING
  May 22 11:31:22 mail01 postfix/cleanup[3357]: 93D9D30F1E10:
message-id=<2e60eb7be562876e35be104949b788c3@swift.generated>
  May 22 11:31:22 mail01 postfix/qmgr[2563]: 93D9D30F1E10:
from==gmail@devotion.raystedman.org>,
size=45151, nrcpt=1 (queue active)
  May 22 11:31:43 mail01 postfix/qmgr[2563]: 93D9D30F1E10: removed

This is not the useful data you were looking for.  I can not get this
to work.  I would be glad to give you the log file offline.

Best, Greg
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
> This is perhaps a good time to ask you for your full configuration,
> not just cherry-picked individual settings.  Please post the outputs of:
>
> $ postconf -nf
> $ postconf -Mf
>
> with all whitespace (including linebreaks) preserved.

  [root@mail01 postfix]# postconf -nf
  alias_database = hash:/etc/aliases
  alias_maps = hash:/etc/aliases
  command_directory = /usr/sbin
  compatibility_level = 2
  daemon_directory = /usr/libexec/postfix
  data_directory = /var/lib/postfix
  debug_peer_level = 2
  debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin ddd
  $daemon_directory/$process_name $process_id & sleep 5
  html_directory = no
  inet_interfaces = all
  inet_protocols = ipv4
  mail_owner = postfix
  maillog_file = /var/log/postfix/0522zh.log
  mailq_path = /usr/bin/mailq.postfix
  manpage_directory = /usr/share/man
  maximal_backoff_time = 16m
  meta_directory = /etc/postfix
  minimal_backoff_time = 2m
  mydestination =
  mynetworks = 192.168.122.0/24, 127.0.0.0/8
  myorigin = $mydomain
  newaliases_path = /usr/bin/newaliases.postfix
  queue_directory = /var/spool/postfix
  queue_run_delay = 2m
  readme_directory = /usr/share/doc/postfix/README_FILES
  sample_directory = /usr/share/doc/postfix/samples
  sender_dependent_default_transport_maps =
  regexp:/etc/postfix/sender_relay.regexp, randmap:{t121,t122,t123,t124}
  sendmail_path = /usr/sbin/sendmail.postfix
  setgid_group = postdrop
  shlib_directory = /usr/lib64/postfix
  smtp_connection_cache_on_demand = yes
  smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt
  smtp_tls_CApath = /etc/pki/tls/certs
  smtp_tls_connection_reuse = yes
  smtp_tls_loglevel = 1
  smtp_tls_security_level = may
  smtpd_tls_cert_file = /etc/pki/tls/certs/postfix.pem
  smtpd_tls_key_file = /etc/pki/tls/private/postfix.key
  smtpd_tls_security_level = none
  unknown_local_recipient_reject_code = 550
  [root@mail01 postfix]#
  [root@mail01 postfix]#
  [root@mail01 postfix]#
  [root@mail01 postfix]# postconf -Mf
  postlogunix-dgram n  -   n   -   1   postlogd
  smtp   inet  n   -   n   -   -   smtpd
  tlsproxy   unix  -   -   n   -   0   tlsproxy
  pickup unix  n   -   n   60  1   pickup
  cleanupunix  n   -   n   -   0   cleanup
  qmgr   unix  n   -   n   300 1   qmgr
  tlsmgr unix  -   -   n   1000?   1   tlsmgr
  rewriteunix  -   -   n   -   -   trivial-rewrite
  bounce unix  -   -   n   -   0   bounce
  defer  unix  -   -   n   -   0   bounce
  trace  unix  -   -   n   -   0   bounce
  verify unix  -   -   n   -   1   verify
  flush  unix  n   -   n   1000?   0   flush
  proxymap   unix  -   -   n   -   -   proxymap
  proxywrite unix  -   -   n   -   1   proxymap
  smtp   unix  -   -   n   -   -   smtp
  relay  unix  -   -   n   -   -   smtp
  -o syslog_name=postfix/$service_name
  showq  unix  n   -   n   -   -   showq
  error  unix  -   -   n   -   -   error
  retry  unix  -   -   n   -   -   error
  discardunix  -   -   n   -   -   discard
  local  unix  -   n   n   -   -   local
  virtualunix  -   n   n   -   -   virtual
  lmtp   unix  -   -   n   -   -   lmtp
  anvil  unix  -   -   n   -   1   anvil
  scache unix  -   -   n   -   1   scache
  rsmunix  -   -   n   -   -   smtp
  -o smtp_bind_address=209.73.152.117
  t121   unix  -   -   n   -   64  smtp
  -o smtp_bind_address=209.73.152.121
  -o smtp_helo_name=mail01-t121.raystedman.org
  -o syslog_name=t121
  t122   unix  -   -   n   -   64  smtp
  -o smtp_bind_address=209.73.152.122
  -o smtp_helo_name=mail01-t122.raystedman.org
  -o syslog_name=t122
  t123   unix  -   -   n   -   64  smtp
  -o smtp_bind_address=209.73.152.123
  -o smtp_helo_name=mail01-t123.raystedman.org
  -o syslog_name=t123
  t124   unix  -   -   n   -   64  smtp
  -o smtp_bind_address=209.73.152.124
  -o smtp_helo_name=mail01-t124.raystedman.org
  -o syslog_name=t124
  [root@mail01 postfix]#
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
>
> If the delay is with sending or receiving RSET, then the SMTP client
> log "conversation with XXX timed out". I don't know if that has a
> queue ID logged with that, though. Just grep for 'conversation with'.

[root@mail01 postfix]# journalctl -u postfix.service | grep 'conversation with'

returns null.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Wietse Venema via Postfix-users
Wietse Venema via Postfix-users:
> Greg Sims via Postfix-users:
> >   May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]:
> > 604BE30A4ACA: to=<@gmail.com>,
> > relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2,
> > delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250
> > 2.0.0 OK  1716372802 41be03b00d2f7-6578166a383si2117154a12.283 -
> > gsmtp)
> 
> These 25s are unexpected.
> 
> - When Postfix saves/reuses a connection it does not rmember the
> time spent setting up the connection.  Technically, there is no
> msg_stats_print() call in the code path that saves a connection to
> the connection cache, and there is no msg_stats_scan() call in the
> code path that retrieves a connection from cache.
> 
> That means it took the Postfix SMTP client 25s to:
> 
> - Receive a delivery request from the queue manager,
> 
> - Retrieve a connection from the connection cache,
> 
> - Send the RSET command to the remote SMTP server,
> 
> - Retrieve the RSET server response.
> 
> That is, the clock for 'connection setup' stops when the Postfix
> SMTP client builds the MAIL FROM command. This is also when he 
> clock is started for the time spent delivering the message.

If the delay is with snding or receiving RSET, then the SMTP client
log "conversation with XXX timed out". I don't know if hat has a
queue ID logged with thata, though. Just grep for 'conversation with'.

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
> It is assumed that you're not a victim of systemd-journald log mangling.
> It may be dropping some messages, and recording others out of order,
> breaking "collate".  On Linux systems where systemd is doing the
> logging, you'll want to have Postfix writing its own log files directly,
> bypassing syslog.  I have:
>
> main.cf:
> maillog_file = /var/log/postfix/log
>
> $ ls -ld /var/log/postfix{,/*}
> drwxrwxr-x. 2 root postdrop   47 May 22 23:40 /var/log/postfix
> -rw---  1 root root  226 May 22 23:40 /var/log/postfix/log
> -rw---  1 root postdrop 25747944 May 22 23:40 
> /var/log/postfix/log.20240522-234048.gz
>

We are collecting the log files as input with:

  journalctl -u postfix.service | grep 'May 22' >0522.log

It seems that this may be the issue with "collate".  I will add
logging in main.cf.

We will be sending our daily devotion email in Chinese starting at
12:30 pacific time.  I hope to have "collate" data soon afterwards.

Best, Greg
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Wietse Venema via Postfix-users
Greg Sims via Postfix-users:
>   May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]:
> 604BE30A4ACA: to=<@gmail.com>,
> relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2,
> delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250
> 2.0.0 OK  1716372802 41be03b00d2f7-6578166a383si2117154a12.283 -
> gsmtp)

These 25s are unexpected.

- When Postfix saves/reuses a connection it does not rmember the
time spent setting up the connection.  Technically, there is no
msg_stats_print() call in the code path that saves a connection to
the connection cache, and there is no msg_stats_scan() call in the
code path that retrieves a connection from cache.

That means it took the Postfix SMTP client 25s to:

- Receive a delivery request from the queue manager,

- Retrieve a connection from the connection cache,

- Send the RSET command to the remote SMTP server,

- Retrieve the RSET server response.

That is, the clock for 'connection setup' stops when the Postfix
SMTP client builds the MAIL FROM command. This is also when he 
clock is started for the time spent delivering the message.

> Another 25 seconds by google without conn_use:
> 
>   May 22 03:12:57 mail01.raystedman.org t121/smtp[47173]:
> 4224D30C1817: to=<@gmail.com>,
> relay=gmail-smtp-in.l.google.com[142.250.141.27]:25, delay=1543,
> delays=0.01/1517/25/0.79, dsn=2.0.0, status=sent (250 2.0.0 OK
> 1716372777 41be03b00d2f7-66d4526311dsi4920983a12.891 - gsmtp)

Yup.

Wiets
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Viktor Dukhovni via Postfix-users
On Wed, May 22, 2024 at 08:15:41AM -0500, Greg Sims via Postfix-users wrote:

> I am having problems with "collate".  I greped a 10 minute portion of
> our mail.log which created a 6.8M file. I ran "collate" on this file
> and collected the output -- a 796M file.  I looked at the file and it
> seems to be filled with records like the following:

Collate produces blank-line-terminated records, one per input message,
showing all the work associated with delivering that message to all
its recipients.  If there are any partially processed messages in
the input, the corresponding "incomplete" records are output last.

Here's a typical example:

May 22 21:12:44 amnesiac postfix/pickup[3455088]: B46358DF27A: uid=1000 
from=
May 22 21:12:44 amnesiac postfix/cleanup[3455782]: B46358DF27A: 
message-id=
May 22 21:12:44 amnesiac postfix/qmgr[2546418]: B46358DF27A: 
from=, size=2652, nrcpt=1 (queue active)
May 22 21:12:48 amnesiac postfix/smtp[3455784]: Verified TLS connection 
established to list.sys4.de[188.68.34.52]:25: TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature 
ECDSA (secp384r1) server-digest SHA384
May 22 21:12:49 amnesiac postfix/smtp[3455784]: B46358DF27A: 
to=, relay=list.sys4.de[188.68.34.52]:25, delay=4.9, 
delays=0.08/0.01/4/0.74, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 
4VkpYG01YKzybg)
May 22 21:12:49 amnesiac postfix/qmgr[2546418]: B46358DF27A: removed

The "log level 1" TLS diagnostic message is included, even though it
does not list the queue-id, it was matched by process name and pid.

It is assumed that you're not a victim of systemd-journald log mangling.
It may be dropping some messages, and recording others out of order,
breaking "collate".  On Linux systems where systemd is doing the
logging, you'll want to have Postfix writing its own log files directly,
bypassing syslog.  I have:

main.cf:
maillog_file = /var/log/postfix/log

$ ls -ld /var/log/postfix{,/*}
drwxrwxr-x. 2 root postdrop   47 May 22 23:40 /var/log/postfix
-rw---  1 root root  226 May 22 23:40 /var/log/postfix/log
-rw---  1 root postdrop 25747944 May 22 23:40 
/var/log/postfix/log.20240522-234048.gz

See https://www.postfix.org/MAILLOG_README.html and note that rotation
does not (yet) delete old files, that's up to your cron job.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
I am having problems with "collate".  I greped a 10 minute portion of
our mail.log which created a 6.8M file. I ran "collate" on this file
and collected the output -- a 796M file.  I looked at the file and it
seems to be filled with records like the following:

  May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]:
122FC30C0DBA: client=web01-p[192.168.122.11]
  May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]:
47ECB30C0DBD: client=web01-p[192.168.122.11]
  May 22 02:10:00 mail01.raystedman.org postfix/smtpd[45725]:
7BC8E30C178F: client=web01-p[192.168.122.11]

I performed the following on the output file:

  [root@mail01 postfix]# cat 0522.02\:1x.log.collate | grep 122FC30C0DBA | wc
   2816   19716  295764

This looks like 2,816 records of output for one email's smtpd record
-- all with the same date/time stamp.

I can see a few sequences like the following:

  May 22 02:10:00 mail01.raystedman.org postfix/qmgr[38428]:
A01F430BFE03: from==att@devotion.raystedman.org>,
size=34464, nrcpt=1 (queue active)
  May 22 02:10:03 mail01.raystedman.org postfix/qmgr[38428]:
A01F430BFE03: removed

Is this what you are looking for?

Best, Greg
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
I have data collection homework to do -- and I will be happy to do it!
 Config data and "collate" is next after morning meetings.

Here is some summary data by ISP  from the logs:

  Email Ave  Max  Conn
  Relay   SentDelayDelay   Reused.
google.com21,466  6681,5914,335
outlook.com  8,0568  2432,168
yahoodns.net 6,6178   381,135
prodigy.net  1,1267   420
icloud.com 6797  1980
comcast.net531   10  2240

This may supply some input on the amount of Connection Reuse we are
seeing.  I'm not sure why google.com is not reusing connections a
higher percentage of time.  We are always trying to send to gmail.com
in the presence of the 25 second delay.

Here is a 25 second instance by google with conn_use=2:

  [root@mail01 postfix]# journalctl -u postfix.service | grep 'May 22'
| grep 604BE30A4ACA
  May 22 02:47:06 mail01.raystedman.org postfix/smtpd[45725]:
604BE30A4ACA: client=web01-p[192.168.122.11]
  May 22 02:47:06 mail01.raystedman.org postfix/cleanup[47234]:
604BE30A4ACA: message-id=.
<8796a03d5ff2313c654830c3fb8cee03@swift.generated>
  May 22 02:47:06 mail01.raystedman.org postfix/qmgr[46605]:
604BE30A4ACA: from==gmail@devotion.raystedman.org>, size=34484,
nrcpt=1 (queue active)
  May 22 03:13:22 mail01.raystedman.org t123/smtp[46725]:
604BE30A4ACA: to=<@gmail.com>,
relay=gmail-smtp-in.l.google.com[142.251.2.26]:25, conn_use=2,
delay=1576, delays=0.05/1550/25/0.84, dsn=2.0.0, status=sent (250
2.0.0 OK  1716372802 41be03b00d2f7-6578166a383si2117154a12.283 -
gsmtp)
  May 22 03:13:22 mail01.raystedman.org postfix/qmgr[46605]:
604BE30A4ACA: removed

Another 25 seconds by google without conn_use:

  [root@mail01 postfix]# journalctl -u postfix.service | grep 'May 22'
| grep 4224D30C1817
  May 22 02:47:14 mail01.raystedman.org postfix/smtpd[45721]:
4224D30C1817: client=web01-p[192.168.122.11]
  May 22 02:47:14 mail01.raystedman.org postfix/cleanup[47236]:
4224D30C1817: message-id=<3b852728e8ba8e7e4a0e83c4641dd5c5@swift.generated>
  May 22 02:47:14 mail01.raystedman.org postfix/qmgr[46605]:
4224D30C1817: from==gmail@devotion.raystedman.org>,
size=34488, nrcpt=1 (queue active)
  May 22 03:12:57 mail01.raystedman.org t121/smtp[47173]:
4224D30C1817: to=<@gmail.com>,
relay=gmail-smtp-in.l.google.com[142.250.141.27]:25, delay=1543,
delays=0.01/1517/25/0.79, dsn=2.0.0, status=sent (250 2.0.0 OK
1716372777 41be03b00d2f7-66d4526311dsi4920983a12.891 - gsmtp)
  May 22 03:12:57 mail01.raystedman.org postfix/qmgr[46605]:
4224D30C1817: removed
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Emmanuel Fusté via Postfix-users

Le 22/05/2024 à 12:35, Greg Sims via Postfix-users a écrit :

Thank you again for your feedback on this issue.

I watched the workload in real time this morning and now have more
insight into what is happening.  It appears the large ISPs are using
TLS connection as a way to throttle incoming traffic.  I looked at the
inbound mail queue and found most of the traffic going to gmail.com.
I believe this is because of the 20 & 25 seconds delays google.com is
injecting into the TLS connection.  I can also see no delays for the
smaller ISPs with little traffic.  Please see some data below.

Best, Greg

This is a sample of delays= for google.com -- 20 and 25 second delays:

   0.01/11/20/0.73,
   0.01/9.5/20/0.77,
   0.01/0/25/0.74,
   0.01/7.6/25/0.91,
   0.01/6.9/25/1.1,
   0.01/13/20/4.6,
   0.01/14/25/0.56,
   0.01/14/25/1.1,
   0.01/0/0.22/0.72,
   0.01/0/0.39/0.79,

Here are delays= from yahoo -- 5.3 second delays:

   0.01/0/10/4.1,
   0.05/0/5.1/0.95,
   0.01/0/5.2/0.68,
   0.01/0/5.3/0.57,
   0.01/0/0.45/0.42,
   0.01/0/5.3/0.42,
   0.01/0/5.3/0.34,
   0.01/0.39/5.1/0.73,
   0.01/0/1.2/2,
   0.01/0/5.3/0.34,

And from icloud -- little delay:

   0.01/0/0.14/2.5,
   0.01/0.02/0.15/2.7,
   0.01/0/0.34/11,
   0.01/0/0.13/4.9,
   0.01/0/0.06/4.1,
   0.01/0/0.13/1.5,

Check all your public DNS records/config  which concern you outboud 
server : PTR, A  of your public IP, A and possibly PTR of your EHLO name 
if different. SPF/DKIM/DMARC record of the outgoing Mail From 
envelope/From header.


Could be caused by bad records or DNS missconfiguration/blackhole or DNS 
interoperability issues when distant check your configuration. For the 
TCP session establishment check PTR/A/and A of the returned PTR if 
different first.


But it could be a client issue too. So a complete hosts tcpdump or 
debug_peer_list session for selected test email (use a user defined 
specific test transport to enable debug_peer on a specific email 
address) would produce more interesting timing trace.


Emmanuel.___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Wietse Venema via Postfix-users
Greg Sims via Postfix-users:
> This is a sample of delays= for google.com -- 20 and 25 second delays:
> 
>   0.01/11/20/0.73,
>   0.01/9.5/20/0.77,
>   0.01/0/25/0.74,
>   0.01/7.6/25/0.91,
>   0.01/6.9/25/1.1,
>   0.01/13/20/4.6,
>   0.01/14/25/0.56,
>   0.01/14/25/1.1,
>   0.01/0/0.22/0.72,
>   0.01/0/0.39/0.79,

So there is some reuse. You still need to answer Viktor's request
for detail logging. That will reveal (to us, at least) what the
those 20-25 seconds are going into. It is DNS, TCP connection setup,
TLS, and so on.

> Here are delays= from yahoo -- 5.3 second delays:
> 
>   0.01/0/10/4.1,
>   0.05/0/5.1/0.95,
>   0.01/0/5.2/0.68,
>   0.01/0/5.3/0.57,
>   0.01/0/0.45/0.42,
>   0.01/0/5.3/0.42,
>   0.01/0/5.3/0.34,
>   0.01/0.39/5.1/0.73,
>   0.01/0/1.2/2,
>   0.01/0/5.3/0.34,

Some reuse, mostlyinsane delays.

> And from icloud -- little delay:
> 
>   0.01/0/0.14/2.5,
>   0.01/0.02/0.15/2.7,
>   0.01/0/0.34/11,
>   0.01/0/0.13/4.9,
>   0.01/0/0.06/4.1,
>   0.01/0/0.13/1.5,

I wonder what portion of that are reused connections.

Well, you have your data collection home work.

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Viktor Dukhovni via Postfix-users
On Wed, May 22, 2024 at 05:35:25AM -0500, Greg Sims wrote:

> Thank you again for your feedback on this issue.

You're welcome, but I don't see anything in your reply that responds
directly to my requests for more detailed configuration and log data.

> I watched the workload in real time this morning and now have more
> insight into what is happening.  It appears the large ISPs are using
> TLS connection as a way to throttle incoming traffic.

That conclusion seems hasty.  The delays could be for any number of
reasons other than deliberate tarpitting, and it is not clear why such
tarpitting would be TLS-specific.

> I looked at the inbound mail queue and found most of the traffic going
> to gmail.com.  I believe this is because of the 20 & 25 seconds delays
> google.com is injecting into the TLS connection.

To properly understand what is happening, you should first run "collate"
and see if that adds sufficient detail to pin down the delays, and if
not, then "smtp -v", and look for possible "RSET" timeouts.  Perhaps
Gmail drops idle TLS connections faster than Postfix's 2s cache timeout?

Did you by any chance tweak the smtp/scache connection cache timeouts?
Longer timeouts risk the remote end closing the connection first, which
then causes problems trying to reuse it, ... and so raising the idle
timeouts can be rather counterproductive...

I'd be surprised if Google's front-end load-balancers drop TLS sessions
much faster than cleartext connections, ... but first you need to
determine what's happening during those 20s, for which more detailed
logging is needed, with "collate" being the simplest to try.

You may need to tweak the code slightly to match the date format used
by your log server.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Greg Sims via Postfix-users
Thank you again for your feedback on this issue.

I watched the workload in real time this morning and now have more
insight into what is happening.  It appears the large ISPs are using
TLS connection as a way to throttle incoming traffic.  I looked at the
inbound mail queue and found most of the traffic going to gmail.com.
I believe this is because of the 20 & 25 seconds delays google.com is
injecting into the TLS connection.  I can also see no delays for the
smaller ISPs with little traffic.  Please see some data below.

Best, Greg

This is a sample of delays= for google.com -- 20 and 25 second delays:

  0.01/11/20/0.73,
  0.01/9.5/20/0.77,
  0.01/0/25/0.74,
  0.01/7.6/25/0.91,
  0.01/6.9/25/1.1,
  0.01/13/20/4.6,
  0.01/14/25/0.56,
  0.01/14/25/1.1,
  0.01/0/0.22/0.72,
  0.01/0/0.39/0.79,

Here are delays= from yahoo -- 5.3 second delays:

  0.01/0/10/4.1,
  0.05/0/5.1/0.95,
  0.01/0/5.2/0.68,
  0.01/0/5.3/0.57,
  0.01/0/0.45/0.42,
  0.01/0/5.3/0.42,
  0.01/0/5.3/0.34,
  0.01/0.39/5.1/0.73,
  0.01/0/1.2/2,
  0.01/0/5.3/0.34,

And from icloud -- little delay:

  0.01/0/0.14/2.5,
  0.01/0.02/0.15/2.7,
  0.01/0/0.34/11,
  0.01/0/0.13/4.9,
  0.01/0/0.06/4.1,
  0.01/0/0.13/1.5,
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Viktor Dukhovni via Postfix-users
On Tue, May 21, 2024 at 08:31:51AM -0500, Greg Sims wrote:

> Changes:
>   * certs back to defaults
>   * smtp_tls_loglevel = 1

Better.  Now it is time to post a more detailed transcript of a single
message (the sender and recipient addresses can be obfuscated if you
wish, the recipient domain would ideally stay untouched).

See:

https://github.com/vdukhovni/postfix/tree/master/postfix/auxiliary/collate

> Here is a set of delays from the logs:
> 
>   delays=0.01/2639/25/0.41
>   delays=0.01/2639/25/0.58
>   delays=0.01/2641/25/0.58
>   delays=0.01/2644/25/0.69
>   delays=0.01/2643/25/0.58
>   delays=0.01/2640/25/0.57

As noted by Wietse that 25s "connection setup" time is the source of the
problem, but it might represent more than one connection attempt, as it
is the time from the beginning of the delivery attempt to the start of
the mail transaction that delivered that envelope recipient, and so if
there are multiple connections attempted (or DNS lookup issues, ...)
this would include the associated latency.

A complete message history (for one such message) from "collate" may
show some of the associated activity.  You should also check for
any relevant related logs from "tlsproxy".

If that's still not illuminating, it may be helpful to run at least one
smtp(8)-based transport in verbose mode ("-v" flag in master.cf), and
to see all the activity that went on behind the scenes.

Also, do you have a TLS client session cache configured?

I have:

smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache

this can substantially speed up TLS handshakes with the remote servers,
and perhaps also avoid running into potential (speculative) TLS session
creation rate limits.

Have you changed any of the SMTP client timeouts?  DNS aside, the only
significant timeout that is south of 20s is:

smtp_rset_timeout = 20s

which is relevant because the first command to sanity-check a cached
connection is "RSET", which is issued just before starting the new mail
transaction.  If for some reason the remote servers are tarpitting
"RSET", that could be the source of the problem.

This is perhaps a good time to ask you for your full configuration,
not just cherry-picked individual settings.  Please post the outputs of:

$ postconf -nf
$ postconf -Mf

with all whitespace (including linebreaks) preserved.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-22 Thread Alexander Leidinger via Postfix-users

Am 2024-05-22 01:22, schrieb Greg Sims via Postfix-users:

TLS connection reuse is being used.  About 10% of the connections are
reused for large volume ISPs. Small volume ISPs do not see connection
reuse.  I believe this is as expected.

I did some testing of our DNS setup.  A DNS query using dig is less
than 20 msec for both our primary and secondary dns servers in
/etc/resolv.conf -- see below.


If all else fails:
The truth can often be seen on the wire. Make a packet trace which 
covers the time from "here is the mail you have to send to google" to a 
successful delivery and inspect it in wireshark. For TLS traffic you 
need the cert/key in wireshark. Do not only trace the smtp traffic, but 
all traffic. Inspect what the system is doing (e.g. DNS lookups) and 
correlate that to the traffic you see (you can change how timestamps are 
displayed in wireshark). This may indicate where those 25 seconds are 
spend.


This is a steep learning curve if you are not familiar already with 
interpreting network packets, the smtp protocol, DNS, and wireshark. If 
those skills are already available, it may lead to detecting the cause 
of what you see faster than the back and forth with guesses here on the 
mailinglist.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.orgnetch...@freebsd.org  : PGP 0x8F31830F9F2772BF


signature.asc
Description: OpenPGP digital signature
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Greg Sims via Postfix-users
TLS connection reuse is being used.  About 10% of the connections are
reused for large volume ISPs. Small volume ISPs do not see connection
reuse.  I believe this is as expected.

I did some testing of our DNS setup.  A DNS query using dig is less
than 20 msec for both our primary and secondary dns servers in
/etc/resolv.conf -- see below.

Thanks, Greg

  [rsm@mail01 ~]$ cat /etc/resolv.conf
  # Generated by NetworkManager
  search raystedman.org
  nameserver 192.168.122.1
  nameserver 8.8.4.4
  [rsm@mail01 ~]$ dig @192.168.122.1 google.com

  ; <<>> DiG 9.11.36-RedHat-9.11.36-11.el8_9.1 <<>> @192.168.122.1 google.com
  ; (1 server found)
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45410
  ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 1232
  ;; QUESTION SECTION:
  ;google.com. IN A

  ;; ANSWER SECTION:
  google.com. 172 IN A 142.250.191.78

  ;; Query time: 1 msec <<<
  ;; SERVER: 192.168.122.1#53(192.168.122.1)
  ;; WHEN: Tue May 21 15:56:06 PDT 2024
  ;; MSG SIZE  rcvd: 55

  [rsm@mail01 ~]$ dig @8.8.4.4 google.com

  ; <<>> DiG 9.11.36-RedHat-9.11.36-11.el8_9.1 <<>> @8.8.4.4 google.com
  ; (1 server found)
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35863
  ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 512
  ;; QUESTION SECTION:
  ;google.com. IN A

  ;; ANSWER SECTION:
  google.com. 300 IN A 142.250.191.46

  ;; Query time: 16 msec 
  ;; SERVER: 8.8.4.4#53(8.8.4.4)
  ;; WHEN: Tue May 21 15:57:47 PDT 2024
  ;; MSG SIZE  rcvd: 55
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Wietse Venema via Postfix-users
Jaroslaw Rafa via Postfix-users:
> Dnia 21.05.2024 o godz. 16:38:21 Wietse Venema via Postfix-users pisze:
> > >   delays=0.01/2639/25/0.41
> > >   delays=0.01/2639/25/0.58
> > >   delays=0.01/2641/25/0.58
> > >   delays=0.01/2644/25/0.69
> > >   delays=0.01/2643/25/0.58
> > >   delays=0.01/2640/25/0.57
> [...]
> > c=25s. It takes a whopping 25 econds for connection setup, including DNS,
> > EHLO, and STARTTLS. And if that is not bad enough already,
> > your connectuions also are not reysed. Reusing a connection would
> > reduce the 25s dramatically.
> 
> I wonder why it is *always* 25 seconds, while the other delays slightly
> vary, which would be normally expected. Is it possible that something in
> your sending pipeline is introducing an artificial delay when setting up the
> connection, for example something tries (unsuccessfully) to connect to
> somewhere and only *after* the attempt times out, Postfix continues with
> connection setup? If timeout in question is 25 seconds, this would explain
> why the value is so constant here...

DNS queries time out after 5s, so that could be a reason, oo (bad
server(s) in /etc/resolv.conf).

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Jaroslaw Rafa via Postfix-users
Dnia 21.05.2024 o godz. 16:38:21 Wietse Venema via Postfix-users pisze:
> >   delays=0.01/2639/25/0.41
> >   delays=0.01/2639/25/0.58
> >   delays=0.01/2641/25/0.58
> >   delays=0.01/2644/25/0.69
> >   delays=0.01/2643/25/0.58
> >   delays=0.01/2640/25/0.57
[...]
> c=25s. It takes a whopping 25 econds for connection setup, including DNS,
> EHLO, and STARTTLS. And if that is not bad enough already,
> your connectuions also are not reysed. Reusing a connection would
> reduce the 25s dramatically.

I wonder why it is *always* 25 seconds, while the other delays slightly
vary, which would be normally expected. Is it possible that something in
your sending pipeline is introducing an artificial delay when setting up the
connection, for example something tries (unsuccessfully) to connect to
somewhere and only *after* the attempt times out, Postfix continues with
connection setup? If timeout in question is 25 seconds, this would explain
why the value is so constant here...
-- 
Regards,
   Jaroslaw Rafa
   r...@rafa.eu.org
--
"In a million years, when kids go to school, they're gonna know: once there
was a Hushpuppy, and she lived with her daddy in the Bathtub."
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Wietse Venema via Postfix-users
Greg Sims via Postfix-users:
> Thank you Viktor.
> 
> Answers:
>   * smtp_connection_cache_on_demand = yes -- this was configured
> 
> Changes:
>   * certs back to defaults
>   * smtp_tls_loglevel = 1
> 
> Before enabling TLS our send rate was about 4K emails per minute -- we
> are now seeing 300 to 500 per minute.
> 
> The email creation process is sending new mail via a private network.
> We are in the middle of an ip address migration so -- email creation
> is currently single threaded with 100 uSec delays between emails. I
> can increase the concurrency/speed of the email creation process(es)
> -- I fear it would only increase the size of the inbound email queue.
> 
> Here is a set of delays from the logs:
> 
>   delays=0.01/2639/25/0.41
>   delays=0.01/2639/25/0.58
>   delays=0.01/2641/25/0.58
>   delays=0.01/2644/25/0.69
>   delays=0.01/2643/25/0.58
>   delays=0.01/2640/25/0.57
> 
> I scanned a large section of the logs | grep status=sent.  These
> delays are consistent throughout the peak demand period.

I'm the person who implemented the delays=a/b/c/d breakdown. 
See https://www.postfix.org/postconf.5.html#delay_logging_resolution_limit

In your case:

a=0.01s. You have negligible delay between between Postfix reciving
a message into the incoming queue, and the queue manager moving it
onoto the active queue (i.e. scheduled by the queue manager for
delivery).

b=2640 Messages are waiting a long timr for their turn to be
delivered. This is a lot of time because...

c=25s. It takes a whopping 25 econds for connection setup, including DNS,
EHLO, and STARTTLS. And if that is not bad enough already,
your connectuions also are not reysed. Reusing a connection would
reduce the 25s dramatically.

d=.05s, the time to actually deliver the message.

Wietse
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Greg Sims via Postfix-users
Thank you Viktor.

Answers:
  * smtp_connection_cache_on_demand = yes -- this was configured

Changes:
  * certs back to defaults
  * smtp_tls_loglevel = 1

Before enabling TLS our send rate was about 4K emails per minute -- we
are now seeing 300 to 500 per minute.

The email creation process is sending new mail via a private network.
We are in the middle of an ip address migration so -- email creation
is currently single threaded with 100 uSec delays between emails. I
can increase the concurrency/speed of the email creation process(es)
-- I fear it would only increase the size of the inbound email queue.

Here is a set of delays from the logs:

  delays=0.01/2639/25/0.41
  delays=0.01/2639/25/0.58
  delays=0.01/2641/25/0.58
  delays=0.01/2644/25/0.69
  delays=0.01/2643/25/0.58
  delays=0.01/2640/25/0.57

I scanned a large section of the logs | grep status=sent.  These
delays are consistent throughout the peak demand period.

Best, Greg

On Tue, May 21, 2024 at 7:12 AM Viktor Dukhovni via Postfix-users
 wrote:
>
> On Tue, May 21, 2024 at 06:51:08AM -0500, Greg Sims via Postfix-users wrote:
>
> > Our main.cf contains:
> >   smtpd_tls_cert_file = 
> >   smtpd_tls_key_file  = 
> >   smtpd_tls_security_level = none
>
> There's no point in configuring SMTP server certificates when TLS is
> disabled in the SMTP server.  If the idea is to make tlsproxy(8)
> "happy", then try: "smtpd_tls_cert_file = none" and leave the key file
> setting at its default empty value.
>
> >   smtp_tls_security_level  = may
>
> Fine.
>
> > We have used connection caching in the past so we added:
> >   * smtp_tls_connection_reuse=yes
>
> Presumably, you also have "smtp_connection_cache_on_demand = yes".
>
> >   * uncommented tlsproxy / maxproc = 0 in master.cf
> >   * commented smtp_tls_loglevel -- no logging as a result
>
> Level 1 is recommended when TLS is enabled.  The additional logging
> overhead is quite modest.
>
> > The outbound message transfer rate of our configuration is less than
> > 500 emails per minute.  I noted the following:
> >
> >   * we have four entries in master.cf for smtp -- each has a
> > unique ip address with maxproc=32; these are used with randmap{} in
> > main.cf
>
> Fine.
>
> >   * queuing of inbound email is high
> > * the inbound email queue contains over 3,000 emails
>
> So that's ~6 minutes of backlog...  Plausibly due to downstream rate
> and/or concurrency limits
>
> > * email average delay is over 400 seconds and 1,100
> > seconds for google.com where most of the email is delivered
>
> If the backlog isn't growing, can you post the averages of the
> delays=a/b/c/d components?
>
> >   * htop shows
> > * cpu utilization is low with Load average under 0.10 on
> > four physical processors and 4GB memory / 500MB used
>
> CPU was unlikely to be the problem.
>
> > * only one tlsproxy process running -- the highest cpu
> > utilization process most of the time
>
> Fine.  More would be spawned if it got too busy to serve client requests
> in a timely manner.
>
> > * 20 processes exist for each of our four ip address/smtp
> > entries in master.cf with maxproc=32
> >
> > I am concerned that the queuing of inbound email is caused by there
> > only being one tlsproxy process.
>
> More plausibly the real issue is message delivery latency to the various
> destinations.
>
> > maxproc=0 seems to allow for an unlimited number.  We seem to have
> > plenty of smtp processes as postfix is not starting more of them to
> > reach the maxproc=32.
>
> You could configure separate tlsproxy(8) services for each of the
> smtp(8) transports by overriding "tlsproxy_service_name" in master.cf,
> that will give you multiple tlsproxy(8) processes, but I guessing won't
> change much, if the issue is downstream delays.
>
> --
> Viktor.
> ___
> Postfix-users mailing list -- postfix-users@postfix.org
> To unsubscribe send an email to postfix-users-le...@postfix.org
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org


[pfx] Re: TLS for SMTP Outbound -- Only One tlsproxy

2024-05-21 Thread Viktor Dukhovni via Postfix-users
On Tue, May 21, 2024 at 06:51:08AM -0500, Greg Sims via Postfix-users wrote:

> Our main.cf contains:
>   smtpd_tls_cert_file = 
>   smtpd_tls_key_file  = 
>   smtpd_tls_security_level = none

There's no point in configuring SMTP server certificates when TLS is
disabled in the SMTP server.  If the idea is to make tlsproxy(8)
"happy", then try: "smtpd_tls_cert_file = none" and leave the key file
setting at its default empty value.

>   smtp_tls_security_level  = may

Fine.

> We have used connection caching in the past so we added:
>   * smtp_tls_connection_reuse=yes

Presumably, you also have "smtp_connection_cache_on_demand = yes".

>   * uncommented tlsproxy / maxproc = 0 in master.cf
>   * commented smtp_tls_loglevel -- no logging as a result

Level 1 is recommended when TLS is enabled.  The additional logging
overhead is quite modest.

> The outbound message transfer rate of our configuration is less than
> 500 emails per minute.  I noted the following:
>
>   * we have four entries in master.cf for smtp -- each has a
> unique ip address with maxproc=32; these are used with randmap{} in
> main.cf

Fine.

>   * queuing of inbound email is high
> * the inbound email queue contains over 3,000 emails

So that's ~6 minutes of backlog...  Plausibly due to downstream rate
and/or concurrency limits

> * email average delay is over 400 seconds and 1,100
> seconds for google.com where most of the email is delivered

If the backlog isn't growing, can you post the averages of the
delays=a/b/c/d components?

>   * htop shows
> * cpu utilization is low with Load average under 0.10 on
> four physical processors and 4GB memory / 500MB used

CPU was unlikely to be the problem.

> * only one tlsproxy process running -- the highest cpu
> utilization process most of the time

Fine.  More would be spawned if it got too busy to serve client requests
in a timely manner.

> * 20 processes exist for each of our four ip address/smtp
> entries in master.cf with maxproc=32
> 
> I am concerned that the queuing of inbound email is caused by there
> only being one tlsproxy process.

More plausibly the real issue is message delivery latency to the various
destinations.

> maxproc=0 seems to allow for an unlimited number.  We seem to have
> plenty of smtp processes as postfix is not starting more of them to
> reach the maxproc=32.

You could configure separate tlsproxy(8) services for each of the
smtp(8) transports by overriding "tlsproxy_service_name" in master.cf,
that will give you multiple tlsproxy(8) processes, but I guessing won't
change much, if the issue is downstream delays.

-- 
Viktor.
___
Postfix-users mailing list -- postfix-users@postfix.org
To unsubscribe send an email to postfix-users-le...@postfix.org