Re: Connection Caching Per-Destination

Greg Sims Sat, 01 Aug 2020 20:40:27 -0700

> > I changed master.cf to 3 processes for outlook: in hopes of reducing
> > MaxConnections feedback -- I can not go much smaller.
>
> This has been asked before: when Outlook puts you in the penalty
> box and starts ratelimiting your new connections, was that because
> a) you exceeded a limit for the number of SIMULTANEOUS CONNECTIONS,
> or b) you exceeded a limit for the number of NEW CONNECTIONS over
> a time interval.
>
> I am asking because these two scenarios have different solutions,
> and three is awfully low.


First some terms with respect to outlook.com messages in the log:

RateLimited = "said: 451 4.7.650 The mail server .* has been
temporarily rate limited"

MaxConnections = "said: 451 4.7.652 The mail server .* has exceeded
the maximum number of connections"

Connection = "lost connection with.* while receiving the initial
server greeting" and the like.

I have only seen RateLimited once -- the overnight email burst that is
documented in this thread.  In the last 24 hours we relayed 14K emails
to outlook.com servers and we saw:

MaxConnections: 54, Connection: 30, RateLimited: 0
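
For anyone who wants to reproduce this kind of tally, here is a minimal sketch in Python.  It assumes the three patterns defined above are applied to one log line at a time; the function name tally and the category labels are only my shorthand from this message, not Postfix terms.

```python
import re
from collections import Counter

# Patterns copied from the definitions above. The labels are shorthand
# from this message, not official Postfix or Microsoft terminology.
PATTERNS = {
    "RateLimited": re.compile(
        r"said: 451 4\.7\.650 The mail server .* has been temporarily rate limited"
    ),
    "MaxConnections": re.compile(
        r"said: 451 4\.7\.652 The mail server .* has exceeded the maximum number of connections"
    ),
    "Connection": re.compile(
        r"lost connection with.* while receiving the initial server greeting"
    ),
}


def tally(lines):
    """Count how many log lines fall into each category (first match wins)."""
    counts = Counter({name: 0 for name in PATTERNS})
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts
```

Feeding it the lines of the mail log (for example via open(path) on whatever log file is in use) gives a Counter keyed by the three labels.  The "Connection" pattern here covers only the one example quoted above, not "the like".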

I believe outlook.com servers respond with MaxConnections and Connection
errors until the number of connections reaches an upper threshold.  When
we are running below the upper threshold, the email is delivered in an
orderly fashion.  I looked at a section of the logs in detail.  This
burst was 11K emails, of which 3.4K went to outlook.  The average delay
per email (captured with "to=<.*>.*delay=(.*)") was 194 seconds for this
burst.  I record qshape every 30 seconds and see minimal incoming/active
queuing for outlook.com -- or any queuing for that matter -- at a 500
email/minute arrival rate.
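
The average delay figure can be computed along these lines.  This is only a sketch: the pattern is a slightly tightened variant of the one quoted above (it captures just the numeric value), and average_delay is a hypothetical helper name.

```python
import re

# Tightened variant of "to=<.*>.*delay=(.*)": capture only the number that
# follows "delay=". The literal "delay=" cannot match inside "delays=".
DELAY_RE = re.compile(r"to=<[^>]*>.*?\bdelay=([0-9.]+)")


def average_delay(lines):
    """Return the mean of the delay= values found in lines, or None if none match."""
    delays = []
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            delays.append(float(match.group(1)))
    return sum(delays) / len(delays) if delays else None
```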

I do see a consistent pattern in the logs that looks like the
following (first number is the email sequence number for outlook):

    1 - Aug 01 01:43:50, outlook, delay=    1.23, delays=  0.01 /   0.01 /   0.80 /   0.41
...
1,010 - Aug 01 01:48:08, outlook, delay=  124.43, delays=  0.01 / 124.00 /   0.04 /   0.38
...
3,453 - Aug 01 01:59:59, outlook, delay=  492.86, delays=  0.01 / 492.00 /   0.31 /   0.54

The "124.00" is in delay slot b = "time from last active queue entry
to connection setup".  "delay b" was 0.01 seconds at the start of the
email burst and grew steadily until it reached 492.00 seconds at the
end of the burst.  The delivery rate to outlook.com servers was 3,453
emails/969 seconds = 3.56 emails/second over this burst.  Is "delay b"
Postfix internal queuing, or is it being caused by outlook.com servers
in some fashion?
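
To check the arithmetic, here is a small sketch that splits a delays= value into its a/b/c/d slots and recomputes the delivery rate from the first and last timestamps.  The helper names are hypothetical, and the year 2020 is assumed because syslog timestamps do not carry one.

```python
from datetime import datetime


def split_delays(field):
    """Split the value of delays=a/b/c/d into named floats (spaces tolerated)."""
    a, b, c, d = (float(part) for part in field.split("/"))
    return {"a": a, "b": b, "c": c, "d": d}


def delivery_rate(first_ts, last_ts, count, year=2020):
    """Return (elapsed seconds, deliveries per second) between two syslog timestamps.

    The year is an assumption supplied by the caller; it is not in the log.
    """
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {first_ts}", fmt)
    end = datetime.strptime(f"{year} {last_ts}", fmt)
    elapsed = (end - start).total_seconds()
    return elapsed, count / elapsed


slots = split_delays("  0.01 / 124.00 /   0.04 /   0.38")
elapsed, rate = delivery_rate("Aug 01 01:43:50", "Aug 01 01:59:59", 3453)
print(slots["b"], elapsed, round(rate, 2))
```

For email 1,010 the four slots add up to the logged delay of 124.43 seconds, and nearly all of it is in slot b.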

When outlook.com reaches the upper threshold of MaxConnections, it
starts to issue RateLimited as well.  This seems appropriate as we
were out of bounds with MaxConnections given the 383 domains we had
vying for outlook.com connections unchecked and an arrival rate of
1,000 emails per minute.

Are we in the penalty box from outlook.com?  The Microsoft SNDS data
is detecting the correct IP address/email volume and shows us as Green
-- no issues.

I know this is likely simplistic thinking -- but how about this in master.cf:

outlook  unix  -       -       n       -       -       smtp
  -o syslog_name=outlook
  -o smtp_connection_cache_on_demand=yes
  -o smtp_max_connections=8

Now the outlook smtp processes are vying for the specified number of
connections and making use of the connection_cache where possible.  If
the limiting resource for outlook.com is connections, it seems this
design might optimize throughput.  This coming from someone who has
not read the first line of Postfix code!!

Thanks, Greg
www.RayStedman.org
