Re: How does OpenSMTPD compare to a dedicated high-volume MTA?

gilles Wed, 14 Oct 2020 16:24:09 -0700

October 14, 2020 11:05 PM, "Demi M. Obenour" <demioben...@gmail.com> wrote:

> On 10/14/20 4:01 PM, gil...@poolp.org wrote:
> 
>> October 9, 2020 1:29 AM, "Demi M. Obenour" <demioben...@gmail.com> wrote:
>> 
>>> I was looking at the EuroBSDCon 2017 presentation on OpenSMTPD, and I
>>> was wondering how it differs from the dedicated high-volume MTA that
>>> wound up being written for the ESP. What are the features that are
>>> needed for high volume, but otherwise don't make sense?
>> 
>> Hi,
>> 
>> I should probably write an article about that, I'll keep this mail short.
>> 
>> When you enter the realm of high-volume MTA, the rules of SMTP change and
>> a general purpose MTA will no longer do the job: you're no longer below the
>> radar of big ISPs and big mailer corps, you're now hogging their machines
>> and need to comply with a ton of SMTP-unrelated rules.
> 
> That makes sense, thanks! I was wondering if OpenSMTPD could not
> handle the load, and was not aware that there were other rules that
> had to be followed. I am looking forward to your article.
> 

OpenSMTPD can definitely take the load: we had it handle multi-million-mail
queues, and a lot of the stuff that people don't see is actually built so
that a multi-million-mail queue doesn't degrade runtime performance.

Out of the box, OpenSMTPD won't take that kind of load because we ship
conservative artificial limits that slow it down, so that a regular MX
avoids at all costs being blacklisted, even if it sends a relatively high
volume.

OpenSMTPD automatically scales its routing based on the available source
addresses, so if we didn't have conservative limits and you sent a lot of
mail, it might just blast the other end.

Before OpenSMTPD hit its first production release, we ran an instance with a
couple hundred IP addresses for an ISP and we accidentally missed a limit. A
few minutes later, we had blasted a major network and got a call that we had
DDoS-ed them. This was one smtpd instance: given enough resources, it can
take the load ;-)
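
As a rough illustration of how sending capacity grows with the number of
source addresses, here is a minimal Python sketch. It is not OpenSMTPD's
routing code: plan_sessions, per_source_cap and per_destination_cap are
hypothetical names, and the addresses come from the 192.0.2.0/24
documentation range.

```python
def plan_sessions(sources, per_source_cap, per_destination_cap=None):
    """Spread outgoing sessions to one destination across source addresses.

    Hypothetical model: each source address may open up to per_source_cap
    sessions, and an optional per_destination_cap bounds the total.
    """
    total = per_source_cap * len(sources)
    if per_destination_cap is not None:
        total = min(total, per_destination_cap)

    plan = {addr: 0 for addr in sources}
    # Hand out sessions round-robin so no single address carries the load.
    for i in range(total):
        plan[sources[i % len(sources)]] += 1
    return plan


# A couple hundred source addresses, as in the anecdote above.
sources = ["192.0.2.%d" % i for i in range(1, 201)]

unlimited = plan_sessions(sources, per_source_cap=5)
limited = plan_sessions(sources, per_source_cap=5, per_destination_cap=50)

print(sum(unlimited.values()))  # total sessions without a destination cap
print(sum(limited.values()))    # total sessions with the cap in place
```

Without the destination cap, the total is simply the per-source cap
multiplied by the number of addresses, which is why a single missed limit
matters so much once many source addresses are available.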

>> There are limits to enforce and they are not like the generic ones we use
>> in OpenSMTPD, they are faaaaaaar more fine-grained. For example, some are
>> tied to your domain/IP reputation and need to be adapted dynamically, and
>> others apply to a whole cluster of MX meaning that your MTA needs to know
>> that domain X and Y share the same limits because they both have an MX in
>> the same cluster and the limits apply to the whole cluster. A lot of this
>> is unrelated to SMTP and makes the SMTP engine much much more complex. It
>> is possible to tweak a general purpose MTA to kinda work, it just takes a
>> ton of work.
>> 
>> Then another issue is that when you start going above radar, you start to
>> get feedback from ISP and big mailers ... encapsulated in SMTP responses.
>> You can no longer just handle SMTP responses like 421 or 550, because the
>> human message that follows will contain a destination-specific code which
>> will provide more info (or not) about why the error happened. If your MTA
>> is unable to handle these, then when you get a 550 from Microsoft you are
>> going to permanently fail the message when you maybe should have waited a
>> bit to retry. With a general purpose MTA, some of your recipients are not
>> going to receive their mail, it's that simple.
> 
> Are these documented anywhere? Just curious.
> 

You are entering hell.

The limits are undocumented: you learn about them because you hit them, then
you tweak until you no longer hit them. Some are easy to deal with because a
message will clearly state that you exceeded them; others will trigger and
you'll be in the dark as to which limit you exceeded AND what value to use.
You will need to figure that out either through reverse engineering or trial
and error. If you're interested in seeing people make scary faces, ask a
"delivery expert" how much they enjoy dealing with ISP limits.
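
To make the cluster-wide limits and the "tweak until you no longer hit
them" loop more concrete, here is a minimal Python sketch. Everything in it
is an assumption for illustration: the domains and MX hosts are made up,
cluster_key guesses a cluster from the last two labels of the MX host name,
and halving the limit on a throttle signal is just one possible policy. It
is not how OpenSMTPD or any particular ESP does it.

```python
from collections import defaultdict

# Hypothetical MX data; in practice this would come from DNS MX lookups.
MX_BY_DOMAIN = {
    "example-x.test": ["mx1.bigmailer.test", "mx2.bigmailer.test"],
    "example-y.test": ["mx2.bigmailer.test"],
    "example-z.test": ["mail.smallhost.test"],
}


def cluster_key(mx_host):
    """Assumption: MX hosts sharing their last two labels form one cluster."""
    return ".".join(mx_host.split(".")[-2:])


def domain_cluster(domain):
    """Map a recipient domain to the cluster of its first listed MX."""
    return cluster_key(MX_BY_DOMAIN[domain][0])


class ClusterLimits:
    """Track concurrent sessions per MX cluster rather than per domain."""

    def __init__(self, default_limit):
        self.default_limit = default_limit
        self.limit = {}                   # cluster -> learned limit
        self.in_flight = defaultdict(int)

    def _limit_for(self, cluster):
        return self.limit.get(cluster, self.default_limit)

    def try_acquire(self, domain):
        cluster = domain_cluster(domain)
        if self.in_flight[cluster] >= self._limit_for(cluster):
            return False
        self.in_flight[cluster] += 1
        return True

    def release(self, domain):
        cluster = domain_cluster(domain)
        self.in_flight[cluster] = max(0, self.in_flight[cluster] - 1)

    def throttled(self, domain):
        """The remote end signalled a limit: halve it for the whole cluster."""
        cluster = domain_cluster(domain)
        self.limit[cluster] = max(1, self._limit_for(cluster) // 2)


limits = ClusterLimits(default_limit=2)
print(limits.try_acquire("example-x.test"))  # first session to the cluster
print(limits.try_acquire("example-y.test"))  # same cluster, second session
print(limits.try_acquire("example-x.test"))  # refused: cluster is full
```

The point of keying on the cluster is that example-x.test and
example-y.test draw from the same budget even though they are different
recipient domains.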

The encapsulated codes are much nicer to deal with. Each big mailer has some
page documenting their own codes, here is the one for Microsoft for example:

    https://postmaster.live.com/pm/troubleshooting.aspx#errors

The only problem is that these lists are both incomplete (you'll find that a
few codes may be generated that are undocumented) and inaccurate (a code is
documented as describing a specific error, but reverse engineering proves it
serves another purpose). So you still have to do some trial and error.
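
For illustration, reacting to an encapsulated code could look like the
minimal Python sketch below. The provider name, the "BM-1001" style codes,
their format and the actions mapped to them are all hypothetical; real
codes have to be taken from each provider's page and from observation, and
they change over time.

```python
import re

# Hypothetical per-provider overrides: (provider, encapsulated code) -> action.
PROVIDER_OVERRIDES = {
    ("bigmailer.test", "BM-1001"): "retry_later",  # e.g. throttling behind a 550
    ("bigmailer.test", "BM-2002"): "fail",
}

# Hypothetical code format: two capital letters, a dash, four digits.
CODE_PATTERN = re.compile(r"\b([A-Z]{2}-\d{4})\b")


def classify(provider, reply):
    """Decide what to do with a message based on a single-line SMTP reply."""
    status = int(reply[:3])

    # A known encapsulated code takes precedence over the bare status code.
    match = CODE_PATTERN.search(reply[3:])
    if match:
        action = PROVIDER_OVERRIDES.get((provider, match.group(1)))
        if action is not None:
            return action

    # Generic handling, as a general purpose MTA would do.
    if 200 <= status < 300:
        return "delivered"
    if 400 <= status < 500:
        return "retry_later"
    return "fail"  # 5xx and anything unexpected: treat as permanent


print(classify("bigmailer.test", "550 5.7.1 Message rejected (BM-1001)"))
print(classify("bigmailer.test", "550 5.1.1 User unknown"))
```

With the override table, the first reply is retried later even though it
is a 550, while the second one, which carries no known code, still fails
permanently.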

Luckily, this is only an issue for very high-volume senders whose reputation
is not good enough to give them what is essentially a free pass around the
limits that other high-volume senders have to abide by.

>> With a general purpose MTA and custom tuning you can send quite a lot but
>> there's still a limit to it. You can push that limit with good reputation
>> but at some point, unless you're part of the big ones you'll start having
>> issues.
> 
> What factors made it better/simpler to write a new MTA from scratch, rather
> than incorporating those rules into OpenSMTPD? Right now, there are
> no open-source (or even self-hostable) high-volume mailers that I
> am aware of.
> 

We had a very special use-case and it was much simpler for us to implement a
very simple MTA which discarded pretty much 90% of what OpenSMTPD did and to
focus on the highly specific use-case which was complex on its own.

You can do high-volume with OpenSMTPD, just not out of the box: there's a
bit of tweaking of undocumented limits. Basically, if a delivery expert
knows which limits to tweak, they can turn their OpenSMTPD into a mail
cannon. We're just not making it too easy because people like touching knobs
even when it's not needed, then mail us when they screw up.

What you can't do with OpenSMTPD is dynamically react to ISP-specific codes,
have multiple schedulers to handle multiple campaigns, etc. That's because
high-volume senders do not share the same use-cases and need custom stuff.

As for the custom ISP codes encapsulated in SMTP responses, no one does that
because they are not stable over time: if open-source software were released
that reacted properly to these codes, the providers would likely change them
and the software would break. It's not in their interest to make it easy for
everyone. I can tell you first-hand that every serious ESP has experts
dealing with this and does not just rely on a one-time implementation of a
mapping table.

> From my (outsider) perspective, it appears that while
> those rules add significant complexity, they otherwise would not
> be harmful in a general-purpose MTA. The best current practice for
> high-volume business mail seems to be “pay a third-party service
> that specializes in this”, which is somewhat disappointing.
> 

I partly agree:

Part of me thinks it would be nice to introduce the concepts of reputation,
mail providers, provider codes, etc. They would make OpenSMTPD much
friendlier to very high-volume senders.

Part of me thinks it would harm OpenSMTPD as a general purpose MTA, as
complexity would creep into the MTA layer but also... into conf, tables, the
scheduler, the envelope cache, the MTA/scheduler IPC and, quite frankly,
probably into places I can't think of right now.

Then, the OpenBSD crowd would need to be convinced that the general purpose
MTA that ships with the system and which is quite critical should get these
complex bits introduced to handle the very high-volume senders.

Today, if I'm asked what the best solution for high-volume sending would be,
I'd answer either to pay for a service or to contract me :-)
