Re: ATRN reloaded

Wietse Venema Tue, 26 Jan 2010 17:26:55 -0800

adrian ilarion ciobanu:
> > 
> > You associate a fixed nexthop with each authenticated client, and their
> > entire set of domains. You flush either all their domains, or the subset
> > they requested. The scache entry is for the client-specific nexthop, not
> > the recipient domain.
> > 
> >     example.com     atrn:[client1.atrn.invalid]
> >     example.net     atrn:[client1.atrn.invalid]
> >     example.org     atrn:[client2.atrn.invalid]
> > 
> > The scache slot is for "[client1.atrn.invalid]".
> 
> Got it. I mixed scache manpage misreadings with flush records.


If the architecture is like this:

    Customer <--> smtpd(8) <--> atrnd(8) <--> smtp(8)

Then the transport map would look like:

    example.com atrn:[example.com]
    example.org atrn:[example.org]

and master.cf would contain:

    atrnd  unix  -       -       n       -       -       atrnd

    atrn   unix  -       -       n       -       -       smtp
        -o smtp_connection_cache_destinations=static:all
        -o smtp_connection_reuse_time_limit=0

- smtpd(8) can rate-limit the ATRN command via the anvil(8) server.

- Each time smtpd(8) receives a valid ATRN command it connects to
atrnd(8), passes the customer domain name, and waits for atrnd(8)
to respond.

- atrnd(8) either rejects the request (perhaps because it's still
proxying mail for that domain) or stores a socket with scache(8)
under the customer domain name.

- Once mail starts flowing, smtp(8) retrieves that socket from
scache(8) and saves that socket to scache(8) upon each delivery
completion.

Other things to keep in mind:

- There need to be generous timeouts before the first delivery, and
perhaps smaller timeouts between successive deliveries.

- smtpd(8) needs to reset its watchdog timer periodically; otherwise
bad things happen when the ATRN session lasts more than $daemon_timeout
seconds.

- smtpd(8) uses two atrn client APIs that encapsulate interaction
with atrnd(8), namely, sending the customer domain information,
and pushing bytes between customer and atrnd(8) without any
understanding of the content. 

- By playing byte-pusher-in-the-middle, smtpd(8) processes TLS
messages as soon as they become available, so there is a possibility
of enabling the TLS session renegotiation bug. For background on this
see http://www.postfix.org/wip.html.
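
The byte-pushing role described in the list above can be sketched in a
few lines. This is only an illustration in Python, not Postfix code:
the names push_bytes, tick and interval are hypothetical, and the tick
callback merely stands in for the periodic watchdog reset mentioned
above. The relay copies bytes in both directions and never looks at
their content.

```python
import selectors
import socket


def push_bytes(a, b, tick=None, interval=1.0, bufsize=4096):
    """Copy bytes between sockets a and b until either side closes.

    tick (hypothetical) is called every time select() returns, at least
    once per `interval` seconds; it models a periodic watchdog reset.
    """
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ, data=b)  # data = opposite socket
    sel.register(b, selectors.EVENT_READ, data=a)
    try:
        while True:
            events = sel.select(timeout=interval)
            if tick is not None:
                tick()
            for key, _mask in events:
                chunk = key.fileobj.recv(bufsize)
                if not chunk:  # EOF on one side ends the session
                    return
                key.data.sendall(chunk)  # forwarded without inspection
    finally:
        sel.close()
```

A real implementation would also need non-blocking writes and error
handling; the blocking sendall() here keeps the sketch short.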

Below is a preliminary design document about implementing ATRN in
Postfix.  In this document I try to minimize the number of parameters
that need to be changed from their defaults. To achieve that goal
I introduce a new address class with its own main.cf settings.

Before a line of code gets written, I would like to see an updated
version of that design document. It needs to consider the choices
that need to be made.

Finally, earlier in this thread I introduced the requirement
that smtpd(8) be changed only minimally, and that most of the ATRN
smarts are implemented outside smtpd(8), by a separate daemon.

This principle has worked well in Postfix. Whenever a major feature
is added, don't mess up the existing programs. Instead, write a
new daemon process and a client library module that implements a
narrow protocol. This effectively guarantees that the new feature
will have zero impact on Postfix performance and reliability except
when a site decides to use that feature.

        Wietse

[ATRN design document dated Jun 26, 2005]

Below are some notes on what it would take to implement ATRN support
in Postfix. This is an updated version of notes that I made before
connection caching was added to Postfix.

In summary:

- Postfix can support ATRN requests with only one domain parameter,

- Postfix can support ATRN over TLS encrypted sessions,

- Postfix ATRN support needs only one mandatory parameter that
specifies an (ATRN domain name -> SASL login name) mapping,

- Postfix ATRN support needs one optional but recommended parameter
that specifies a table for recipient address validation.

- ATRN configuration is kept to the minimum by introducing a new
"atrn" mail delivery transport with its own address class (in
addition to the local, virtual alias, virtual mailbox, and relay
address classes) with its own default configuration parameters.

- Migration between ETRN and ATRN is not transparent.

One question: how much need is there for this functionality?  Most
of the infrastructure exists, so it's not a terrible amount of work
to implement. ATRN support adds another entry to the list of RFCs
that Postfix implements.

        Wietse

Table of contents:

1 - ATRN Protocol basics
2 - Implementing ATRN with the Postfix connection cache
3 - Multi-domain ATRN requests
4 - Interactions with TLS
5 - ATRN User interface
6 - Migration between ETRN and ATRN

1 - ATRN Protocol basics
========================

ATRN is described in RFC 2645 as: ON-DEMAND MAIL RELAY (ODMR), SMTP
with Dynamic IP Addresses. The basic operation is easily explained
with the following quote:

RFC 2645 section 5.2.1.  ATRN (Authenticated TURN)

   Unlike the TURN command in [SMTP], the ATRN command optionally takes
   one or more domains as a parameter.  The ATRN command MUST be
   rejected if the session has not been authenticated.  Response code
   530 [AUTH] is used for this.

   The timeout for this command MUST be at least 10 minutes to allow the
   provider time to process its mail queue.

The protocol looks like this (P = provider, C = customer).

   P:  220 EXAMPLE.NET on-demand mail relay server ready
   C:  EHLO example.org
   P:  250-EXAMPLE.NET
   P:  250-AUTH CRAM-MD5 EXTERNAL
   P:  250 ATRN

Once the client has authenticated, the conversation proceeds:

   C:  ATRN example.org,example.com
   P:  250 OK now reversing the connection

At this point, the customer becomes the server, and the provider
becomes the client:

   C:  220 example.org ready to receive email
   P:  EHLO EXAMPLE.NET
   C:  250-example.org
   C:  250 SIZE
   P:  MAIL FROM: <lester.tes...@dot.foo.bar>
   C:  250 OK
   P:  RCPT TO: <l.eva....@example.com>
   C:  250 OK, recipient accepted
   ...
   P:  QUIT
   C:  221 example.org closing connection

All this makes sense only after the client has authenticated,
otherwise ATRN could be used to steal mail.
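
As a rough illustration of the server-side check, the following Python
sketch parses one command line and applies the authentication rule
quoted from RFC 2645. The function name handle_atrn, the 500 reply for
other commands, and the lower-casing of domain names are assumptions
made for this example; only the 530 and 250 replies come from the text
above.

```python
def handle_atrn(line: str, authenticated: bool) -> tuple[int, str, list[str]]:
    """Return (reply code, reply text, requested domains) for one command."""
    verb, _, arg = line.strip().partition(" ")
    if verb.upper() != "ATRN":
        # Assumption: anything else is outside the scope of this sketch.
        return 500, "Command not recognized", []
    if not authenticated:
        # RFC 2645: ATRN MUST be rejected on an unauthenticated session.
        return 530, "Authentication required", []
    # The domain list is optional and comma-separated.
    domains = [d.strip().lower() for d in arg.split(",") if d.strip()]
    return 250, "OK now reversing the connection", domains
```

With the example session above, handle_atrn("ATRN example.org,example.com",
True) yields a 250 reply and both domain names, while the same command
on an unauthenticated session yields 530.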

2 - Implementing ATRN with the Postfix connection cache
=======================================================

Postfix is a modular mail system. Different processes implement the
SMTP server function, the queue manager/scheduler function, and the
SMTP client function. ATRN is in apparent conflict with this division
of concerns.

ATRN provides one customer-initiated TCP connection between customer
and provider, over which mail is to be delivered to the customer.
This was not possible with Postfix before connection caching was
implemented.  The reason is a conflict in connection management:
the Postfix SMTP clients always attempted to make their own connection
to the addresses listed for the recipient domain, and were unable
to (re)use an existing connection.

This is where SMTP connection caching comes to the rescue. If the
SMTP server makes an SMTP connection cache entry for the connection
that was initiated by the customer's domain, and if Postfix is
configured so that at most one SMTP client process tries to deliver
mail to the customer domain, then this Postfix SMTP client process
can use the connection that the SMTP server entered into the
connection cache. The cached connection can be used multiple times;
it is not tied to any particular SMTP client process.

Postfix ATRN can build on the code that already implements support
for ETRN; this avoids the need to search a potentially large number
of queue files.

Why not implement a dedicated ATRN queue, or even one queue per
domain?  One message can have recipients in more than one domain,
and Postfix does not support queue file splitting.

3 - Multi-domain ATRN requests
==============================

The first complication is that ATRN allows the client to specify
multiple domains in the ATRN command. This is a problem for Postfix.
Suppose that the client sends two domain names with the command
"ATRN X,Y".  Once the cache manager gives the connection to a Postfix
SMTP client that has mail for domain X, that connection is no longer
in the cache. The Postfix SMTP client with mail for domain Y receives
a "not found" reply from the cache manager. Although the Postfix
SMTP client could be configured not to make SMTP connections by
itself, it would have no way to find out when, if ever, a cached
connection becomes available.
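
The "ATRN X,Y" problem can be reproduced with a toy model. The Python
sketch below is not the scache(8) protocol; ToyConnectionCache, the
"session-1" label and the x.example/y.example names are all made up for
illustration. The only property it models is that a retrieved
connection leaves the cache until its holder saves it again.

```python
class ToyConnectionCache:
    """Toy stand-in for a connection cache with hand-out semantics."""

    def __init__(self):
        self._slots = {}

    def save(self, key, conn):
        self._slots[key] = conn

    def retrieve(self, key):
        # The entry is removed; None models the "not found" reply.
        return self._slots.pop(key, None)


cache = ToyConnectionCache()
cache.save("session-1", "customer-connection")

# Both requested domains share the single customer connection.
slot_for_domain = {"x.example": "session-1", "y.example": "session-1"}

first = cache.retrieve(slot_for_domain["x.example"])   # client with mail for X
second = cache.retrieve(slot_for_domain["y.example"])  # client with mail for Y
```

Here `first` is the connection and `second` is None: the client for Y
gets no connection and has no way to learn when, or whether, the client
for X will put it back.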

Changing this would introduce a great deal of complexity: Postfix
SMTP clients would have to block on a connection cache lookup
request, and the connection cache manager would have to know that
a client does not return a connection to the cache (perhaps the
client has crashed), so that the connection cache manager can inform
a blocked SMTP client that the request can no longer be satisfied.
All this for marginal benefit, because very few potential users of
ATRN are expected to run multiple domains on a dynamic IP address.

Another limitation of this scheme is that if a customer makes
multiple overlapping SMTP connections with ATRN requests for the
same domain, only one connection will be used for mail delivery,
because Postfix will deliver mail to ATRN domains with an SMTP
destination concurrency of only one connection. There is no problem,
however, when the customer makes multiple overlapping connections
with ATRN requests for different domains.

4 - Interactions with TLS
=========================

The TLS implementation introduces one additional complication. TLS
is implemented inside Postfix SMTP server and client processes; it
is not possible to transfer TLS state from one process to another
without closing the connection. Note that this complication would
not exist had Postfix TLS support been implemented outside the SMTP
client and server processes; in that case we would simply pass around
a local socket that connects to the TLS proxy process.

Because of this complication, the Postfix SMTP server process has
to act as a proxy between the remote customer and the local Postfix
SMTP client; instead of entering the SMTP server socket into the
connection cache, the SMTP server enters one endpoint of a local
socketpair with a large reuse count and expiration time.  When the
Postfix SMTP client retrieves a connection from the cache it actually
gets that end of the local socketpair.  All communication with the
customer is sent through the Postfix SMTP server process, which
also does the TLS encapsulation/decapsulation.  After a mail
transaction, the Postfix SMTP client caches a good connection with
a large expiration time and decrements the connection reuse count.
This repeats until the connection expires from the cache, until the
customer disconnects, or until there is no more mail for the customer.
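
The reuse-count and expiration bookkeeping in the paragraph above could
look roughly like the Python sketch below. All names (CachedEndpoint,
save_after_delivery, is_usable) and the idea of passing the time-to-live
as a plain number are assumptions for illustration; the real connection
cache has its own data structures and protocol.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedEndpoint:
    endpoint: object    # one end of the local socketpair
    reuse_left: int     # deliveries still allowed over this endpoint
    expires_at: float   # absolute expiry time in seconds


def save_after_delivery(entry, ttl, now=None):
    """Entry to cache again after a good delivery, or None if used up."""
    now = time.monotonic() if now is None else now
    if entry.reuse_left <= 1:
        return None  # reuse count exhausted: do not cache again
    # Decrement the reuse count and grant a fresh expiration time.
    return CachedEndpoint(entry.endpoint, entry.reuse_left - 1, now + ttl)


def is_usable(entry, now=None):
    """True while the entry has reuses left and has not expired."""
    now = time.monotonic() if now is None else now
    return entry.reuse_left > 0 and now < entry.expires_at
```

The server would create the first entry with a large reuse_left and a
generous expiry; each successful delivery then trades one reuse for a
renewed expiry time.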

5 - ATRN User interface
=======================

Setting up ETRN is as simple as listing the domain in relay_domains,
and providing a valid recipient list.  It would be nice if ATRN
setup were just as easy.

Setting up ATRN requires:

- Authorization table with (ATRN domain name -> SASL login name)
mapping.  Without this table any SASL user could invoke ATRN and
steal mail.

- Postfix needs the list of ATRN domains to disable spontaneous
creation of connections for ATRN domains, and to defer delivery if
no connection is cached for such a domain.

- Postfix needs a dedicated "atrn" transport in master.cf, with a
main.cf destination concurrency of 1. This transport can be installed
as a default entry in future master.cf files (the alternative is
to add a completely new scheduling mechanism in the form of a
per-domain concurrency map).

- If we use a dedicated "atrn" mail delivery transport, then we either
need transport map entries for all ATRN domains (ugly), or we need
to introduce an "atrn" address class. The latter is preferable, but
implies that ATRN domains are in a different address class than
ETRN domains (which are in the relay domains class). See also the
section on migration issues below.
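
To make the first requirement concrete, here is a minimal Python sketch
of the authorization check, with the (ATRN domain name -> SASL login
name) table modeled as a plain dictionary. The function name
atrn_authorized, the sample logins, and the case-insensitive domain
match are assumptions for this example, not part of the design.

```python
def atrn_authorized(domain: str, sasl_login: str,
                    domain_login_map: dict[str, str]) -> bool:
    """True only if the table maps this ATRN domain to the given login."""
    owner = domain_login_map.get(domain.lower())
    # Unknown domains and foreign logins are both refused.
    return owner is not None and owner == sasl_login


# Hypothetical table contents.
table = {"example.org": "customer1", "example.com": "customer2"}
```

With this table, customer1 may request example.org but not example.com,
and no login may request a domain that is not listed.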

In conclusion, ATRN support can be configured with one configuration
parameter, and with a bunch of defaults that never need to be
changed:

Non-default entries: this is the only thing that you must specify:

    main.cf:
        atrn_domain_login_maps = type:table 
            (this provides the ATRN domain name -> SASL login name mapping)

Optional, but highly recommended:

        atrn_recipient_maps = type:table
            (this provides the list of valid recipients)

Default entries, never to be changed:

    main.cf:
        atrn_destination_concurrency_limit=1
        atrn_domains = $atrn_domain_login_maps
            (this uses the authorization table as a list of ATRN domains)
        atrn_transport = atrn
        fast_flush_domains = $relay_domains $atrn_domains

    master.cf:
        atrn      unix  -       -       n       -       -       smtp
            -o smtp_session_cache_only=yes

So, the upshot is that ATRN can be done with one configuration
parameter that specifies the (ATRN domain name -> SASL login name)
mapping, and an optional table for recipient address validation.
With ATRN sessions proxied by the Postfix SMTP server process, Postfix
can handle ATRN requests that specify a single domain, even over
STARTTLS sessions. This will work well as long as the active queue
is not saturated, so that ATRN connections won't expire from the
connection cache before Postfix finishes delivering mail for the
domain.

6 - Migration between ETRN and ATRN
===================================

The above implementation is elegant but has one downside: a given
domain cannot use both ETRN and ATRN, which complicates migration.
Migration from ETRN to ATRN, or vice versa, requires that customer
and provider make the transition at the same time.

On the provider's end, the following steps are taken when migrating
a customer from ETRN to ATRN:

1 - Copy the customer's valid recipient list from relay_recipient_maps
    to atrn_recipient_maps. Do not update relay_recipient_maps.

2 - Populate the atrn_domain_login_maps table with (domain->login)
    mapping to prevent theft of mail.

3 - Remove the customer domain from relay_domains, to stop nagging
    warnings from Postfix that the domain is listed in multiple
    address classes.

4 - Remove the customer's valid recipient list from relay_recipient_maps.
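
The ordering of the four ETRN-to-ATRN steps can be illustrated with
in-memory tables. In this Python sketch the lookup tables are modeled as
sets and a dictionary, and the function name migrate_etrn_to_atrn is
hypothetical; a real site would edit its map files and rebuild them
instead.

```python
def migrate_etrn_to_atrn(domain, login, relay_domains, relay_recipients,
                         atrn_domain_logins, atrn_recipients):
    """Apply the four provider-side steps, in order, to in-memory tables."""
    suffix = "@" + domain
    moved = {r for r in relay_recipients if r.endswith(suffix)}
    # Step 1: copy the recipients; the relay list is left alone for now.
    atrn_recipients.update(moved)
    # Step 2: record the (domain -> login) mapping to prevent mail theft.
    atrn_domain_logins[domain] = login
    # Step 3: the domain must end up in only one address class.
    relay_domains.discard(domain)
    # Step 4: drop the recipient entries that are no longer used.
    relay_recipients.difference_update(moved)
```

Because the recipients are copied before anything is removed, the
customer's recipient list is present in at least one table at every
step.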

The migration from ATRN to ETRN is as follows:

1 - Copy the customer's valid recipient list from atrn_recipient_maps
    to relay_recipient_maps. Do not update atrn_recipient_maps.

2 - List the customer domain in relay_domains.

3 - Remove the customer domain from atrn_domain_login_maps, to stop
    nagging warnings from Postfix that the domain is listed in
    multiple address classes.

4 - Remove the customer's valid recipient list from atrn_recipient_maps.

        Wietse
