EDNS / DANE trouble with Microsoft mail.protection.outlook.com.

Walter Doekes Wed, 16 Nov 2016 14:16:30 -0800

Hi there list,

this week we stumbled upon an issue where we could not send mail to certain domains, for instance em...@umcg.nl.

Nov 16 17:04:08 mail postfix/smtp[13330]: warning: no MX host for umcg.nl has a valid address record
Nov 16 17:04:08 mail postfix/smtp[13330]: 1D1D21422C2: to=<em...@umcg.nl>, relay=none, delay=2257, delays=2256/0.02/0.52/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=umcg-nl.mail.protection.outlook.com type=A: Host not found, try again)


It turned out that this was the cause:

  $ dig MX umcg.nl +short
  10 umcg-nl.mail.protection.outlook.com.

  $ dig NS mail.protection.outlook.com. +short
  ns1-proddns.glbdns.o365filtering.com.
  ns2-proddns.glbdns.o365filtering.com.

  $ dig A umcg-nl.mail.protection.outlook.com.  \
      @ns1-proddns.glbdns.o365filtering.com. +edns +dnssec |
    grep FORMERR
  ;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 46904
  ;; WARNING: EDNS query returned status FORMERR -
      retry with '+nodnssec +noedns'

Apparently the name servers for some Microsoft Office 365 mail hosts do not support EDNS and return FORMERR. Our DNS recursors passed this on as SERVFAIL, which caused the lookup to fail.
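
To compare the two cases side by side, something like the following shell helpers can be used. This is only a sketch: `dig_status` and `check_edns` are names made up for this example, and the only thing they rely on is the `status:` field in dig's header line.

```shell
# Print the response status (NOERROR, FORMERR, ...) found in dig output on stdin.
dig_status() {
    sed -n 's/^;; ->>HEADER<<- .*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Query the same name twice, with and without EDNS, and report both statuses.
check_edns() {
    name=$1
    server=$2
    with=$(dig A "$name" "@$server" +edns +dnssec | dig_status)
    without=$(dig A "$name" "@$server" +noedns +nodnssec | dig_status)
    echo "edns=$with noedns=$without"
}
```

Running `check_edns umcg-nl.mail.protection.outlook.com. ns1-proddns.glbdns.o365filtering.com.` prints one line with both statuses, which makes it easy to spot a server that only fails when EDNS is used.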

A temporary workaround was to preheat the DNS cache by manually querying said domain without EDNS and then flush the queue entries:


  $ dig A umcg-nl.mail.protection.outlook.com. \
      @ns1-proddns.glbdns.o365filtering.com. +noedns +nodnssec +short
  213.199.154.87
  213.199.154.23

  # postqueue -i THE_ITEM

But that's obviously not the right solution.

Some more digging revealed that EDNS was enabled on the query through `smtp_addr_list`:


     else if (smtp_tls_insecure_mx_policy > TLS_LEV_MAY)
        res_opt = RES_USE_DNSSEC;

RES_USE_DNSSEC causes the subsequent queries to use RES_USE_EDNS0 with the DO flag set, and that killed our interoperability with the Microsoft Office 365 DNS.

The fix was then to lower `smtp_tls_insecure_mx_policy` from 5 (dane) to 1 (may):


    # default: dane
    smtp_tls_dane_insecure_mx_policy=may

For the record, according to the logs this miscommunication started on our servers on the 2nd of November (although I cannot rule out that something changed on our side). Running Postfix 3.1.0-3 (Ubuntu Xenial) here.



My questions -- finally:

- Apart from Microsoft upgrading their servers to 2016 and supporting EDNS, is this issue something Postfix should handle?

- Would Postfix have handled FORMERR but not SERVFAIL, and are my caching resolvers to blame?


- Should Postfix retry the query without EDNS on unexpected errors?

- Should the default smtp_tls_dane_insecure_mx_policy be set to 'dane'? Or should something more conservative be appropriate if it's able to cause this kind of miscommunication?
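
To make the retry question concrete: at the shell level, the fallback would look roughly like the sketch below. It only illustrates the idea with dig and says nothing about how Postfix's DNS client code is structured; `lookup_a` is a hypothetical name.

```shell
# Look up A records at a given server, falling back to a plain
# (non-EDNS) query when the EDNS query is answered with FORMERR.
lookup_a() {
    name=$1
    server=$2
    flags="+edns +dnssec"
    if dig A "$name" "@$server" $flags | grep -q 'status: FORMERR'; then
        # The server rejected the EDNS query; retry without EDNS and DO.
        flags="+noedns +nodnssec"
    fi
    # $flags is deliberately unquoted so it splits into separate options.
    dig A "$name" "@$server" $flags +short
}
```

Retrying only on FORMERR keeps the DNSSEC-aware query as the default. Whether a SERVFAIL handed back by a recursor should trigger the same retry is exactly the open question.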




Thanks for your input.

Cheers,
Walter Doekes
OSSO B.V.
