Re: [exim-dev] DANE work

2014-09-04 Thread Jeremy Harris
On 03/09/14 05:13, Viktor Dukhovni wrote:
 On Wed, Sep 03, 2014 at 12:27:16AM +, Phil Pennock wrote:
 
 My jaw dropped when I returned to my list index view and saw just how
 many patches came in from work this weekend by Todd and Jeremy to
 implement DANE in Exim.

 Seriously good work which makes me very happy and feeling that I owe
 drinks to you both.  Thank you for picking up where I totally slacked
 off and working on implementing this.  (Thank you for removing one of my
 guilt layers of I need to find time to work on X too).
 
 Yes, thanks a bunch.  I have a few questions about the code.
 Specifically:
 
 verify_callback_client_dane(),(lines 469--471)
 tlsa_lookup(),(lines 1691--1713)
 dane_tlsa_load(), (all lines)
 
 transports/smtp.c:(lines 3297--3306,
tempfail_try_clear logic)
 
 Since I only read the commit diff, I am missing some context,
 so please pardon my ignorance...  I don't know where the logic
 is that tracks the DNSSEC status of the MX lookup to determine
 whether the subsequent per MX-host lookups can be considered
 secure.

There's a flag dnssec on a host struct, filled in by
host_find_bydns() called (usually) from the dnslookup
router, and checked by tls_client_start()

  [ in the version you have, but I'm about to pull it
  up a layer to smtp_deliver() ]

in deciding whether to try DANE.


 I am not sure that DNS lookup errors are handled correctly.  I
 could not find code that distinguishes between lookups failure and
 NXDOMAIN (which is not a lookup error).

Nor me.  What action do you suggest for which?

 
 I don't at first glance see code that enforces STARTTLS, without
 authentication, when a secure TLSA RRset exists, but all records
 are unusable.  I did not find any code to coerce PKIX-EE(1)
 and PKIX-TA(0) to unusable.
 
 I did notice that unsupported matching types seem to be needlessly
 treated as errors, rather than as an unusable record among possibly
 usable other records.

Those are all whatever your library does.

 
 I am confused by the interaction with tempfail try clear.

That'll be dealt with by the change above; intent is to
disable the try-in-clear possibility if dane was requested
and the tlsa lookup succeeded.

 
 The DANE-specific verify callback seems to assume that any call in
 which the verification status is OK is sufficient to make the peer's
 chain valid, but that need not be the case when the chain length
 is greater than 1.  It may be best to not infer any final status
 in that callback, and check the verification state after the
 connection completes, sending a QUIT (to the MiTM perhaps)
 and gracefully closing the connection if verification failed.
 

My understanding was that any fail return code to the sequence
of callbacks was enough to tell the OpenSSL library to fail the
TLS startup, and that there would be at least one callback per
chain element.  Is this not the case?
-- 
Cheers,
  Jeremy



-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


Re: [exim-dev] DANE work

2014-09-04 Thread Viktor Dukhovni
On Thu, Sep 04, 2014 at 11:02:39PM +0100, Jeremy Harris wrote:

  Since I only read the commit diff, I am missing some context,
  so please pardon my ignorance...  I don't know where the logic
  is that tracks the DNSSEC status of the MX lookup to determine
  whether the subsequent per MX-host lookups can be considered
  secure.
 
 There's a flag dnssec on a host struct, filled in by
 host_find_bydns() called (usually) from the dnslookup
 router, and checked by tls_client_start()
 
   [ in the version you have, but I'm about to pull it
   up a layer to smtp_deliver() ]
 
 in deciding whether to try DANE.

So DANE is only attempted when both the MX and the host's A or 
RRsets are secure, right?  Also, a long time ago when I looked at
the base dnssec lookup code, I was not confident that chains of
CNAMEs were necessarily handled correctly in terms of making sure
that all the links are secure (typically recursive resolvers do
some of the chaining for you, but you need to be prepared to recurse
beyond that in some cases, and in that case track the security
status along each CNAME hop).

  I am not sure that DNS lookup errors are handled correctly.  I
  could not find code that distinguishes between lookups failure and
  NXDOMAIN (which is not a lookup error).
 
 Nor me.  What action do you suggest for which?

This is section 2.1 of the SMTP with DANE draft.  NXDOMAIN is not
an error, just a finding that the record does not exist.  When TLSA
RRs don't exist, that's fine, DANE does not apply.  Any other type
of lookup error (timeout, servfail, ...) needs to be considered as
a potential active attack, and the MX host in question skipped.

  I don't at first glance see code that enforces STARTTLS, without
  authentication, when a secure TLSA RRset exists, but all records
  are unusable.  I did not find any code to coerce PKIX-EE(1)
  and PKIX-TA(0) to unusable.
  
  I did notice that unsupported matching types seem to be needlessly
  treated as errors, rather than as an unusable record among possibly
  usable other records.
 
 Those are all whatever your library does.

Postfix applies additional logic above that library to process just
the supported TLSA RRs.  Unusable TLSA RRs lead to mandatory
unauthenticated TLS if no usable RRs are found.

  The DANE-specific verify callback seems to assume that any call in
  which the verification status is OK is sufficient to make the peer's
  chain valid, but that need not be the case when the chain length
  is greater than 1.  It may be best to not infer any final status
  in that callback, and check the verification state after the
  connection completes, sending a QUIT (to the MiTM perhaps)
  and gracefully closing the connection if verification failed.
 
 My understanding was that any fail return code to the sequence
 of callbacks was enough to tell the OpenSSL library to fail the
 TLS startup, and that there would be at least one callback per
 chain element.  Is this not the case?

Oh I see, you terminate connections early... Postfix allows the
connection to continue, gathers information and enforces policy
after the handshake.

If you never return a non-zero value from the callback when the
input status is zero, you might be OK.

So the immediate issues to address (not necessarily exhaustive)
are:

*   DNS error handling
*   Handling of unsupported TLSA RR values (unusable records),
and in particular PKIX-TA(0)/PKIX-EE(1) which the library
supports, but MTAs should not.

Later:

*   Digest agility, either as new code in my library,
or additional code in Exim.

Plus of course a lot more testing and a more thorough code review.
You might also consider ways to refactor the code that make it more
obviously complete/correct.

For PKIX-TA(2), the library needs to be told which hostnames are
valid in the leaf certificate, you need to pass the original nexthop
domain, its CNAME expansion (if different) and the TLSA base domain
(MX hostname).  If you support MX hosts that are CNAMEs (illegal,
but often tolerated) then the TLSA base domain should be the CNAME
expansion of the MX hostname and only failing that the original MX
hostname.  I don't recall running into the code that handles this.

-- 
Viktor.

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


Re: [exim-dev] DANE work

2014-09-03 Thread Jeremy Harris
On 03/09/14 01:27, Phil Pennock wrote:
 My jaw dropped when I returned to my list index view and saw just how
 many patches came in from work this weekend by Todd and Jeremy to
 implement DANE in Exim.

I did wonder if I should have squashed to a single commit to avoid
the mess in the timeline.  Do we have a preference, in git usage on
the Exim master?

Too late now for this one, I guess.
-- 
Cheers,
  Jeremy



-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


Re: [exim-dev] DANE work

2014-09-03 Thread Todd Lyons
On Wed, Sep 3, 2014 at 1:05 AM, Jeremy Harris j...@wizmail.org wrote:
 On 03/09/14 01:27, Phil Pennock wrote:
 My jaw dropped when I returned to my list index view and saw just how
 many patches came in from work this weekend by Todd and Jeremy to
 implement DANE in Exim.

Jeremy gets to take the bow for this one.  My contributions are the
initial build plumbing and incorporating Viktor's OpenSSL based DANE
framework so that it built.  Jeremy did all the hard work of doing DNS
lookups, certificate handling, actually calling into the DANE
framework, analyzing results, etc.

How do we do licensing, since that entire file (dane-openssl.c) is
Viktor's original code?  Viktor, how is that code licensed?

 I did wonder if I should have squashed to a single commit to avoid
 the mess in the timeline.  Do we have a preference, in git usage on
 the Exim master?

Back when I committed the DMARC patches, Phil had me target a rebase
for 2-5 distinct patches.  I think that's probably what we should use
as a goal in the future.  I do like your numerous individual patches
too though since I can see and better understand incremental steps in
the thought process.

...Todd
-- 
The total budget at all receivers for solving senders' problems is $0.
If you want them to accept your mail and manage it the way you want,
send it the way the spec says to. --John Levine

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


[exim-dev] DANE work

2014-09-02 Thread Phil Pennock
My jaw dropped when I returned to my list index view and saw just how
many patches came in from work this weekend by Todd and Jeremy to
implement DANE in Exim.

Seriously good work which makes me very happy and feeling that I owe
drinks to you both.  Thank you for picking up where I totally slacked
off and working on implementing this.  (Thank you for removing one of my
guilt layers of I need to find time to work on X too).

Happiness.  :)
-Phil

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


Re: [exim-dev] DANE work

2014-09-02 Thread Viktor Dukhovni
On Wed, Sep 03, 2014 at 12:27:16AM +, Phil Pennock wrote:

 My jaw dropped when I returned to my list index view and saw just how
 many patches came in from work this weekend by Todd and Jeremy to
 implement DANE in Exim.
 
 Seriously good work which makes me very happy and feeling that I owe
 drinks to you both.  Thank you for picking up where I totally slacked
 off and working on implementing this.  (Thank you for removing one of my
 guilt layers of I need to find time to work on X too).

Yes, thanks a bunch.  I have a few questions about the code.
Specifically:

verify_callback_client_dane(),  (lines 469--471)
tlsa_lookup(),  (lines 1691--1713)
dane_tlsa_load(),   (all lines)

transports/smtp.c:  (lines 3297--3306,
 tempfail_try_clear logic)

Since I only read the commit diff, I am missing some context,
so please pardon my ignorance...  I don't know where the logic
is that tracks the DNSSEC status of the MX lookup to determine
whether the subsequent per MX-host lookups can be considered
secure.

I am not sure that DNS lookup errors are handled correctly.  I
could not find code that distinguishes between lookups failure and
NXDOMAIN (which is not a lookup error).

I don't at first glance see code that enforces STARTTLS, without
authentication, when a secure TLSA RRset exists, but all records
are unusable.  I did not find any code to coerce PKIX-EE(1)
and PKIX-TA(0) to unusable.

I did notice that unsupported matching types seem to be needlessly
treated as errors, rather than as an unusable record among possibly
usable other records.

I am confused by the interaction with tempfail try clear.

The DANE-specific verify callback seems to assume that any call in
which the verification status is OK is sufficient to make the peer's
chain valid, but that need not be the case when the chain length
is greater than 1.  It may be best to not infer any final status
in that callback, and check the verification state after the
connection completes, sending a QUIT (to the MiTM perhaps)
and gracefully closing the connection if verification failed.

-- 
Viktor.

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##