Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On 8/22/2020 7:02 AM, Viktor Dukhovni wrote: > On Fri, Aug 21, 2020 at 05:38:42PM -0400, Wietse Venema wrote: > >> thorsten.hab...@findichgut.net: >>> Any chance to backport the patch to 3.4/3.5? >> This is more change than is allowed in a stable release. Postfix >> 3.6 drops support for OpenSSL < 1.1.1, deletes o(thousand) lines >> of DANE support from the Postfix TLS library, and replaces it with >> o(hundred) lines to use instead the DANE support in OpenSSL. > The backport request was just for the one-liner fix in posttls-finger, > where "-X" no longer falsely conflicts with "-r" (when no "-r" is > in fact specified). This should/will likely be backported. Yes, sorry. I removed the patch I posted earlier in my message. Because I am not sure that it's correct: --- a/src/posttls-finger/posttls-finger.c 2019-02-12 14:17:45.0 +0100 +++ b/src/posttls-finger/posttls-finger.c.new 2020-08-21 09:15:04.256945675 +0200 @@ -1988,7 +1988,7 @@ msg_fatal("bad '-a' option value: %s", state->options.addr_pref); #ifdef USE_TLS - if (state->tlsproxy_mode && state->reconnect) + if (state->tlsproxy_mode && state->reconnect > 0) msg_fatal("The -X and -r options are mutually exclusive"); #endif while it's state->reconnect >= 0 in the 3.6 snapshot. I checked the source of the 3.5.6 posttls-finger.c again and I am still not sure which version should be used for 3.5.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On Fri, Aug 21, 2020 at 05:38:42PM -0400, Wietse Venema wrote: > thorsten.hab...@findichgut.net: > > Any chance to backport the patch to 3.4/3.5? > > This is more change than is allowed in a stable release. Postfix > 3.6 drops support for OpenSSL < 1.1.1, deletes o(thousand) lines > of DANE support from the Postfix TLS library, and replaces it with > o(hundred) lines to use instead the DANE support in OpenSSL. The backport request was just for the one-liner fix in posttls-finger, where "-X" no longer falsely conflicts with "-r" (when no "-r" is in fact specified). This should/will likely be backported. -- Viktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
Viktor Dukhovni: > On Fri, Aug 21, 2020 at 03:11:50PM -0400, Wietse Venema wrote: > > > Viktor Dukhovni: > > > On Fri, Aug 21, 2020 at 10:59:11AM -0400, Wietse Venema wrote: > > > > > > > > Viktor Dukhovni: > > > > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > > > > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > > > > > msg_warn("%s: DANE requested, but not available", > > > > > > state->client_start_props->namaddr); > > > > > > > > Should there be a warning when tls_dane_avail() fails AND the > > > > TLS_DANE_BASED is true? > > > > > > Not needed if TLS_DANE_HASTA is not true, because: > > > > In that case, can you can suggest a more appropriate warning message? > > The text no longer matches the error condition. > > Fair point. The warning message could/should read: > > msg_warn("%s: DANE or local trust anchor based chain" > " verification requested, but not available", >state->client_start_props->namaddr); DANE verification requested? This condition triggers when the SMTP client (or posttls-finger) specifies an explicit trust anchor. They do not request DANE. Are you saying that the condition can also trigger when the SMTP client (or posttls-finger) tries to enforce DANE? Wietse Wietse
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
> On Aug 21, 2020, at 5:21 PM, thorsten.hab...@findichgut.net wrote: > > By the way I already applied your last patch on the testing environment. > No problems found so far. tafile and CApath based mandatory TLS delivery > work just fine. Thanks for the confirmation. Fortunately, the good news is not surprising, the reason for the intermittent (more failure than success) problem you were having, and only in tlsproxy(8) is clear from the patch. The wrong TLS SSL_CTX was selected for "tafile" connections, it was shared with normal WebPKI connections which raced the "tafile" connections to set the correct verification callback. With the symptoms fitting the bug so well, the confirmation is more of a formality, but still good to have. Sorry it took a while to get here, but the early messages in the thread had me focused on resumption, rather than the initial verification failure, which was the real problem. -- Viktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
thorsten.hab...@findichgut.net: > Any chance to backport the patch to 3.4/3.5? This is more change than is allowed in a stable release. Postfix 3.6 drops support for OpenSSL < 1.1.1, deletes o(thousand) lines of DANE support from the Postfix TLS library, and replaces it with o(hundred) lines to use instead the DANE support in OpenSSL. > By the way I already applied your last patch on the testing > environment. No problems found so far. tafile and CApath based > mandatory TLS delivery work just fine. The bug was a race condition between concurrent TLS handshakes that were updating the same callback hook in global state, therefore it only happened with concurrent deliveries to different sites. Wietse
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On Fri, Aug 21, 2020 at 03:11:50PM -0400, Wietse Venema wrote: > Viktor Dukhovni: > > On Fri, Aug 21, 2020 at 10:59:11AM -0400, Wietse Venema wrote: > > > > > > Viktor Dukhovni: > > > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > > > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > > > > msg_warn("%s: DANE requested, but not available", > > > > >state->client_start_props->namaddr); > > > > > > Should there be a warning when tls_dane_avail() fails AND the > > > TLS_DANE_BASED is true? > > > > Not needed if TLS_DANE_HASTA is not true, because: > > In that case, can you can suggest a more appropriate warning message? > The text no longer matches the error condition. Fair point. The warning message could/should read: msg_warn("%s: DANE or local trust anchor based chain" " verification requested, but not available", state->client_start_props->namaddr); -- Viktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
Viktor Dukhovni: > On Fri, Aug 21, 2020 at 10:59:11AM -0400, Wietse Venema wrote: > > > > Viktor Dukhovni: > > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > > > msg_warn("%s: DANE requested, but not available", > > > > state->client_start_props->namaddr); > > > > Should there be a warning when tls_dane_avail() fails AND the > > TLS_DANE_BASED is true? > > Not needed if TLS_DANE_HASTA is not true, because: In that case, can you can suggest a more appropriate warning message? The text no longer matches the error condition. Wietse
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On Fri, Aug 21, 2020 at 10:59:11AM -0400, Wietse Venema wrote: > > Viktor Dukhovni: > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > > msg_warn("%s: DANE requested, but not available", > > >state->client_start_props->namaddr); > > Should there be a warning when tls_dane_avail() fails AND the > TLS_DANE_BASED is true? Not needed if TLS_DANE_HASTA is not true, because: - For a DANE-based policy without DANE-TA TLSA RRs to have made it to tlsproxy(8), all the DNS preconditions have already been satisfied, and DANE-EE TLSA records have been provided to tlsproxy(8). - In Postfix 2.11–3.5, DANE-EE checks are performed post-handshake. Specifically, in particular the DANE-style X.509 verification callback is not needed and is not enabled (is set back to WebPKI default) on the shared SSL_CTX. > Would the following be more correct: > >int missing_infrastructure = 0; > if (!tls_dane_avail()) { /* mandatory side effects!! */ > /* True DANE request. */ > if (TLS_DANE_BASED(state->client_start_props->tls_level)) { > msg_warn("%s: DANE requested, but not available", >state->client_start_props->namaddr); > missing_infrastructure = 1; > } > /* Not DANE, but TA support implicitly dependss on the DANE stack. */ > else if (TLS_DANE_HASTA(state->client_start_props->dane)) { > msg_warn("%s: TA support requested, but DANE is not available", >state->client_start_props->namaddr); > missing_infrastructure = 1; > } > ) > if (missing_infrastructure == 0) > state->tls_context = tls_client_start(state->client_start_props); No, because DANE-EE works without tls_dane_avail. We just check the certificate fingerprint post-handshake. > But wait, there is more... > > > > state->appl_state = tlsp_client_init(state->tls_params, > > >state->client_init_props, > > > - TLS_DANE_BASED(state->client_start_props->tls_level)); > > > + TLS_DANE_HASTA(state->client_start_props->dane)); > > Will this also use the right verify callback function pointer when > real DANE is requested? Or does real DANE not use those same > callbacks? In Postfix 2.11–3.5, the real DANE-EE does not use the custom DANE-specific X.509 verification callbacks, they're only needed for DANE-TA verification. The dichotomy was between WebPKI-style chain verification and DANE-TA (or "tafile") chain verification. When all the TLSA records are DANE-EE, no chain verification is performed, we just do a post-handshake fingerprint check (supporting both DANE-EE and the "fingerprint" security level). All this simplified in Postfix 3.6. Speaking of which, upgrades from 3.5 to 3.6 need to perform a "reload", to flush the TLS session cache. Otherwise, some cached sessions returned by the "old" tlsmgr(8) to new smtp(8) may have the wrong properties. One way to avoid this is to include "mail_version" in the session lookup key hash. We should probably do that... -- VIktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
I have more questions. Wietse Venema: > Viktor Dukhovni: > > state->client_start_props->fd = state->ciphertext_fd; > > /* These predicates and warning belong inside tls_client_start(). */ > > if (!tls_dane_avail() /* mandatory side effects!! */ > > - &_DANE_BASED(state->client_start_props->tls_level)) > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > msg_warn("%s: DANE requested, but not available", > > state->client_start_props->namaddr); Should there be a warning when tls_dane_avail() fails AND the TLS_DANE_BASED is true? Would the following be more correct: int missing_infrastructure = 0; if (!tls_dane_avail()) {/* mandatory side effects!! */ /* True DANE request. */ if (TLS_DANE_BASED(state->client_start_props->tls_level)) { msg_warn("%s: DANE requested, but not available", state->client_start_props->namaddr); missing_infrastructure = 1; } /* Not DANE, but TA support implicitly dependss on the DANE stack. */ else if (TLS_DANE_HASTA(state->client_start_props->dane)) { msg_warn("%s: TA support requested, but DANE is not available", state->client_start_props->namaddr); missing_infrastructure = 1; } ) if (missing_infrastructure == 0) state->tls_context = tls_client_start(state->client_start_props); Sorry that the above looks like "my first program" but it is the best I can do given the libtls API semantics. But wait, there is more... > > state->appl_state = tlsp_client_init(state->tls_params, > > state->client_init_props, > > - TLS_DANE_BASED(state->client_start_props->tls_level)); > > + TLS_DANE_HASTA(state->client_start_props->dane)); Will this also use the right verify callback function pointer when real DANE is requested? Or does real DANE not use those same callbacks? Wietse
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On Fri, Aug 21, 2020 at 10:32:10AM +0300, Thorsten Habich wrote: > > This is relevant, but probably not 100% accurate, likely some domains > > also intermittently failed routine CAfile-based validation. > > Thanks for the patch. There was no higher number of certificate > verification failures since I updated to Postfix 3.5.4. > > Checking the logs of the last 8 days, there is only one "Server > certificate not trusted" error for a CApath-based configuration and 348 > errors for TAfile based configurations. > The number of *CApath* (if that's relevant) based mandatory TLS > configurations per destination is currently FAR higher than the tafile > based configurations. This is expected, given that most destinations are not using "tafile", the probability of failing a CApath validation during a concurrent successful "tafile" delivery is low, but not zero. If that one validation failure was transient (the destination otherwise verified before and after) then that could be the one low probability case. > Hope this patch is suitable: > > --- a/src/posttls-finger/posttls-finger.c 2019-02-12 > 14:17:45.0 +0100 > +++ b/src/posttls-finger/posttls-finger.c.new 2020-08-21 > 09:15:04.256945675 +0200 > @@ -1988,7 +1988,7 @@ > msg_fatal("bad '-a' option value: %s", state->options.addr_pref); > > #ifdef USE_TLS > - if (state->tlsproxy_mode && state->reconnect) > + if (state->tlsproxy_mode && state->reconnect > 0) > msg_fatal("The -X and -r options are mutually exclusive"); > #endif This has already been fixed in the 3.6 snapshots. Don't recall whether that's been backported to 3.4/3.5. -- Viktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
Viktor Dukhovni: > On Thu, Aug 20, 2020 at 01:20:00PM -0400, Wietse Venema wrote: > > > Viktor Dukhovni: > > > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > > @@ -1427,7 +1427,7 @@ static void tlsp_get_request_event(int event, void > > > *context) > > > - TLS_DANE_BASED(state->client_start_props->tls_level)); > > > + TLS_DANE_HASTA(state->client_start_props->dane)); > > > > This looks weird. I thought that the problem was with trust anchors, not > > DANE? > > Yes, the problem is with trust anchors, but DANE is the general case of: > > * Policy-based end-entity cert matching: > > - DANE "_25._tcp.example.net. IN TLSA 3 ? ? ..." > > - The Postfix "fingerprint" security level > > * Policy-based issuer CA cert matching: > > - DANE "_25._tcp.example.net. IN TLSA 2 ? ? ..." > > - The Postfix verify/secure levels with a custom per-site > "tafile" . > > Actual DANE TLSA RRsets can have either or both DANE-EE or DANE-TA > records, with verification ultimately matching either or both. The > "fingerprint" level is mapped to DANE-EE, while "tafile" support is > mapped to DANE-TA. > > Thus actual DANE, fingerprint and secure/verify with a "tafile" are all > handled via the "general case" of "some sort of DANE-like policy". > > In Postfix 3.6, the job of validating "some sort DANE-like policy" is > entirely delegated to OpenSSL. You'll be pleased to know, that in > Postfix 3.6 the TLS_DANE_HASTA() and TLS_DANE_HASEE() macros are gone. > We no longer need to treat the various DANE-like matching differently. As discussed offlist, I would have structured the code in a different manner, such that trust-anchor support does not call into the DANE stack, but DANE and trust anchors are entirely separate features that call into a common infrastructure. In any case most of that code is gone with Postfix 3.6. Wietse
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
On Thu, Aug 20, 2020 at 01:20:00PM -0400, Wietse Venema wrote: > Viktor Dukhovni: > > > - &_DANE_BASED(state->client_start_props->tls_level)) > > + && TLS_DANE_HASTA(state->client_start_props->dane)) > > @@ -1427,7 +1427,7 @@ static void tlsp_get_request_event(int event, void > > *context) > > - TLS_DANE_BASED(state->client_start_props->tls_level)); > > + TLS_DANE_HASTA(state->client_start_props->dane)); > > This looks weird. I thought that the problem was with trust anchors, not DANE? Yes, the problem is with trust anchors, but DANE is the general case of: * Policy-based end-entity cert matching: - DANE "_25._tcp.example.net. IN TLSA 3 ? ? ..." - The Postfix "fingerprint" security level * Policy-based issuer CA cert matching: - DANE "_25._tcp.example.net. IN TLSA 2 ? ? ..." - The Postfix verify/secure levels with a custom per-site "tafile" . Actual DANE TLSA RRsets can have either or both DANE-EE or DANE-TA records, with verification ultimately matching either or both. The "fingerprint" level is mapped to DANE-EE, while "tafile" support is mapped to DANE-TA. Thus actual DANE, fingerprint and secure/verify with a "tafile" are all handled via the "general case" of "some sort of DANE-like policy". In Postfix 3.6, the job of validating "some sort DANE-like policy" is entirely delegated to OpenSSL. You'll be pleased to know, that in Postfix 3.6 the TLS_DANE_HASTA() and TLS_DANE_HASEE() macros are gone. We no longer need to treat the various DANE-like matching differently. -- Viktor.
Re: PATCH #3 (Postfix 3.4 + 3.5): TLS connection_reuse with "tafile"
Viktor Dukhovni: > state->client_start_props->fd = state->ciphertext_fd; > /* These predicates and warning belong inside tls_client_start(). */ > if (!tls_dane_avail()/* mandatory side effects!! */ > - &_DANE_BASED(state->client_start_props->tls_level)) > + && TLS_DANE_HASTA(state->client_start_props->dane)) > msg_warn("%s: DANE requested, but not available", >state->client_start_props->namaddr); > else > @@ -1427,7 +1427,7 @@ static void tlsp_get_request_event(int event, void > *context) > } > state->appl_state = tlsp_client_init(state->tls_params, >state->client_init_props, > - TLS_DANE_BASED(state->client_start_props->tls_level)); > + TLS_DANE_HASTA(state->client_start_props->dane)); > ready = state->appl_state != 0; > break; > case TLS_PROXY_FLAG_ROLE_SERVER: This looks weird. I thought that the problem was with trust anchors, not DANE? Wietse