Incident Report - CAA misissuance (was Re: Lack of CAA checking at Comodo)

Rob Stradling via dev-security-policy Tue, 12 Sep 2017 10:11:07 -0700

On 11/09/17 15:30, Rob Stradling via dev-security-policy wrote:

Hi Hanno. Thanks for reporting this to us. We acknowledge the problem, and as I mentioned at [1], we took steps to address it this morning.
We will follow-up with an incident report ASAP.


INCIDENT REPORT

We received two Problem Reports - from Hanno Böck on 9th September at 20:10 UTC, and from Jonathan Rudenberg on 10th September at 00:08 UTC - each of which reported that we had misissued a certificate contrary to a published CAA RRset. Jonathan reported this problem at https://bugzilla.mozilla.org/show_bug.cgi?id=1398545, and in https://bugzilla.mozilla.org/show_bug.cgi?id=1398545#c2 Quirin Scheitle provided a further misissuance report.


TRIAGING

Some Comodo staff saw these reports late on Friday 9th and began to discuss them over the weekend, but they were unable to confirm their accuracy. Indeed, the reports appeared to them to be erroneous, because the logs at their disposal showed that the relevant CAA checks had been performed but the RRsets were empty. Therefore, the only action taken at that point was to escalate the reports to the original developer of our CAA checking code to look at first thing Monday morning.


BACKGROUND

As you'd expect from the authors of RFC6844, we were an early adopter, deploying our initial CAA checking implementation 2.5 years ago. It executes `dig CAA +dnssec +sigchase +trusted-key=dnssec_trusted.keys` to perform the DNS queries. We chose this approach after concluding that, at that time, it was the least worst option available to us for checking DNSSEC signatures. We deployed a specific version of BIND (9.10.1-P2) because testing had shown that `dig` in the next release of BIND would crash when trying to do DNSSEC validation.
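
For illustration only, shelling out to `dig` in the way described above could be sketched in Go roughly as follows. This is not the code Comodo ran. The helper names `digArgs` and `runDig`, the position of the domain name on the command line, and the 10-second timeout are all assumptions; only the `dig` options themselves come from the text. The point of the sketch is that the caller gets the exit status back alongside the output, so a failed run can be told apart from a successful one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// digArgs builds the argument list for one CAA query. The options are the
// ones quoted in the report; appending the domain last is an assumption.
func digArgs(domain, keyFile string) []string {
	return []string{"CAA", "+dnssec", "+sigchase", "+trusted-key=" + keyFile, domain}
}

// runDig (hypothetical helper) runs dig and returns its combined output
// together with its exit status. An error is returned only when dig could
// not be started at all.
func runDig(ctx context.Context, domain, keyFile string) (string, int, error) {
	out, err := exec.CommandContext(ctx, "dig", digArgs(domain, keyFile)...).CombinedOutput()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// dig ran but reported failure; pass the status to the caller.
		return string(out), exitErr.ExitCode(), nil
	}
	if err != nil {
		return "", -1, err
	}
	return string(out), 0, nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	out, code, err := runDig(ctx, "example.com", "dnssec_trusted.keys")
	if err != nil {
		fmt.Println("could not run dig:", err)
		return
	}
	fmt.Printf("exit status %d, %d bytes of output\n", code, len(out))
}
```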


WHAT WENT WRONG

Our ops team upgraded the servers that our CAA checking code was running on. This included a very-long-awaited transition from a 32-bit to 64-bit OS. Rather than recompile 9.10.1-P2 for 64-bit, our ops engineers upgraded BIND to 9.10.5-P1.

Yesterday morning (Monday 11th), when investigating the Problem Reports, the original developer discovered that as a result of that BIND upgrade all of our calls to `dig` were returning the following response:


`Invalid option: +sigchase
Usage:  dig [@global-server] [domain] [q-type] [q-class] {q-opt}
            {global-d-opt} host [@local-server] {local-d-opt}
            [ host [@local-server] {local-d-opt} [...]]

Use "dig -h" (or "dig -h | more") for complete list of options`

Unfortunately, this `dig` response was being interpreted by our CAA checking code as a CAA response that contained: no "issue" property, no "issuewild" property, no unrecognized critical properties, etc.
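
To make the failure mode concrete, here is a minimal Go sketch of a parser that fails closed instead. It is not the code Comodo ran. It assumes a simplified output format in which `dig` prints a `;; ->>HEADER<<-` line containing the response status, followed by answer lines of the form `name TTL class CAA flags tag "value"`. The names `CAARecord`, `ErrNoUsableStatus` and `parseDigCAA` are hypothetical. An empty RRset is accepted only when the exit status is zero and the output positively identifies itself as a DNS response, so a usage message can never be mistaken for "no CAA records".

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// CAARecord is a hypothetical type holding one parsed CAA resource record.
type CAARecord struct {
	Flags uint8
	Tag   string
	Value string
}

// ErrNoUsableStatus means the output never reported NOERROR or NXDOMAIN,
// so it cannot be trusted as a DNS answer (empty or otherwise).
var ErrNoUsableStatus = errors.New("dig output has no NOERROR or NXDOMAIN status")

// parseDigCAA interprets the exit status and output of one dig run.
// Anything that does not look like a successful DNS response is an error.
func parseDigCAA(exitCode int, output string) ([]CAARecord, error) {
	if exitCode != 0 {
		return nil, fmt.Errorf("dig exited with status %d", exitCode)
	}
	sawStatus := false
	var records []CAARecord
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, ";;") {
			if strings.Contains(line, "status: NOERROR") || strings.Contains(line, "status: NXDOMAIN") {
				sawStatus = true
			}
			continue
		}
		fields := strings.Fields(line)
		// Assumed answer line shape: name TTL class CAA flags tag "value"
		if len(fields) < 7 || fields[3] != "CAA" {
			continue
		}
		flags, err := strconv.ParseUint(fields[4], 10, 8)
		if err != nil {
			return nil, fmt.Errorf("bad CAA flags in %q: %w", line, err)
		}
		value := strings.Trim(strings.Join(fields[6:], " "), `"`)
		records = append(records, CAARecord{
			Flags: uint8(flags),
			Tag:   strings.ToLower(fields[5]),
			Value: value,
		})
	}
	if !sawStatus {
		return nil, ErrNoUsableStatus
	}
	return records, nil
}

func main() {
	usage := "Invalid option: +sigchase\nUsage:  dig [@global-server] [domain] [q-type] [q-class] {q-opt}\n"
	if _, err := parseDigCAA(1, usage); err != nil {
		fmt.Println("refusing to treat output as a CAA response:", err)
	}
}
```

The exit-status check alone would have caught this incident if, as the `dig` manual page documents, usage errors exit with status 1. The status-line check is a second layer in case the exit status is lost or ignored.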

This problem had gone undetected due to a combination of reasons: the developer did not ask for BIND to be upgraded and so did not expect any behaviour to have changed; the ops engineers did not realize that upgrading BIND might cause a problem; there wasn't an automated test that would've detected this problem and raised an alarm; CAA RRsets are still fairly uncommon, so nobody noticed that we'd dropped from finding hardly any RRsets to finding zero RRsets; our validation staff only see the results of our CAA processing rather than the complete output from `dig`.


ACTION TAKEN TO ADDRESS THE PROBLEM

Upon discovery of the failing `dig` calls, we immediately downgraded to BIND 9.10.1-P2 and verified that our CAA checks were then working correctly. We also purged our local cache (of recent `dig` responses) to ensure that the misissuance vector was completely closed.


PROBLEM CERTIFICATES

The following certificates have all been revoked:
Reported by Hanno:
https://crt.sh/?id=207082245
Reported by Jonathan:
https://crt.sh/?id=207224651
Reported by Quirin:
https://crt.sh/?id=208456003
https://crt.sh/?id=208486480
https://crt.sh/?id=208486489
https://crt.sh/?id=208486485
https://crt.sh/?id=208486495

NEW CAA CHECKING IMPLEMENTATION

Our initial CAA checking implementation, while functional, was not designed with our current certificate issuance volumes in mind. Consequently, we had been working on a new, much more scalable CAA checking implementation, written in Go. We had expected to deploy this new implementation during Q2 2017, but work on this project was paused due to the uncertainties of CNAME processing that have now been resolved at IETF (see https://www.rfc-editor.org/errata/eid5065) and that will hopefully soon also be resolved at CABForum (see https://cabforum.org/pipermail/public/2017-August/011972.html).


DEPLOYING OUR NEW CAA CHECKING IMPLEMENTATION

Having fixed our `dig` calls we found that our system was struggling to process the queue of CAA checks quickly enough, and so we accelerated our plans to deploy our new CAA checking implementation. This morning (Tuesday 12th) we verified that our new implementation does a reasonable job when faced with the test cases at https://caatestsuite.com/, and we deployed it.


VERIFYING OUR NEW CAA CHECKING IMPLEMENTATION

We are taking immediate steps to engage the services of an external security consultant to independently assess our new CAA checking implementation and to work with us to ensure that it behaves correctly.


ACKNOWLEDGMENTS

We would like to express our thanks to Hanno, Jonathan and Quirin for reporting the problem to us, and to Andrew Ayer for providing https://caatestsuite.com/.


--
Rob Stradling
Senior Research & Development Scientist
COMODO - Creating Trust Online

_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy
