Re: GTS - OCSP serving issue 2020-04-09
On Sun, Apr 19, 2020 at 6:13 AM Nick Lamb wrote:

> It's possible that I'm confused somehow, but for me §9.16.3 of the BRs does not have numbered item 5, and neither this nor §9.6.1 define "contractual jeopardy" nor do they clear up why a subscriber would want to shut down their service and perhaps be driven into bankruptcy in deference to a mere technical error.

9.6.3.

> Is your position now that your earlier advice was quite wrong and should be disregarded?

That’s an extreme take from what I wrote, and an extremely bad one at that. You asked for more details, I pointed you to the BRs which provide you more details. They answer the “what” that you wanted more details on. CAs are required to have legally enforceable agreements with Subscribers that, in some circumstances, the Subscriber must immediately cease use of the private key. You can see me referencing that as an abuse vector in the parallel thread on revocation reasons.

In any event, this incident report has been so thoroughly hijacked as to be unsalvageable as a thread for the purpose of gathering more data. This is because it was unfortunately taken as a pedagogical opportunity, and the advice wasn’t necessarily relevant to the incident at hand (e.g. GTS does not OCSP staple nor offer Subscribers the means to), nor good at capturing the tradeoffs (there’s a reason stapling isn’t done). Luckily, the bug exists to continue discussion there.

___ dev-security-policy mailing list dev-security-policy@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-security-policy
Re: GTS - OCSP serving issue 2020-04-09
On 19/04/2020 11:13, Nick Lamb via dev-security-policy wrote:

> On Sat, 18 Apr 2020 22:57:03 -0400 Ryan Sleevi via dev-security-policy wrote:
>
> > The Baseline Requirements address this. See 9.16.3 (particularly item 5) and 9.6.1 (6). For better or worse, the situation is as Neil described and required for all CAs.
>
> It's possible that I'm confused somehow, but for me §9.16.3 of the BRs does not have numbered item 5, and neither this nor §9.6.1 define "contractual jeopardy" nor do they clear up why a subscriber would want to shut down their service and perhaps be driven into bankruptcy in deference to a mere technical error.

I suspect that this was a typo from Ryan, and he meant Section 9.6.3 (5), which states (regarding subscriber agreements):

5. Reporting and Revocation: An obligation and warranty to: (a) promptly request revocation of the Certificate, and cease using it and its associated Private Key, if there is any actual or suspected misuse or compromise of the Subscriber’s Private Key associated with the Public Key included in the Certificate, and (b) promptly request revocation of the Certificate, and cease using it, if any information in the Certificate is or becomes incorrect or inaccurate.

Clause 6 of the same section is also relevant (but only if the private key has been compromised):

6. Termination of Use of Certificate: An obligation and warranty to promptly cease all use of the Private Key corresponding to the Public Key included in the Certificate upon revocation of that Certificate for reasons of Key Compromise.

So, a CA is _required_ to have these terms in its Subscriber Agreements.

Regarding 9.6.1, you are right that my generic term (contractual jeopardy) is not defined, but it does establish that the Subscriber Agreement must be a legally enforceable document. If one party declines to adhere to its responsibilities under the agreement, the contract is placed in peril.
Now, if a CA is producing OCSP errors, or vague or confusing statements as to the status of one of its certificates, then absolutely a Subscriber would not shut down its services until the instruction from the CA is clearly expressed. My view is that a properly formed, digitally signed and dated, statement of revocation _does_ make the instruction clear.

Regards, Neil
Re: GTS - OCSP serving issue 2020-04-09
On 18/04/2020 23:39, Nick Lamb via dev-security-policy wrote:

> I'm sure the client does understand revoked, but it won't (and certainly shouldn't) _accept_ it, hence Ryan's choice of language. Clients also understand expired OCSP certificates, and they don't accept those either.

I'm a bit confused. I'm not sure I understand the point being made here. Expired OCSP responder certificates would be an example of an artifact produced by the CA in error if no newer OCSP responses were produced. The relying party software would not treat such a response as a valid statement from the CA, and would thus rely on the last statement which is still valid. But a revoked response as part of an OCSP staple should be understood (and the TLS session stopped) if the signature on the response is proper. I suspect that I'm simply unclear what the term "accept" means in this context.

> > Presumably failure to adhere to that agreement could place you in some contractual jeopardy?
>
> What does "contractual jeopardy" mean here? I guess a CA representative might chime in here to tell us if they've sued any subscribers for not treating OCSP responses as a legal notice that they must desist using a Private Key? My firm guess would be "No, this has never happened".

Again, I'm still not sure what point you are making here. Apologies for being a bit dim - but because a legal right or responsibility hasn't been enforced through court action doesn't necessarily negate the existence of such rights and responsibilities. Surely a CA's signature on an artifact (certificate, CRL, OCSP response) has got to mean _something_. I don't think an OCSP response validly issued and with a "revoked" certificate status can be taken to mean "This certificate with this serial number may, or may not be, revoked and should be treated as advisory until earlier OCSP responses have expired and this becomes a repeated signed artifact".
However, I think that I've probably misunderstood the point being made; if so, I apologise and would be happy to be corrected.

I readily admit that CAs make mistakes - sometimes certificates which should be revoked aren't (in a timely manner), and I'd bet that some CAs have revoked certificates which were not meant to be revoked. Yet I still hold that a signature indicating that a certificate has been revoked is binding on the CA, and (probably) on the Subscriber via the Subscriber Agreement, which all CAs are _required_ to have as legally enforceable instruments between certificate holders and the issuing CA.

I didn't mean to imply that "contractual jeopardy" is some hard and fast legal term: I just meant that if you don't adhere to a binding agreement, the non-adhering party could be in danger of acting unlawfully and potentially subject to damages (presumably if they can be qualified). Sometimes some Subscriber Agreements allow the certificate holder to display some sort of seal owned by the CA - again, failure to adhere to the Subscriber Agreement (i.e. by using the certificate post revocation when the holder knows that it has been revoked) would negate the right to continued display of the seal. I very much doubt that this has actually been litigated, but I'm also pretty certain that displaying trademarks without authorisation has been litigated.

Regards, Neil
Re: GTS - OCSP serving issue 2020-04-09
On Sat, 18 Apr 2020 22:57:03 -0400 Ryan Sleevi via dev-security-policy wrote:

> On Sat, Apr 18, 2020 at 6:39 PM Nick Lamb via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:
>
> > What does "contractual jeopardy" mean here?
>
> The Baseline Requirements address this. See 9.16.3 (particularly item 5) and 9.6.1 (6).
>
> For better or worse, the situation is as Neil described and required for all CAs.

It's possible that I'm confused somehow, but for me §9.16.3 of the BRs does not have numbered item 5, and neither this nor §9.6.1 define "contractual jeopardy" nor do they clear up why a subscriber would want to shut down their service and perhaps be driven into bankruptcy in deference to a mere technical error.

Is your position now that your earlier advice was quite wrong and should be disregarded?

Nick.
Re: GTS - OCSP serving issue 2020-04-09
On Sat, Apr 18, 2020 at 6:39 PM Nick Lamb via dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:

> What does "contractual jeopardy" mean here?

The Baseline Requirements address this. See 9.16.3 (particularly item 5) and 9.6.1 (6).

For better or worse, the situation is as Neil described and required for all CAs.
Re: GTS - OCSP serving issue 2020-04-09
On Fri, 17 Apr 2020 18:34:00 +0100 Neil Dunbar via dev-security-policy wrote:

> timestamp checking etc, etc]. Ryan's writeup calls out the revoked situation under the heading of 'make sure it is something the client will accept' - if the client understands OCSP responses at all, it needs to understand revoked, surely?

I'm sure the client does understand revoked, but it won't (and certainly shouldn't) _accept_ it, hence Ryan's choice of language. Clients also understand expired OCSP certificates, and they don't accept those either.

> Because it places you (a good actor) in compliance with your subscriber agreement? Just as an example, some text in a few commonly used CA Subscriber Agreements have subscriber obligations like "cease all use of the Certificate and its Private Key upon expiration or revocation of the Certificate" or "Subscriber shall promptly cease using a Certificate and its associated Private Key" (under the section for revocation). Presumably failure to adhere to that agreement could place you in some contractual jeopardy?

What does "contractual jeopardy" mean here? I guess a CA representative might chime in here to tell us if they've sued any subscribers for not treating OCSP responses as a legal notice that they must desist using a Private Key? My firm guess would be "No, this has never happened".

In fact do any CA representatives want to stand up and tell us they regard OCSP responses as legally binding declarations by their CA which are immune to ordinary mistakes?

Nick.
Re: GTS - OCSP serving issue 2020-04-09
On 17/04/2020 14:22, Nick Lamb via dev-security-policy wrote:

> GOOD means _at least_ the good CertStatus (also 0) in OCSP. We'll see why in a moment.

Fair enough. That's what I thought - so holding onto the last successful OCSP report you have, even if you get exception status codes thereafter, is a good way forward. I think that's reasonable. I'm just less sure that you should be treating a well formed 'revoked' response as something which can be ignored until the current 'good' OCSP response expires. [Note my carefully chosen weasel words like 'well formed', which also entails stuff like proper timestamp checking etc, etc]. Ryan's writeup calls out the revoked situation under the heading of 'make sure it is something the client will accept' - if the client understands OCSP responses at all, it needs to understand revoked, surely?

> But why? We are us, why would we want to announce that our certificate is revoked? What possible benefit could accrue to us from choosing to do this?

Because it places you (a good actor) in compliance with your subscriber agreement? Just as an example, some text in a few commonly used CA Subscriber Agreements have subscriber obligations like "cease all use of the Certificate and its Private Key upon expiration or revocation of the Certificate" or "Subscriber shall promptly cease using a Certificate and its associated Private Key" (under the section for revocation). Presumably failure to adhere to that agreement could place you in some contractual jeopardy?

So, following from your response, I think that, indeed - shutting down the site until a replacement key/cert is deployed would be the 'right' thing to do, rather than advertise a revoked response. The difference being that shutting down is (usually) a manual step, whereas stapling the most recent valid response from the CA (good or revoked) is probably an automated step.
Regards, Neil
Re: GTS - OCSP serving issue 2020-04-09
On Thu, 16 Apr 2020 13:56:34 +0100 Neil Dunbar via dev-security-policy wrote:

> On 16/04/2020 00:04, Nick Lamb via dev-security-policy wrote:
> For the avoidance of doubt (and my own poor brain) - does 'GOOD' here mean OCSP status code 'successful' (0) AND returning a 'good' status for the certificate, or does it just mean status code 'successful'? The GTS case here was returning OCSP exception status 'unauthorized' (6).

GOOD means _at least_ the good CertStatus (also 0) in OCSP. We'll see why in a moment. Ryan provides a considerably longer list of stupid things that might go wrong in item (2) from https://gist.github.com/sleevi/5efe9ef98961ecfb4da8 You should consider all of them reasons the answer shouldn't replace an existing GOOD answer you have.

> I would have thought that an OCSP-stapling implementation which got an OCSP status code 'successful' (0) with a 'revoked' status for the certificate would want to pass that on to the client, replacing any prior OCSP successful/status-good report, whether that prior report was still valid.

But why? We are us, why would we want to announce that our certificate is revoked? What possible benefit could accrue to us from choosing to do this? Remember we cannot choose the behaviour of an adversary. So if we choose to tell clients our certificate is revoked, but an adversary asserts their copy is still good, clients will continue to talk to the adversary which is almost certainly a worse outcome.

If your model of TLS still looks like early SSL, with implicit RSA authentication, then I can see that, if you squint, advertising your own revocation isn't completely stupid. Maybe the revocation means an adversary knows our private key, and so in continuing to talk to clients with this key we make things worse, we should admit it's revoked instead. I'd argue that if this was a scenario you care about the right thing is for the server to shut down instead, not staple revoked responses.
But anyway sites which actually care about security should never use implicit authentication (and it doesn't exist in TLS 1.3). As a result there is zero risk from pressing on, you are definitely you, the only question is whether you can continue to convince clients that this is so, and stapling a non-GOOD answer will never help you do that so it's never the correct thing to do.

Nick.
Re: GTS - OCSP serving issue 2020-04-09
On 16/04/2020 14:49, Kurt Roeckx via dev-security-policy wrote:

> On 2020-04-16 14:56, Neil Dunbar wrote:
>
> > I would have thought that an OCSP-stapling implementation which got an OCSP status code 'successful' (0) with a 'revoked' status for the certificate would want to pass that on to the client, replacing any prior OCSP successful/status-good report, whether that prior report was still valid.
>
> As owner of the certificate, I think you actually don't want to do that, because things will stop working. If it's revoked you want to get a new certificate, and as long as you don't have the new one, you want to use the old certificate and staple the good OCSP response.

Really? Continue to use a certificate in the (more recent) knowledge that the issuing CA has disavowed it? I know that will work from the perspective of the TLS protocol, but it might be the sort of thing which would run afoul of the owner's subscriber agreement. So, if the CA operated on a purely customer-enforced OCSP-stapling approach (i.e. didn't publish the OCSP URI in the end certificate), that would mean the relying party would have no reasonable way to validate whether the certificate even _could_ be trusted.

I mean - I see what you're saying: you have a website which you want to keep working until you replace your certificate and/or private key. But if I had signed knowledge from the issuing CA that (for instance) my private key was compromised, I don't think it would be terribly ethical to continue its use; depending on your subscriber agreement it might not even be lawful. It seems like you are materially misrepresenting the state of your certificate to the detriment of your relying parties.

Regards, Neil
RE: GTS - OCSP serving issue 2020-04-09
> As owner of the certificate, I think you actually don't want to do that, because things will stop working. If it's revoked you want to get a new certificate, and as long as you don't have the new one, you want to use the old certificate and staple the good OCSP response.

That depends on what you're optimizing for. While your solution definitely helps with Availability, it deprioritizes Confidentiality and Integrity. For example, if your private key was compromised and your certificate subsequently revoked, your service would continue to be accessible (good availability) but all communications could be MITMed (bad for Confidentiality and Integrity).

Thanks, Corey
Re: GTS - OCSP serving issue 2020-04-09
On 2020-04-16 14:56, Neil Dunbar wrote:

> I would have thought that an OCSP-stapling implementation which got an OCSP status code 'successful' (0) with a 'revoked' status for the certificate would want to pass that on to the client, replacing any prior OCSP successful/status-good report, whether that prior report was still valid.

As owner of the certificate, I think you actually don't want to do that, because things will stop working. If it's revoked you want to get a new certificate, and as long as you don't have the new one, you want to use the old certificate and staple the good OCSP response.

Kurt
Re: GTS - OCSP serving issue 2020-04-09
On 16/04/2020 00:04, Nick Lamb via dev-security-policy wrote:

> Specifically: You should cache your stapled GOOD answers in durable storage if practical, and when periodically refreshing you should report non-GOOD answers to the operator (e.g. logging them as an ERROR condition) but always continue to present clients with the last GOOD answer until it actually expires even if you receive newer non-GOOD OCSP responses.

For the avoidance of doubt (and my own poor brain) - does 'GOOD' here mean OCSP status code 'successful' (0) AND returning a 'good' status for the certificate, or does it just mean status code 'successful'? The GTS case here was returning OCSP exception status 'unauthorized' (6).

I would have thought that an OCSP-stapling implementation which got an OCSP status code 'successful' (0) with a 'revoked' status for the certificate would want to pass that on to the client, replacing any prior OCSP successful/status-good report, whether that prior report was still valid.

But I'm with you on the implementation retaining the last successful OCSP report until it expires (I'd go further: if I got a successful/revoked response, followed by a successful/good response later on, I'd be flagging that to the CA as a serious problem, and retaining the successful/revoked one until _it_ expires).

Cheers, Neil
Re: GTS - OCSP serving issue 2020-04-09
On Tue, 14 Apr 2020 13:13:59 -0700 Andy Warner via dev-security-policy wrote:

> From 2020-04-08 16:25 UTC to 2020-04-09 05:40 UTC, Google Trust Services' EJBCA based CAs (GIAG4, GIAG4ECC, GTSY1-4) served empty OCSP data which led the OCSP responders to return unauthorized.

No new lessons for CAs here in general, but I think this incident is worth highlighting as an example to OCSP Stapling implementations. It is desirable (not technically required in the standard, but necessary to a robust implementation) that your software should not be adversely affected by an outage like this. Mistakes will happen, and good software can and thus should allow for them without introducing cascading failure.

Specifically: You should cache your stapled GOOD answers in durable storage if practical, and when periodically refreshing you should report non-GOOD answers to the operator (e.g. logging them as an ERROR condition) but always continue to present clients with the last GOOD answer until it actually expires even if you receive newer non-GOOD OCSP responses.

Nick.
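The refresh policy Nick describes can be sketched as a small shell function. This is only an illustration under stated assumptions: `decide_staple`, its arguments and the status strings are hypothetical names, timestamps are plain epoch seconds, and fetching or parsing a real OCSP response is left out entirely. It implements the policy exactly as Nick states it; whether a well-formed 'revoked' answer deserves different treatment is what the rest of the thread argues about.

```shell
#!/usr/bin/env bash
# Sketch of the "keep the last GOOD staple" policy. All names are hypothetical.
#
# decide_staple CACHED_NEXT_UPDATE NEW_STATUS NOW
#   CACHED_NEXT_UPDATE  expiry (epoch seconds) of the cached GOOD response,
#                       or 0 if nothing is cached
#   NEW_STATUS          outcome of the latest refresh: "good", or anything
#                       else (e.g. "unauthorized", "revoked", "timeout")
#   NOW                 current time in epoch seconds
# Prints "new", "cached" or "none" to say which response to staple.
decide_staple() {
    local cached_next_update=$1 new_status=$2 now=$3

    if [ "$new_status" = "good" ]; then
        echo "new"            # a fresh GOOD answer replaces the cache
        return 0
    fi

    # Report the non-GOOD answer to the operator instead of serving it.
    echo "ERROR: OCSP refresh returned '$new_status'" >&2

    if [ "$cached_next_update" -gt "$now" ]; then
        echo "cached"         # the last GOOD answer has not expired yet
    else
        echo "none"           # nothing valid is left to staple
    fi
}

# An outage like the GTS one: the refresh fails while the cache is still valid.
decide_staple 2000 unauthorized 1000   # prints "cached", logs the ERROR to stderr
```

Keeping the decision in one pure function, separate from the fetch, makes it straightforward to swap in a different rule for 'revoked' if an operator sides with the other view in this thread.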
GTS - OCSP serving issue 2020-04-09
m.d.s.p community,

Google Trust Services just filed https://bugzilla.mozilla.org/show_bug.cgi?id=1630040 which contains the same information as the report that follows.

From 2020-04-08 16:25 UTC to 2020-04-09 05:40 UTC, Google Trust Services' EJBCA based CAs (GIAG4, GIAG4ECC, GTSY1-4) served empty OCSP data which led the OCSP responders to return unauthorized. These CAs exist for issuance of custom certificate profiles and certificates for test sites for inactive roots. Our primary CAs (GTS CA 1O1 and GTS CA 1D2) were unaffected. The problem self-corrected, but we have added safeguards to prevent recurrence.

1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

Monitoring detected the issue on 2020-04-08 at 16:35 UTC. The root cause was identified within hours. The issue was automatically remediated in the next generation and push to CDN cycle while debugging and fixes were ongoing.

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2020-04-08, 11:29 UTC - Scheduled system update begins
2020-04-08, 14:00 UTC - Incorrect OCSP archives are generated
2020-04-08, 15:03 UTC - Scheduled system update concludes
2020-04-08, 16:20 UTC - Incorrect OCSP responses pushed to CDN
2020-04-08, 16:35 UTC - First production monitoring alert fires
2020-04-08, 22:00 UTC - Correct OCSP archives are generated automatically
2020-04-09, 00:20 UTC - Correct OCSP responses pushed to CDN
2020-04-09, 05:40 UTC - Monitoring confirms all probes are passing

3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

The affected CAs are only used for infrequent and manual custom certificate issuance. No certificate issuance aside from a manually issued post update test certificate to validate the upgrade to resolve the issue took place during this period. The issue in question also was specific to refreshing OCSP responses and not certificate issuance.

4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

No certificate issuance aside from a manually issued post update test certificate to validate the upgrade to resolve the issue took place during this period. The test certificate was a valid and fully compliant issuance.

5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

No certificate issuance aside from the manually issued post update test certificate to validate the upgrade.

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

Our creation of OCSP responses and packaging them for serving is designed to fail if any sub-command fails using set -e. However, if the function call is part of an AND or OR sequence (i.e. using '&&' or '||' control operators), the set -e is suppressed inside the function.
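The set -e behaviour described here is easy to reproduce. The sketch below is not the GTS script: `fetch_responses` and `package` are hypothetical stand-ins for the fetch tool and for the function that was meant to stop when the fetch failed. Each case runs in a fresh `bash -c` process so that the two cases cannot influence each other.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: fetch_responses plays the fetch tool failing,
# package plays the function that is meant to stop when the fetch fails.
defs='
fetch_responses() { return 1; }
package() {
    fetch_responses
    echo "packaged an empty archive"
}
'

# Case 1: package is called on its own. set -e is in effect inside it,
# so the shell exits at the failed fetch and nothing is printed.
plain=$(bash -c "set -e; $defs package" || true)

# Case 2: package is the left-hand side of &&. set -e is ignored for
# everything inside the function, so execution carries on past the failure.
chained=$(bash -c "set -e; $defs package && echo published" || true)

echo "plain:   [$plain]"
echo "chained: [$chained]"
```

With bash, `plain` comes back empty while `chained` contains both "packaged an empty archive" and "published": the error handling wrapped around the call is what switches the safety net off inside the function.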
The tool we use to fetch OCSP responses from EJBCA correctly returned a non-zero exit code (due to no OCSP responses being generated because EJBCA was not running), but because it was called inside a function with its own error handling (using && syntax), the script continued without handling the error properly and wrongly used empty tar.gz files with no responses in them. The bug had existed for multiple years as a potential race condition and we did not encounter it previously. Quality tests are executed before publication to the CDN, however, those tests accommodate empty responses as a valid condition because it is something that can and does happen. This condition did not repeat on the following update of the OCSP responses. As a result the next update resolved the issue. Our monitoring caught the issue enabling expedient root cause analysis and resolution.

7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

No certificate issuance aside from a valid manually issued post update test certificate to validate the upgrade took place during this period. The logic error that led to incorrect OCSP responses being served has been corrected, is checked in and in production. Additionally, checks have b