Re: Post-Mortem of the djangoproject.com outage earlier today.

2016-05-04 Thread Jacob Kaplan-Moss
Thanks for the info, and for the quick fix!

Jacob


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAK8PqJEVqPpEXXzOAaGJEaRYbua%2BJQ4YTrjKk%3DWtoGb7QJUxRw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Post-Mortem of the djangoproject.com outage earlier today.

2016-05-04 Thread Florian Apolloner
Hi,

earlier today (roughly 9:30 UTC) I deployed the wrong (local) branch from our 
ansible repository to dp.com. This branch included our old (now expired) 
gandi SSL certificate. Once I realized what I had done (which took a few 
minutes since I was preparing to push a commit) I switched to the proper 
branch and redeployed (+ notified ops and IRC). This redeployment restored 
our new and shiny letsencrypt.org key (which we have been running 
successfully for the last few weeks) but left the old gandi crt (the cert, 
not the key!) in place, since letsencrypt generates the cert automatically. 
At this point nginx would no longer start:

SSL_CTX_use_PrivateKey_file("/etc/nginx/ssl/djangoproject.com.key") failed 
(SSL: error:0B080074:x509 certificate routines:X509_check_private_key:key 
values mismatch)

This is kind of an obvious error, as the private key and the cert no longer 
match, which I verified with "openssl x509 -in djangoproject.com.crt -text 
-noout". The solution to this problem is simple: regenerate the cert and be 
done with it. Sadly it was not that simple, since nginx refused to start :D 
Setting ssl to off in nginx.conf had no effect; apparently nginx is really 
picky about unusable ssl certs. Knowing that we configure nginx only via 
ansible, I quickly reduced the config to the default server without ssl, 
started it and regenerated the cert. An ansible run then restored the 
original config and allowed the server to start again. I executed it twice, 
just to ensure that the second run would not yield any changes in files 
(finally done at 9:56 UTC).
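
As an illustration only (this is not what was run during the incident), the 
same key/cert mismatch can be detected before nginx is restarted. The sketch 
below is a minimal Python example; the function name is hypothetical. It 
relies on ssl.SSLContext.load_cert_chain, which raises ssl.SSLError when the 
private key does not match the certificate.

```python
import ssl


def cert_and_key_load_together(cert_path: str, key_path: str) -> bool:
    """Return True if OpenSSL accepts cert_path and key_path as a pair.

    False means the pair was rejected: either the key does not match the
    certificate, or one of the files is not valid PEM. A missing file
    raises FileNotFoundError instead of returning False. The sketch
    assumes an unencrypted private key.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except ssl.SSLError:
        return False
    return True
```

A deployment could call this with the key path from the error message above 
and the matching .crt file (its directory is an assumption here) and refuse 
to reload nginx when the result is False.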

Going forward I think we will do the following:
 * Monitor SSL certs more closely; it shouldn't take me minutes to notice 
that the cert is broken
 * See if we can ensure that we push from the correct ansible branch
 * Find a config that allows nginx to start with invalid certs if possible 
(at least the http listeners), so we can easily reissue certs.
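
The first item could be approached along these lines. This is a minimal 
Python sketch under stated assumptions, not the monitoring that was actually 
set up: both function names are hypothetical, and how often to run the check 
and when to alert are left open.

```python
import socket
import ssl


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the notAfter string of the certificate served by host.

    The default context verifies the certificate, so an expired or
    otherwise invalid cert raises ssl.SSLCertVerificationError here,
    which a monitor should treat as an alert in its own right.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between now (seconds since the epoch) and the notAfter time."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0
```

A periodic job could pass the result of fetch_not_after for the site, 
together with time.time(), to days_until_expiry and alert when the value 
drops below a chosen threshold or when the handshake itself fails.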

Sorry for the outage,
Florian

P.S.: On an unrelated note: Google should display search results in the 
correct language now :D
