bug#46961: Nginx and certbot cervices don't play well togther

2024-01-31 Thread Clément Lassieur
On Wed, Jan 31 2024, Carlo Zancanaro wrote:

>>> +  (< attempt 12)) ; 12 * 10 seconds = 2 minutes
>>  ^--
>> This comment is not true because certbot takes time to execute (around 15s
>> on my vm).  I don't think there is a need to be that precise.
>
> I haven't extracted/named the max-attempts value, but I have removed the
> comments that imply that the time frame is bounded.

Ok

>> Also could you update the example in the docs?
>
> I have removed the %certbot-deploy-hook in the example in the manual.
>
>> ... However, we could add a nginx-service-type and a
>> dhcp-client-service-type so that people have an idea of what the minimal
>> config is, maybe like I did in my first review:
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=46961#23.
>
> I have not added this. I understand the desire, but I'm wary of providing an
> example that's "too involved". The current example demonstrates a minimal
> config of certbot itself. I think you are looking to include an example of a
> minimal system that hosts a website using certbot provided certificates. I
> don't know where an example like that belongs, but I'm not yet convinced it
> belongs in the certbot service documentation.

Sounds good

Pushed, thank you!
Clément





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-31 Thread Carlo Zancanaro

On Wed, Jan 31 2024, Clément Lassieur wrote:

Removing guix-devel.


I've also removed Brice.


On Tue, Jan 30 2024, Carlo Zancanaro wrote:
 (format #t "Acquiring or renewing 
 certificate: ~a~%" name)


Here we could add ‘(force-output)’, because otherwise those logs 
arrive

after the certbot logs, and it's hard to understand anything.


Done.

+  ;; If we have a connection error, 
then bail early
+  ;; with exit code 2. We don't expect 
this to
+  ;; resolve within the timespan of 
this script.


Could we have a (log + force-output) here too?  (I imagine 
within a

‘begin’)


Done.

+  ;; If we have any other type of 
error, then continue
+  ;; but exit with a failing status 
code in the end.


and here?


Done.

And maybe a log also in case the command succeeds.  (So that 
would mean

to replace ‘unless’ with ‘if’).


Done.

+  (< attempt 12)) ; 12 * 10 seconds = 
2 minutes

 ^--
This comment is not true because certbot takes time to execute 
(around 15s on my vm).  I don't think there is a need to be that 
precise.


I haven't extracted/named the max-attempts value, but I have 
removed the comments that imply that the time frame is bounded.



Also could you update the example in the docs?


I have removed the %certbot-deploy-hook in the example in the 
manual.


... However, we could add a nginx-service-type and a 
dhcp-client-service-type so that people have an idea of what the 
minimal config is, maybe like I did in my first review: 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=46961#23.


I have not added this. I understand the desire, but I'm wary of 
providing an example that's "too involved". The current example 
demonstrates a minimal config of certbot itself. I think you are 
looking to include an example of a minimal system that hosts a 
website using certbot provided certificates. I don't know where an 
example like that belongs, but I'm not yet convinced it belongs in 
the certbot service documentation.


Carlo





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-30 Thread Clément Lassieur
Removing guix-devel.

On Tue, Jan 30 2024, Carlo Zancanaro wrote:

> +(define (file-contains? file string)
> +  (string-contains (call-with-input-file file
> + get-string-all)
> +   string))
> +
> +(define (connection-error?)
> +  (file-contains? "/var/log/letsencrypt/letsencrypt.log"
> +  "Failed to establish a new connection"))
> +
> +(let ((script-code 0))
>(for-each
> (match-lambda
>   ((name . command)
>(begin
>  (format #t "Acquiring or renewing certificate: ~a~%" 
> name)

Here we could add ‘(force-output)’, because otherwise those logs arrive
after the certbot logs, and it's hard to understand anything.

> -(set! code (or (apply system* command) code)
> -   '#$commands) code)))
> +(unless (zero? (status:exit-val (apply system* command)))
> +  ;; Certbot errors are always exit code 1, but we'd like
> +  ;; to separate connection errors from other error 
> types.
> +  (if (connection-error?)
> +  ;; If we have a connection error, then bail early
> +  ;; with exit code 2. We don't expect this to
> +  ;; resolve within the timespan of this script.

Could we have a (log + force-output) here too?  (I imagine within a
‘begin’)
> +  (exit 2)
> +  ;; If we have any other type of error, then 
> continue
> +  ;; but exit with a failing status code in the end.

and here?
> +  (set! script-code 1))

And maybe a log also in case the command succeeds.  (So that would mean
to replace ‘unless’ with ‘if’).

> +   '#$commands)
> +  (exit script-code
>  
> +   (let loop ((attempt 0))
> + (let ((code (status:exit-val
> +  (system* #$(certbot-command config)
> +   (cond
> +((and (= code 2)  ; Exit code 2 means connection 
> error
> +  (< attempt 12)) ; 12 * 10 seconds = 2 minutes
 ^--
This comment is not true because certbot takes time to execute (around
15s on my vm).  I don't think there is a need to be that precise.  Maybe
you can just add in in the let form, as in (let ((code ...)
 (max-attempts 12)).

> + (sleep 10)
> + (loop (1+ attempt)))
> +((zero? code)
> + ;; Success!
> + #t)
> +(else
> + ;; Failure.
> + #f))

Also could you update the example in the docs?

>From the doc:
>> @defvar certbot-service-type
>> A service type for the @code{certbot} Let's Encrypt client.  Its value
>> must be a @code{certbot-configuration} record as in this example:
>> 
>> @lisp
>> (define %certbot-deploy-hook
>>   (program-file "certbot-deploy-hook.scm"
>> (with-imported-modules '((gnu services herd))
>>   #~(begin
>>   (use-modules (gnu services herd))
>>   (with-shepherd-action 'nginx ('reload) result result)
   ^
This part isn't useful anymore.  However, we could add a
nginx-service-type and a dhcp-client-service-type so that people have an
idea of what the minimal config is, maybe like I did in my first review:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=46961#23.

>> (service certbot-service-type
>>  (certbot-configuration
>>   (email "foo@@example.net")
>>   (certificates
>>(list
>> (certificate-configuration
>>  (domains '("example.net" "www.example.net"))
>>  (deploy-hook %certbot-deploy-hook))
>> (certificate-configuration
>>  (domains '("bar.example.net")))
>> @end lisp

We are almost there, thanks!
Clément





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-30 Thread Clément Lassieur
I removed guix-devel, not sure we need to spam it.

On Tue, Jan 30 2024, Carlo Zancanaro wrote:

> +(define %default-deploy-hook
> +  (program-file
> +   "reload-nginx.scm"
> +   (with-imported-modules '((gnu services herd))
> + #~(begin
> + (use-modules (gnu services herd))
> + (with-shepherd-action 'nginx ('reload) result result)
> +
>  (define-record-type* 
>certificate-configuration make-certificate-configuration
>certificate-configuration?
> @@ -65,7 +74,7 @@ (define-record-type* 
>(cleanup-hookcertificate-cleanup-hook
> (default #f))
>(deploy-hook certificate-configuration-deploy-hook
> -   (default #f))
> +   (default %default-deploy-hook))
>(start-self-signed?  certificate-configuration-start-self-signed?
> (default #t)))

I'd reload within ‘certbot-deploy-hook’, between ‘rename-file’ and “(if
deploy-hook-script” so that people don't get surprised, when they use a
deploy-hook for unrelated reasons, that the nginx doesn't reload
anymore.

Plus, reloading nginx is harmless.





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-29 Thread Clément Lassieur
On Tue, Jan 30 2024, Carlo Zancanaro wrote:

>>> + ;; Due to the way certbot runs, we need to
>>> + ;; create the self-signed certificates in the
>>> + ;; archive folder and symlink them into the live
>>> + ;; folder. This mimics what certbot does well
>>> + ;; enough to make acquiring new certificates
>>> + ;; work.
>>
>> In another mail you say it doesn't work as well as you thought it did?
>> What doesn't work?
>
> This comment doesn't describe the code any more. In my first attempt I
> was trying to generate certificates in /etc/letsencrypt/live/ and get
> certbot to write over them when it ran. Unfortunately, it refused to do
> so. I then tried writing to /etc/letsencrypt/archive/ and symlinking
> into /etc/letsencrypt/live/ (which is what this comment describes), but
> that also failed. Certbot refuses to write over any existing files when
> fetching a certificate.

Oh I read the comment too quickly, I thought it was describing the
/etc/certs moving.  I suppose you will update it so to reflect the
actual state?

What you did (using /etc/certs, and symlinking stuff in
/etc/letsencrypt) is a good idea I think, and it's excellent that it's
backward compatible!

> It looks like other acme clients might be happier to overwrite existing
> files, but changing away from certbot seemed like more work than adding
> a deploy hook to do what we need.

Indeed!

> I'll follow up with a v2 of this patch when I get a chance.

Thanks!

> Carlo





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-29 Thread Carlo Zancanaro
Hi Clément,

Thanks for taking the time to review my change. I've responded inline
below.

On Mon, Jan 29 2024, Clément Lassieur wrote:
> This is great, thank you!  I tested it, it worked.  Could you please
> just make sure lines fit within 80 columns?

Yep, no worries.

> Would it make sense now to run ‘update-certificates’ at end of the
> activation stuff?

We can't run it during activation, because nginx won't have started yet.
However, I am planning a follow-up to add a one-shot service to run
certbot after nginx starts. I'll see if I can add it to this series, but
if I run into any issues I'll leave it for later.

> And would it make sense to reload nginx after ‘update-certificates’ is
> run?

This would be a sensible default. There is an example in the manual of
configuring certbot to reload nginx, so this should be straightforward
to add.

> gnu/services/certbot.scm:203:26: warning: "subjectAltName=~{DNS:~a~^,~}": 
> unsupported format option ~{, use (ice-9 format) instead

Ha! I import (ice-9 format), but within the gexp (and then I don't use
it, whoops!). Must be a leftover from a previous iteration. I'll fix
this up.

>> + ;; Due to the way certbot runs, we need to
>> + ;; create the self-signed certificates in the
>> + ;; archive folder and symlink them into the live
>> + ;; folder. This mimics what certbot does well
>> + ;; enough to make acquiring new certificates
>> + ;; work.
>
> In another mail you say it doesn't work as well as you thought it did?
> What doesn't work?

This comment doesn't describe the code any more. In my first attempt I
was trying to generate certificates in /etc/letsencrypt/live/ and get
certbot to write over them when it ran. Unfortunately, it refused to do
so. I then tried writing to /etc/letsencrypt/archive/ and symlinking
into /etc/letsencrypt/live/ (which is what this comment describes), but
that also failed. Certbot refuses to write over any existing files when
fetching a certificate.

It looks like other acme clients might be happier to overwrite existing
files, but changing away from certbot seemed like more work than adding
a deploy hook to do what we need.

I'll follow up with a v2 of this patch when I get a chance.

Carlo





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-29 Thread Clément Lassieur
Also, I forgot, I think it would be great to have somewhere in the doc
an example of minimal config.scm that works.  I know we can't do proper
testing because we depend of certbot service but that would make it
easier for a lot of people to test it.  Maybe such example is already in
the docs and I haven't seen it though.

Here is the one I used:

--8<---cut here---start->8---
(use-modules (gnu) (gnu tests))
(use-package-modules web)
(use-service-modules certbot networking web)

(operating-system
  (inherit %simple-os)
  (services
   (cons*
(service dhcp-client-service-type)
(service nginx-service-type
 (nginx-configuration
  (server-blocks
   (list
(nginx-server-configuration
 (listen '("443 ssl"))
 (server-name '("test.lassieur.org"))
 (ssl-certificate
  "/etc/certs/test.lassieur.org/fullchain.pem")
 (ssl-certificate-key
  "/etc/certs/test.lassieur.org/privkey.pem"))
(service certbot-service-type
 (certbot-configuration
  (certificates
   (list
(certificate-configuration
 (domains '("test.lassieur.org")))
%base-services)))
--8<---cut here---end--->8---





bug#46961: Nginx and certbot cervices don't play well togther

2024-01-29 Thread Clément Lassieur
Hi Carlo,

On Wed, Jan 24 2024, Carlo Zancanaro wrote:

> * gnu/services/certbot.scm (): Add
> start-self-signed? field.
> (generate-certificate-gexp): New procedure.
> (certbot-activation): Generate self-signed certificates when
> start-self-signed? is #t.
> * doc/guix.texi (Certificate services): Document start-self-signed?.
> ---
>  doc/guix.texi|  6 +
>  gnu/services/certbot.scm | 56 +---
>  2 files changed, 59 insertions(+), 3 deletions(-)

This is great, thank you!  I tested it, it worked.  Could you please
just make sure lines fit within 80 columns?  And there is a warning
during compilation, pasted below.

Would it make sense now to run ‘update-certificates’ at end of the
activation stuff?

And would it make sense to reload nginx after ‘update-certificates’ is
run?

Clément

> diff --git a/doc/guix.texi b/doc/guix.texi
> index 2d43ab9a65..15b256d0a3 100644
> --- a/doc/guix.texi
> +++ b/doc/guix.texi
> @@ -32664,6 +32664,12 @@ Certificate Services
>  contain a space-delimited list of renewed certificate domains (for
>  example, @samp{"example.com www.example.com"}.
>  
> +@item @code{start-self-signed?} (default: @code{#t})
> +Whether to generate an initial self-signed certificate during system
> +activation.  This option is particularly useful to allow @code{nginx} to
> +start before @code{certbot} has run, because @code{certbot} relies on
> +@code{nginx} running to perform HTTP challenges.
> +
>  @end table
>  @end deftp
>  
> diff --git a/gnu/services/certbot.scm b/gnu/services/certbot.scm
> index 58e709f8a4..bb321a1b50 100644
> --- a/gnu/services/certbot.scm
> +++ b/gnu/services/certbot.scm
> @@ -64,7 +64,9 @@ (define-record-type* 
>(cleanup-hookcertificate-cleanup-hook
> (default #f))
>(deploy-hook certificate-configuration-deploy-hook
> -   (default #f)))
> +   (default #f))
> +  (start-self-signed?  certificate-configuration-start-self-signed?
> +   (default #t)))
>  
>  (define-record-type* 
>certbot-configuration make-certbot-configuration
> @@ -91,7 +93,10 @@ (define-record-type* 
>  (define (certbot-deploy-hook name deploy-hook-script)
>"Returns a gexp which creates symlinks for privkey.pem and fullchain.pem
>  from /etc/certs/NAME to /etc/letsenctypt/live/NAME.  If DEPLOY-HOOK-SCRIPT is
> -not #f then it is run after the symlinks have been created."
> +not #f then it is run after the symlinks have been created.  This wrapping is
> +necessary for certificates with start-self-signed? set to #t, as it will
> +overwrite the initial self-signed certificates upon the first successful
> +deploy."
>(program-file
> (string-append name "-deploy-hook")
> (with-imported-modules '((guix build utils))
> @@ -108,7 +113,8 @@ (define (certbot-deploy-hook name deploy-hook-script)
>   "/etc/letsencrypt/live/" name "/fullchain.pem")
>#$(string-append "/etc/certs/" name "/fullchain.pem.new"))
>  
> - ;; Rename over the top of the old ones, if there are any.
> + ;; Rename over the top of the old ones, just in case they were the
> + ;; original self-signed certificates.
>   (rename-file #$(string-append "/etc/certs/" name "/privkey.pem.new")
>#$(string-append "/etc/certs/" name "/privkey.pem"))
>   (rename-file #$(string-append "/etc/certs/" name 
> "/fullchain.pem.new")
> @@ -182,6 +188,44 @@ (define (certbot-renewal-jobs config)
> #~(job '(next-minute-from (next-hour '(0 12)) (list (random 60)))
>#$(certbot-command config
>  
> +(define (generate-certificate-gexp certbot-cert-directory rsa-key-size)
> +  (match-lambda
> +(($  name (primary-domain other-domains ...) 
> challenge
> +csr authentication-hook
> +cleanup-hook deploy-hook)
> + (let (;; Arbitrary default subject, with just the
> +   ;; right domain filled in. These values don't
> +   ;; have any real significance.
> +   (subject (string-append "/C=US/ST=Oregon/L=Portland/O=Company 
> Name/OU=Org/CN="
> +   primary-domain))
> +   (alt-names (if (null? other-domains)
> +  #f
> +  (format #f "subjectAltName=~{DNS:~a~^,~}" 
> other-domains)))

gnu/services/certbot.scm:203:26: warning: "subjectAltName=~{DNS:~a~^,~}": 
unsupported format option ~{, use (ice-9 format) instead

> +   (directory (string-append "/etc/certs/" (or name 
> primary-domain
> +   #~(begin
> +   (use-modules (ice-9 format))
> +   (when (not (file-exists? #$directory))
> + ;; Due to the way certbot runs, we need to
> + ;; create the self-signed certificates in the
> + ;; archive folder and symlink them into the live
> +   

bug#46961: Nginx and certbot cervices don't play well togther

2021-03-06 Thread Brice Waegeneire


Hello Guix,

After an suggestion from Tobias to give a try at forcing HTTPS for
Guix's websites on berlin, I had a go at it but it was more complex that
what I was expecting.  Looking deeper at nginx and certbot services it
appear both services don't play that well together, requering a inital
dance when deploying a new HTTPS virtual server. As explained in #36389¹
you need to:
« - run system configuration with just the certbot service
- use certbot to generate your initial certificates
- reconfigure with additional nginx server configuration, pointing to
  the SSL certificates created by certbot »

Indeed, with an operating-system continaing the following services it's
impossible to sart Nginx and Certbot at once as one would expect:

--8<---cut here---start->8---
(service nginx-service-type)
(service php-fpm-service-type)

(service certbot-service-type
 (certbot-configuration
   (certificates
 (list (certificate-configuration
 (domains '("test.sama.re"))
 (deploy-hook
   (program-file
 "nginx-deploy-hook"
 #~(let ((pid (call-with-input-file "/var/run/nginx.pid"
read)))
 (kill pid SIGHUP)

(cat-avatar-generator-service
  #:configuration
  (nginx-server-configuration
(listen '("443 ssl"))
(server-name '("test.sama.re"))
(ssl-certificate
  "/etc/letsencrypt/live/test.sama.re/fullchain.pem")
(ssl-certificate-key
  "/etc/letsencrypt/live/test.sama.re/privkey.pem")))
--8<---cut here---end--->8---

Here is the error from reconfiguring the system:

--8<---cut here---start->8---
# guix system reconfigure /etc/config.sm
[...]
building /gnu/store/55cq2ja4i5489s55viv9fh50032d1ziy-switch-to-system.scm.drv...
making '/gnu/store/p2rkcmrnpls5py7x2iappf2qcbxwlb95-system' the current 
system...
setting up setuid programs in '/run/setuid-programs'...
populating /etc from /gnu/store/k2kb8hsq3q0dhhad4a9pjh4kx32mn4g0-etc...
/var/lib/certbot/renew-certificates may need to be run
creating nginx log directory '/var/log/nginx'
creating nginx run directory '/var/run/nginx'
creating nginx temp directories 
'/var/run/nginx/{client_body,proxy,fastcgi,uwsgi,scgi}_temp'
nginx: [emerg] cannot load certificate 
"/etc/letsencrypt/live/test.sama.re/fullchain.pem": BIO_new_file() failed (SSL: 
error:02001002:system library:fopen:No such file or 
directory:fopen('/etc/letsencrypt/live/test.sama.re/fullchain.pem','r') 
error:2006D080:BIO routines:BIO_new_file:no such file)
nginx: configuration file 
/gnu/store/chpw631djay2w39x7agg8zz53iayy4zy-nginx.conf test failed
`/gnu/store/jyxc290q7jyhhpalski0h13h8z9zvnka-openssh-authorized-keys/bricewge' 
-> `/etc/ssh/authorized_keys.d/bricewge'
The following derivation will be built:
   /gnu/store/qlzbrmpx6wnhzqcpqi9yrbb6xva82kvr-install-bootloader.scm.drv

building 
/gnu/store/qlzbrmpx6wnhzqcpqi9yrbb6xva82kvr-install-bootloader.scm.drv...
guix system: bootloader successfully installed on '/dev/sda'
The following derivation will be built:
   /gnu/store/ikak44inrnz3b3dx8j8csdakgqafbijn-upgrade-shepherd-services.scm.drv

building 
/gnu/store/ikak44inrnz3b3dx8j8csdakgqafbijn-upgrade-shepherd-services.scm.drv...
shepherd: Removing service 'dbus-system'...
shepherd: Service dbus-system has been stopped.
shepherd: Done.
shepherd: Service host-name has been started.
shepherd: Service user-homes has been started.
shepherd: Service host-name has been started.
shepherd: Service term-auto could not be started.
shepherd: Service php-fpm has been started.
guix system: warning: exception caught while executing 'start' on service 
'nginx':
Throw to key `%exception' with args `("#< program: 
\"/gnu/store/hn1mvgafkpf5knrnzvwpgpdlzmq553al-nginx-1.19.6/sbin/nginx\" 
arguments: (\"-c\" \"/gnu/store/chpw631djay2w39x7agg8zz53iayy4zy-nginx.conf\" 
\"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f>")'.
guix system: warning: some services could not be upgraded
hint: To allow changes to all the system services to take effect, you will need 
to reboot.
--8<---cut here---end--->8---

What happen is Nginx won't start because the certficate related files
present in it's configuration doesn't exist and we can't get a Let's
Encrypt certificate from a HTTP-01 challenge without that web server
running.  NixOS broke that chicken and egg problem by generating a
self-signed certificate first, after that starting nginx, then
requesting a valid Lets' Encrypt certificate and finally reloading
Nginx.  That way we end up with a Nginx server using Let's Encrypt
certificate with no more that a simple system reconfiguration.  Note
that, the initial self-signed certificate will need to be at the path
were certbot will put it's