Re: mod_md 1.1.0 repeating on error

Steffen Tue, 12 Dec 2017 04:48:47 -0800

It was happening before 1.1.0 too, but I did not give it attention. I have seen it in several situations, not all of which I can recall, unfortunately (see the retries, for example, in https://github.com/icing/mod_md/issues/52 and https://github.com/icing/mod_md/issues/62 ).


It is a more serious issue than I thought before.

I think we must fix this first, otherwise it is a bad introduction for our users. First-time users in the Windows community found that they had to deal with all kinds of (try) errors, and most of them stopped using it. As said in another post, mod_md is not that easy to start with.

Also, when the loglevel is at the default Warn, users can hardly see what is happening. I advise our users to use LogLevel info md:trace2 ssl:notice

The endless retry loop has now been tested in the following situations. Each was tested during a renew where no new certificate was generated, with httpd running fine on the old certificate, which was still valid.


1 - Misconfiguration like below.
2 - ACME CA service down (cause: Letsencrypt down)
3 - ACME CA service not reachable (cause: local network, or OS failure/misconfig)
4 - Error response (Get/Post errors) when accessing Letsencrypt, dependency issue like curl, mod_ssl.
5 - mod_md/mod_ssl faults
6 - There are probably more

2) 3) Both can mean that Letsencrypt is temporarily down, so a retry may make sense there, but it is hard to tell whether the cause is LE being temporarily down, a local issue, or an OS misconfig.

4) Is a good example: an error response from LE, which happens in quite some situations: Curl issues, rate limits, mod_md faults, etc.


Below I introduced a Curl issue:


...

[md:debug] [pid 7508:tid 1052] mod_md.c(762): AH10055: md watchdog run, auto drive 2 mds
[md:debug] [pid 7508:tid 1052] mod_md.c(691): AH10052: md(apachelounge.nl): state=2, driving
[md:debug] [pid 7508:tid 1052] md_reg.c(884): apachelounge.nl: run staging
[md:debug] [pid 7508:tid 1052] md_acme_drive.c(690): apachelounge.nl: staging started, state=2, can_http=0, can_https=1, challenges='tls-sni-01'
[md:debug] [pid 7508:tid 1052] md_store_fs.c(690): purge staging/apachelounge.nl (D:/servers/apacheS/md/staging/apachelounge.nl)
[md:debug] [pid 7508:tid 1052] md_acme.c(144): get directory from https://acme-v01.api.letsencrypt.org/directory
[md:debug] [pid 7508:tid 1052] md_acme.c(407): req: POST https://acme-v01.api.letsencrypt.org/directory
[md:debug] [pid 7508:tid 1052] md_curl.c(258): (20014)Internal error (specific information not available): request 10 failed(60): Peer certificate cannot be authenticated with given CA certificates
[md:debug] [pid 7508:tid 1052] md_acme.c(425): (20014)Internal error (specific information not available): req sent
[md:error] [pid 7508:tid 1052] (20014)Internal error (specific information not available): apachelounge.nl: setup ACME(https://acme-v01.api.letsencrypt.org/directory)
[md:debug] [pid 7508:tid 1052] md_acme_drive.c(912): (20014)Internal error (specific information not available): apachelounge.nl: ACME, ACME staging
[md:debug] [pid 7508:tid 1052] md_reg.c(891): (20014)Internal error (specific information not available): apachelounge.nl: staging done
[md:error] [pid 7508:tid 1052] (20014)Internal error (specific information not available): AH10056: processing apachelounge.nl
[md:info] [pid 7508:tid 1052] AH10057: apachelounge.nl: encountered error for the 6. time, next run in 0:02:40 hours

...

Maybe a little solution: when starting httpd, mod_md checks if LE is reachable without error.

And a solution for the one below could be: check that 443 and/or 80 is used.
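The two checks proposed here could look roughly like the sketch below. It is written in Python only for illustration and is not how mod_md implements anything: the function names, the `opener` parameter and the returned dictionary keys are all hypothetical. The only things taken from this thread are the directory URL from the log, the idea of probing it at startup, and the idea of checking for ports 80 and 443 (the `can_http` / `can_https` names mirror the log output).

```python
import urllib.request

# Directory URL as it appears in the log in this thread.
ACME_DIRECTORY = "https://acme-v01.api.letsencrypt.org/directory"


def acme_directory_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Hypothetical startup probe: fetch the ACME directory once.

    Returns (ok, reason). Network, TLS and HTTP errors raised by urllib
    are all subclasses of OSError, so one except clause covers them.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = getattr(resp, "status", 200)
            return (status == 200, f"HTTP {status}")
    except OSError as exc:
        return (False, str(exc))


def challenge_ports_available(listen_ports):
    """Hypothetical config check: is the server listening on 80 and/or 443?"""
    ports = set(listen_ports)
    return {"can_http": 80 in ports, "can_https": 443 in ports}


if __name__ == "__main__":
    # A server that only listens on 444 cannot answer on either port.
    print(challenge_ports_available([444]))
    print(acme_directory_reachable(ACME_DIRECTORY))
```

A failed probe cannot by itself tell a temporary LE outage from a local problem such as the CA certificate error in the log, so a check like this would mainly help to report the cause early and clearly.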


Still my questions:

Does the retry stop?
When does it happen, on what errors?


Steffen



On Tuesday 12/12/2017 at 10:18, Stefan Eissing wrote:

Can you switch to "LogLevel md:debug" for a while and send me the details? Did this start on the v1.1.0 or before that?
On 11.12.2017 at 16:09, Steffen <[email protected]> wrote:


Running 1.1.0 with the new naming.
When mod_md encounters an error it looks like it is going into an endless loop:
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 1. time, next run in 0:00:05 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 2. time, next run in 0:00:10 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 3. time, next run in 0:00:20 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 4. time, next run in 0:00:40 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 5. time, next run in 0:01:20 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 6. time, next run in 0:02:40 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 7. time, next run in 0:05:20 hours
[md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error for the 8. time, next run in 0:10:40 hours
...
...
...

Above is during renew and using port 444.

Apache is running fine because the certificate is still valid.

Does it stop?
When does it happen, on what errors? Above happens when:
(20014)Internal error (specific information not available): AH10056: processing apachelounge.nl.
What to do? Stopping on the above retries can be tricky, because when the ACME CA service is temporarily down or not reachable we may still want a retry. A reachability/down error is different from a configuration error causing it, as in the above case.
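The "next run in" values in the AH10057 lines above double each time, starting at 5 seconds (reading 0:00:05 as h:mm:ss). A minimal Python sketch that reproduces those eight values is shown here. It is derived only from the log output, not from the mod_md source, and the names `BASE_DELAY_SECONDS`, `retry_delay` and `format_delay` are made up for illustration. It says nothing about whether mod_md ever caps the delay or stops retrying, which is exactly the open question.

```python
from datetime import timedelta

# First delay seen in the log ("0:00:05"); an observation, not a documented constant.
BASE_DELAY_SECONDS = 5


def retry_delay(error_count):
    """Delay before the next run after the error_count-th consecutive error.

    Doubles with every further error: 5s, 10s, 20s, ...
    """
    if error_count < 1:
        raise ValueError("error_count must be >= 1")
    return timedelta(seconds=BASE_DELAY_SECONDS * 2 ** (error_count - 1))


def format_delay(delay):
    """Format a delay as h:mm:ss, the way the log lines show it."""
    total = int(delay.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"error {n}: next run in {format_delay(retry_delay(n))}")
```

If the doubling continued without a cap, the 15th consecutive error would already wait about 22 hours, so in practice the loop slows down quickly even if it never formally stops.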
