Re: mod_md 1.1.0 repeating on error

Stefan Eissing Tue, 12 Dec 2017 05:02:34 -0800

And btw. what is the Windows OS version that your server runs on?

And since you had mod_md running before, what changed in relation to
Windows and the curl you use?


> On 12.12.2017 at 13:58, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
> 
> 
> 
>> On 12.12.2017 at 13:47, Steffen <i...@apachelounge.com> wrote:
>> 
>> It was happening before 1.1.0, but I did not pay attention to it. I have seen 
>> it in several situations, not all of which I can recall, unfortunately (see 
>> the retries, for example, in https://github.com/icing/mod_md/issues/52 and 
>> https://github.com/icing/mod_md/issues/62 ).
>> 
>> It is a more serious issue than I thought before. 
>> 
>> I think we must fix this first, otherwise it is a bad introduction for our 
>> users. First-time users in the Windows community have learned that they 
>> have to deal with all kinds of (try) errors, and most of them have stopped 
>> using it. As said in another post, mod_md is not that easy to start with.
>> 
>> Also, when the LogLevel is at the default, warn, users hardly see what is 
>> happening. I advise our users to use LogLevel info md:trace2 ssl:notice
>> 
>> I have now tested the endless retry loop in the following situations. All 
>> tests were during a renewal: no new certificate was generated, and httpd kept 
>> running fine with the old certificate, which was still valid.
>> 
>> 1 - Misconfiguration, like below.
>> 2 - ACME CA service down (cause: Let's Encrypt down)
>> 3 - ACME CA service not reachable (cause: local network, or OS 
>> failure/misconfiguration)
>> 4 - Error response (GET/POST errors) when accessing Let's Encrypt, or a 
>> dependency issue such as curl or mod_ssl.
>> 5 - mod_md/mod_ssl faults
>> 6 - There are probably more
>> 
>> 
>> 2) and 3): in both cases Let's Encrypt may be temporarily down, so a retry 
>> may make sense there, but it is hard to tell whether the cause is a temporary 
>> LE outage, a local issue or an OS misconfiguration.
>> 
>> 4) is a good example: an error response from LE, which happens in quite a 
>> few situations: curl issues, rate limits, mod_md faults, etc.
>> 
>> Below I introduced a Curl issue:
>> 
>> ...
>> [md:debug] [pid 7508:tid 1052] mod_md.c(762): AH10055: md watchdog run, auto 
>> drive 2 mds
>> [md:debug] [pid 7508:tid 1052] mod_md.c(691): AH10052: md(apachelounge.nl): 
>> state=2, driving
>> [md:debug] [pid 7508:tid 1052] md_reg.c(884): apachelounge.nl: run staging
>> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(690): apachelounge.nl: 
>> staging started, state=2, can_http=0, can_https=1, challenges='tls-sni-01'
>> [md:debug] [pid 7508:tid 1052] md_store_fs.c(690): purge 
>> staging/apachelounge.nl (D:/servers/apacheS/md/staging/apachelounge.nl)
>> [md:debug] [pid 7508:tid 1052] md_acme.c(144): get directory from 
>> https://acme-v01.api.letsencrypt.org/directory
>> [md:debug] [pid 7508:tid 1052] md_acme.c(407): req: POST 
>> https://acme-v01.api.letsencrypt.org/directory
>> [md:debug] [pid 7508:tid 1052] md_curl.c(258): (20014)Internal error 
>> (specific information not available): request 10 failed(60): Peer 
>> certificate cannot be authenticated with given CA certificates
> 
> Ok, this needs to be logged at ERROR level, so users do not have to mess with 
> LogLevel to see what is going on.
> 
> As for the reason, this seems to indicate that the curl client finds no way 
> to verify the Let's Encrypt server certificate. Can you verify that the 
> "curl.exe" can connect to "https://acme-v01.api.letsencrypt.org/directory" 
> and retrieve the JSON there *without* you giving it the '-k' or '--insecure' 
> option? And where does your curl.exe/libcurl come from? Did you build it 
> yourself?
> 
>> [md:debug] [pid 7508:tid 1052] md_acme.c(425): (20014)Internal error 
>> (specific information not available): req sent
>> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information 
>> not available): apachelounge.nl: setup 
>> ACME(https://acme-v01.api.letsencrypt.org/directory)
>> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(912): (20014)Internal error 
>> (specific information not available): apachelounge.nl: ACME, ACME staging
>> [md:debug] [pid 7508:tid 1052] md_reg.c(891): (20014)Internal error 
>> (specific information not available): apachelounge.nl: staging done
>> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information 
>> not available): AH10056: processing apachelounge.nl
>> [md:info] [pid 7508:tid 1052] AH10057: apachelounge.nl: encountered error 
>> for the 6. time, next run in 0:02:40 hours
>> ...
>> 
>> Maybe a small solution: when httpd starts, mod_md checks whether LE is 
>> reachable without error.
> 
> No, I do not think checking external servers on every httpd restart is a 
> good idea.
> 
>> And a solution for the one below could be: check that port 443 and/or 80 is 
>> used.
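
To illustrate the port check suggested here, a minimal sketch. It assumes, as ACME validation does, that http-01 is answered on port 80 and tls-sni-01 on port 443. The helper name and the idea of feeding it the configured Listen ports are hypothetical and not part of mod_md. A server behind port forwarding may listen on other local ports and still be reachable on 80/443 from outside, so a check like this could only warn, not decide.

```python
# Hypothetical helper, not mod_md code: which ACME challenge types could
# be answered given the ports httpd is configured to listen on.
CHALLENGE_PORTS = {"http-01": 80, "tls-sni-01": 443}


def usable_challenges(listen_ports):
    """Return the challenge types whose required port is in listen_ports."""
    ports = set(listen_ports)
    return sorted(name for name, port in CHALLENGE_PORTS.items() if port in ports)


if __name__ == "__main__":
    # Port 444 only, as in the original report: no challenge is usable.
    print(usable_challenges([444]))
    # Port 443 only corresponds to can_http=0, can_https=1 in the debug log.
    print(usable_challenges([443]))
```
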
>> 
>> Still my questions:
>> 
>> Does the retry stop?
> 
> The retry does not stop, but it uses longer and longer retry intervals. 
> This is exactly to recover from errors with the ACME server that are 
> recoverable, e.g. server/internet down. A local certificate store that is 
> unable to verify the LE server will not recover by itself, however.
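
The AH10057 lines in the original report show the interval doubling on every failure: 0:00:05, 0:00:10, 0:00:20, up to 0:10:40 at the 8th error. A minimal sketch of such a schedule, assuming a 5 second base and plain doubling as read off that log. The function name is made up for illustration, this is not the mod_md source, and any upper limit on the interval is not visible in the thread and is therefore left out.

```python
from datetime import timedelta


def retry_delay(error_count, base_seconds=5):
    """Delay before the next run after the error_count-th consecutive error.

    Assumption taken from the quoted log: 5 seconds after the first
    error, doubled for each further error.
    """
    if error_count < 1:
        raise ValueError("error_count starts at 1")
    return timedelta(seconds=base_seconds * 2 ** (error_count - 1))


if __name__ == "__main__":
    for n in range(1, 9):
        # str(timedelta) renders as H:MM:SS, the same shape as in the log.
        print(f"encountered error for the {n}. time, next run in {retry_delay(n)}")
```
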
> 
>> When does it happen, and on which errors?
> 
> On any error where signup/renew is necessary and could not complete.
> 
>> 
>> 
>> Steffen
>> 
>> 
>> On Tuesday 12/12/2017 at 10:18, Stefan Eissing wrote:
>>> Can you switch to "LogLevel md:debug" for a while and send me the details? 
>>> Did this start on the v1.1.0 or before that?
>>> 
>>>> On 11.12.2017 at 16:09, Steffen <i...@apachelounge.com> wrote:
>>>> 
>>>> 
>>>> Running 1.1.0 with the new naming.
>>>> 
>>>> When mod_md encounters an error, it looks like it goes into an endless 
>>>> loop:
>>>> 
>>>> 
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 1. time, next run in 0:00:05 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 2. time, next run in 0:00:10 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 3. time, next run in 0:00:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 4. time, next run in 0:00:40 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 5. time, next run in 0:01:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 6. time, next run in 0:02:40 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 7. time, next run in 0:05:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>>> for the 8. time, next run in 0:10:40 hours
>>>> ...
>>>> ...
>>>> ...
>>>> 
>>>> The above is during a renewal, using port 444.
>>>> 
>>>> Apache is running fine because the certificate is still valid.
>>>> 
>>>> Does it stop?
>>>> 
>>>> When does it happen, and on which errors? The above happens with: 
>>>> (20014)Internal error (specific information not available): AH10056: 
>>>> processing apachelounge.nl.
>>>> 
>>>> What to do? Stopping on the above retries can be tricky, because when the 
>>>> ACME CA service is temporarily down or not reachable we may well want a 
>>>> retry. A reachability/down error is different from a configuration error 
>>>> causing it, as in the above case.
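
The distinction drawn here between a reachability error and a configuration error can be sketched as a small classifier over libcurl result codes. Code 60 is the one in the md_curl.c log line quoted earlier in the thread; 6, 7 and 28 are libcurl's codes for an unresolved host, a failed connect and a timeout. The function name and the grouping are assumptions for illustration only, not mod_md behaviour, and as noted in the thread a failed connect can just as well come from a local misconfiguration.

```python
# libcurl result codes. The grouping is an illustrative assumption.
TRANSIENT = {
    6: "could not resolve host",
    7: "could not connect to server",
    28: "operation timed out",
}
NEEDS_ADMIN = {
    60: "peer certificate cannot be authenticated with given CA certificates",
}


def classify(curl_code):
    """Return 'retry', 'fix-config' or 'unknown' for a libcurl result code."""
    if curl_code in TRANSIENT:
        return "retry"
    if curl_code in NEEDS_ADMIN:
        return "fix-config"
    return "unknown"


if __name__ == "__main__":
    for code in (7, 28, 60):
        print(code, classify(code))
```

With a split like this, errors in the second group could be reported once at error level instead of being retried silently.
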
