> On 12.12.2017 at 13:47, Steffen <i...@apachelounge.com> wrote:
> 
> It was happening before 1.1.0, but I did not pay attention to it. I have seen 
> it in several situations, not all of which I can recall, unfortunately (see 
> the retries in https://github.com/icing/mod_md/issues/52 and 
> https://github.com/icing/mod_md/issues/62 for examples).
> 
> It is a more serious issue than I thought before.
> 
> I think we must fix this first; otherwise it is a bad introduction for our 
> users. First-time users in the Windows community have learned that they have 
> to deal with all kinds of (try) errors, and most of them have stopped using 
> it. As said in another post, mod_md is not that easy to start with.
> 
> Also, with the LogLevel at its default of warn, users can hardly see what is 
> happening. I advise our users to use: LogLevel info md:trace2 ssl:notice
> 
> I have now tested the endless retry loop in the following situations. All 
> tests were during renewal: no new certificate was generated, and httpd kept 
> running fine with the old certificate, which was still valid.
> 
> 1 - Misconfiguration like below.
> 2 - ACME CA service down (because Let's Encrypt is down)
> 3 - ACME CA service not reachable (because of the local network, or an OS 
> failure/misconfiguration)
> 4 - Error response (GET/POST errors) when accessing Let's Encrypt, or a 
> dependency issue such as curl or mod_ssl.
> 5 - mod_md/mod_ssl faults
> 6 - There are probably more
> 
> 
> 2) 3) Both can mean that Let's Encrypt is temporarily down, so a retry may 
> make sense there, but it is hard to tell whether the cause is a temporary LE 
> outage, a local issue, or an OS misconfiguration.
> 
> 4) Is a good example: an error response from LE, which happens in quite a few 
> situations: curl issues, rate limits, mod_md faults, etc.
> 
> Below I introduced a Curl issue:
> 
> ...
> [md:debug] [pid 7508:tid 1052] mod_md.c(762): AH10055: md watchdog run, auto 
> drive 2 mds
> [md:debug] [pid 7508:tid 1052] mod_md.c(691): AH10052: md(apachelounge.nl): 
> state=2, driving
> [md:debug] [pid 7508:tid 1052] md_reg.c(884): apachelounge.nl: run staging
> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(690): apachelounge.nl: staging 
> started, state=2, can_http=0, can_https=1, challenges='tls-sni-01'
> [md:debug] [pid 7508:tid 1052] md_store_fs.c(690): purge 
> staging/apachelounge.nl (D:/servers/apacheS/md/staging/apachelounge.nl)
> [md:debug] [pid 7508:tid 1052] md_acme.c(144): get directory from 
> https://acme-v01.api.letsencrypt.org/directory
> [md:debug] [pid 7508:tid 1052] md_acme.c(407): req: POST 
> https://acme-v01.api.letsencrypt.org/directory
> [md:debug] [pid 7508:tid 1052] md_curl.c(258): (20014)Internal error 
> (specific information not available): request 10 failed(60): Peer certificate 
> cannot be authenticated with given CA certificates

Ok, this needs to be logged at ERROR level, so users do not have to mess with 
LogLevel to see what is going on.

As for the reason, this seems to indicate that the curl client finds no way to 
verify the Let's Encrypt server certificate. Can you verify that "curl.exe" 
can connect to "https://acme-v01.api.letsencrypt.org/directory" and retrieve 
the JSON there *without* you giving it the '-k' or '--insecure' option? And 
where does your curl.exe/libcurl come from? Did you build it yourself?
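
For comparison, the same kind of check can be sketched outside of curl. The 
Python snippet below is only an illustration and rests on assumptions: it uses 
Python's own TLS stack and CA store, not the CA bundle your libcurl uses, so a 
success here does not prove that curl.exe is configured correctly. The helper 
names make_verifying_context and check_directory are made up for this example. 
It fetches the directory URL with certificate verification left on (the 
equivalent of not passing '-k') and reports whether verification was the part 
that failed.

```python
import json
import ssl
import urllib.error
import urllib.request

# URL taken from the log above.
DIRECTORY_URL = "https://acme-v01.api.letsencrypt.org/directory"


def make_verifying_context():
    """Return a TLS context that verifies the peer (the opposite of curl -k)."""
    # create_default_context() loads the default trusted CA certificates and
    # enables both certificate and hostname verification.
    return ssl.create_default_context()


def check_directory(url=DIRECTORY_URL, timeout=10):
    """Hypothetical helper: classify the outcome of fetching the directory."""
    try:
        with urllib.request.urlopen(
            url, timeout=timeout, context=make_verifying_context()
        ) as resp:
            json.load(resp)  # the directory is expected to be JSON
        return "ok"
    except urllib.error.URLError as exc:
        # A failed certificate check arrives wrapped in URLError.reason.
        if isinstance(exc.reason, ssl.SSLCertVerificationError):
            return "certificate verification failed"
        return "request failed: %s" % exc.reason
    except (OSError, ValueError) as exc:
        # Timeouts while reading, or a body that is not valid JSON.
        return "request failed: %s" % exc
```

Calling check_directory() and getting "certificate verification failed" would 
point at a CA store problem on the machine rather than at the ACME server.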

> [md:debug] [pid 7508:tid 1052] md_acme.c(425): (20014)Internal error 
> (specific information not available): req sent
> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information 
> not available): apachelounge.nl: setup 
> ACME(https://acme-v01.api.letsencrypt.org/directory)
> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(912): (20014)Internal error 
> (specific information not available): apachelounge.nl: ACME, ACME staging
> [md:debug] [pid 7508:tid 1052] md_reg.c(891): (20014)Internal error (specific 
> information not available): apachelounge.nl: staging done
> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information 
> not available): AH10056: processing apachelounge.nl
> [md:info] [pid 7508:tid 1052] AH10057: apachelounge.nl: encountered error for 
> the 6. time, next run in  0:02:40 hours
> ...
> 
> Maybe a small solution: when httpd starts, mod_md checks whether LE is 
> reachable without error.

No, I do not think checking external servers on every httpd restart is a good 
idea.

> And a solution for the one below could be: check that port 443 and/or 80 is 
> used.
> 
> Still my questions:
> 
> Does the retry stop ?

The retry does not stop, but it uses longer and longer retry intervals, 
precisely so that it can recover from ACME server errors that are recoverable, 
e.g. server/internet down. A local certificate store that cannot verify the LE 
server will not recover by itself, however.
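
To illustrate the growing intervals: the delays in the logs in this thread 
(0:00:05, 0:00:10, 0:00:20, ... 0:10:40) are consistent with a 5 second base 
that doubles on every consecutive error. The Python sketch below reproduces 
that pattern. It is not the mod_md code; the function names and the optional 
max_seconds cap are assumptions made for the example, and whether mod_md caps 
the delay at all is not stated here.

```python
from datetime import timedelta


def next_retry_delay(error_count, base_seconds=5, max_seconds=None):
    """Delay before the next run after `error_count` consecutive errors.

    Assumed model: the delay doubles with each error, starting at
    base_seconds. max_seconds is a hypothetical cap, unset by default.
    """
    if error_count < 1:
        raise ValueError("error_count must be >= 1")
    seconds = base_seconds * 2 ** (error_count - 1)
    if max_seconds is not None:
        seconds = min(seconds, max_seconds)
    return timedelta(seconds=seconds)


def format_delay(delay):
    """Format a timedelta as h:mm:ss, the way the log lines show it."""
    total = int(delay.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return "%d:%02d:%02d" % (hours, minutes, seconds)


if __name__ == "__main__":
    for n in range(1, 9):
        print("error %d: next run in %s" % (n, format_delay(next_retry_delay(n))))
```

Under this model the sixth error gives 0:02:40 and the eighth 0:10:40, which 
matches the AH10057 lines quoted in this thread.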

> When does it happen, on what errors ?

On any error where signup/renew is necessary and could not complete.

> 
> 
> Steffen
> 
>  
> On Tuesday 12/12/2017 at 10:18, Stefan Eissing wrote:
>> Can you switch to "LogLevel md:debug" for a while and send me the details? 
>> Did this start on the v1.1.0 or before that?
>> 
>>> On 11.12.2017 at 16:09, Steffen <i...@apachelounge.com> wrote:
>>> 
>>> 
>>> Running 1.1.0 with the new naming.
>>> 
>>> When mod_md encounters an error, it looks like it goes into an endless loop:
>>> 
>>> 
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 1. time, next run in 0:00:05 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 2. time, next run in 0:00:10 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 3. time, next run in 0:00:20 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 4. time, next run in 0:00:40 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 5. time, next run in 0:01:20 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 6. time, next run in 0:02:40 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 7. time, next run in 0:05:20 hours
>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error 
>>> for the 8. time, next run in 0:10:40 hours
>>> ...
>>> ...
>>> ...
>>> 
>>> The above is during renewal, using port 444.
>>> 
>>> Apache is running fine because the certificate is still valid.
>>> 
>>> Does it stop ?
>>> 
>>> When does it happen, on what errors ? Above happens when: (20014)Internal 
>>> error (specific information not available): AH10056: processing 
>>> apachelounge.nl.
>>> 
>>> What to do? Stopping the above retries can be tricky, because when the ACME 
>>> CA service is temporarily down or not reachable we probably do want a 
>>> retry. A reachability/outage error is different from a configuration error 
>>> like the one causing the above case.
>>> 
>> 
> 
> 
