And by the way, which Windows OS version does your server run on? And since you had mod_md running before, what changed in relation to Windows and the curl you use?
> On 12.12.2017 at 13:58, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
>
>> On 12.12.2017 at 13:47, Steffen <i...@apachelounge.com> wrote:
>>
>> It was happening before 1.1.0, but I did not give it attention. I have seen
>> it in several situations, all of which I unfortunately cannot recall (see the
>> retries as example in https://github.com/icing/mod_md/issues/52 and
>> https://github.com/icing/mod_md/issues/62 ).
>>
>> It is a more serious issue than I thought before.
>>
>> I think we must fix this first, otherwise it is a bad introduction for our
>> users. First-time users in the Windows community learned that they are
>> dealing with all kinds of (try) errors, and most of them stopped using it.
>> As said in another post, mod_md is not that easy to start with.
>>
>> Also, when the LogLevel is at the default warn, users hardly see what is
>> happening. I advise our users to use LogLevel info md:trace2 ssl:notice
>>
>> The endless retry loop: tested now in the following situations, during
>> renewal, where no new certificate is generated and httpd keeps running fine
>> with the old certificate, which was still valid.
>>
>> 1 - Misconfiguration like below.
>> 2 - ACME CA service down (cause: Let's Encrypt down)
>> 3 - ACME CA service not reachable (cause: local network, or OS
>> failure/misconfig)
>> 4 - Error response (GET/POST errors) when accessing Let's Encrypt, dependency
>> issue like curl, mod_ssl.
>> 5 - mod_md/mod_ssl faults
>> 6 - Should be more
>>
>>
>> 2) 3) Both can mean that Let's Encrypt is temporarily down, so maybe retry
>> there, but it is hard to tell whether the cause is LE being temporarily
>> down, a local issue, or an OS misconfiguration.
>>
>> 4) is a good example: an error response from LE, which happens in quite a
>> few situations: curl issues, rate limits, mod_md faults, etc.
>>
>> Below I introduced a curl issue:
>>
>> ...
>> [md:debug] [pid 7508:tid 1052] mod_md.c(762): AH10055: md watchdog run, auto
>> drive 2 mds
>> [md:debug] [pid 7508:tid 1052] mod_md.c(691): AH10052: md(apachelounge.nl):
>> state=2, driving
>> [md:debug] [pid 7508:tid 1052] md_reg.c(884): apachelounge.nl: run staging
>> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(690): apachelounge.nl:
>> staging started, state=2, can_http=0, can_https=1, challenges='tls-sni-01'
>> [md:debug] [pid 7508:tid 1052] md_store_fs.c(690): purge
>> staging/apachelounge.nl (D:/servers/apacheS/md/staging/apachelounge.nl)
>> [md:debug] [pid 7508:tid 1052] md_acme.c(144): get directory from
>> https://acme-v01.api.letsencrypt.org/directory
>> [md:debug] [pid 7508:tid 1052] md_acme.c(407): req: POST
>> https://acme-v01.api.letsencrypt.org/directory
>> [md:debug] [pid 7508:tid 1052] md_curl.c(258): (20014)Internal error
>> (specific information not available): request 10 failed(60): Peer
>> certificate cannot be authenticated with given CA certificates
>
> Ok, this needs to be logged at ERROR level, so users do not have to mess with
> LogLevel to see what is going on.
>
> As for the reason, this seems to indicate that the curl client finds no way
> to verify the Let's Encrypt server certificate. Can you verify that the
> "curl.exe" can connect to "https://acme-v01.api.letsencrypt.org/directory"
> and retrieve the JSON there *without* you giving it the '-k' or '--insecure'
> option? And where does your curl.exe/libcurl come from? Did you build it
> yourself?
>
>> [md:debug] [pid 7508:tid 1052] md_acme.c(425): (20014)Internal error
>> (specific information not available): req sent
>> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information
>> not available): apachelounge.nl: setup
>> ACME(https://acme-v01.api.letsencrypt.org/directory)
>> [md:debug] [pid 7508:tid 1052] md_acme_drive.c(912): (20014)Internal error
>> (specific information not available): apachelounge.nl: ACME, ACME staging
>> [md:debug] [pid 7508:tid 1052] md_reg.c(891): (20014)Internal error
>> (specific information not available): apachelounge.nl: staging done
>> [md:error] [pid 7508:tid 1052] (20014)Internal error (specific information
>> not available): AH10056: processing apachelounge.nl
>> [md:info] [pid 7508:tid 1052] AH10057: apachelounge.nl: encountered error
>> for the 6. time, next run in 0:02:40 hours
>> ...
>>
>> Maybe a little solution: when starting httpd, mod_md checks whether LE is
>> reachable without error.
>
> No, I do not think checking external servers on every httpd restart is a
> good idea.
>
>> And a solution for the one below can be: make a check that 443 and/or 80 is
>> used.
>>
>> Still my questions:
>>
>> Does the retry stop?
>
> The retry does not stop, but it uses longer and longer retry intervals.
> This is exactly to recover from errors with the ACME server that are
> recoverable, e.g. server/internet down. A local certificate store that is
> unable to verify the LE server will not recover by itself, however.
>
>> When does it happen, on what errors?
>
> On any error where signup/renew is necessary and could not complete.
>
>>
>>
>> Steffen
>>
>>
>> On Tuesday 12/12/2017 at 10:18, Stefan Eissing wrote:
>>> Can you switch to "LogLevel md:debug" for a while and send me the details?
>>> Did this start on the v1.1.0 or before that?
>>>
>>>> On 11.12.2017 at 16:09, Steffen <i...@apachelounge.com> wrote:
>>>>
>>>>
>>>> Running 1.1.0 with the new naming.
>>>>
>>>> When mod_md encounters an error, it looks like it is going into an endless
>>>> loop:
>>>>
>>>>
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 1. time, next run in 0:00:05 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 2. time, next run in 0:00:10 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 3. time, next run in 0:00:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 4. time, next run in 0:00:40 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 5. time, next run in 0:01:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 6. time, next run in 0:02:40 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 7. time, next run in 0:05:20 hours
>>>> [md:info] [pid 10372:tid 1964] AH10057: apachelounge.nl: encountered error
>>>> for the 8. time, next run in 0:10:40 hours
>>>> ...
>>>> ...
>>>> ...
>>>>
>>>> The above is during renewal, using port 444.
>>>>
>>>> Apache is running fine because the certificate is still valid.
>>>>
>>>> Does it stop?
>>>>
>>>> When does it happen, on what errors? The above happens on: (20014)Internal
>>>> error (specific information not available): AH10056: processing
>>>> apachelounge.nl.
>>>>
>>>> What to do? Stopping the above retries can be tricky, because when the ACME
>>>> CA service is temporarily down or not reachable we may still want a retry.
>>>> A reachability/down error is different from a configuration error that
>>>> causes it, as in the above case.
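
For reference, the intervals in the AH10057 lines quoted above start at 5 seconds and double with every further error (the log prints them as h:mm:ss). The following is only a minimal Python sketch of such a doubling schedule, written to match the logged values. The function names and the optional cap are hypothetical and not taken from the mod_md source, and nothing in this thread says whether mod_md caps the interval.

```python
from datetime import timedelta
from typing import Optional


def retry_delay(error_count: int,
                base_seconds: int = 5,
                max_seconds: Optional[int] = None) -> timedelta:
    """Delay before the next run after `error_count` consecutive errors.

    Assumption: the delay doubles per error, starting at `base_seconds`,
    which matches the logged sequence 0:00:05, 0:00:10, 0:00:20, ...
    `max_seconds` is a hypothetical cap, not something the thread confirms.
    """
    if error_count < 1:
        raise ValueError("error_count must be >= 1")
    seconds = base_seconds * 2 ** (error_count - 1)
    if max_seconds is not None:
        seconds = min(seconds, max_seconds)
    return timedelta(seconds=seconds)


def format_delay(delay: timedelta) -> str:
    """Render a delay as h:mm:ss, the layout used in the log lines."""
    total = int(delay.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Print the schedule for the first eight errors, as in the quoted log.
    for n in range(1, 9):
        print(f"error {n}: next run in {format_delay(retry_delay(n))}")
```

With a schedule like this the retries never stop on their own; they only become less frequent, which is fine for an outage on the ACME side but does not help with a local misconfiguration.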