Greg, I realise you are trying to solve the problem and I thank you for trying to make the URL checks better for everyone. I probably sound defeatist in my e-mails; sorry about that.

On Thu, 30 Jun 2022 17:49:49 +1000
Greg Hunt <g...@firmansyah.com> wrote:

> Do you have evidence that even without the use of HEAD that
> CloudFlare is rejecting the CRAN checks?

Unfortunately, yes, I think it's possible:

$ curl -v https://support.rstudio.com/hc/en-us/articles/219949047-Installing-older-versions-of-packages
# ...skipping TLS logs...
> GET /hc/en-us/articles/219949047-Installing-older-versions-of-packages HTTP/2
> Host: support.rstudio.com
> User-Agent: curl/7.64.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 256)!
< HTTP/2 403
< date: Thu, 30 Jun 2022 08:13:01 GMT

CloudFlare blocks are probabilistic. I *think* the reason I got a 403
is because I didn't visit the page with my browser first. Switching
from HEAD to GET might also increase the traffic flow, leading to more
blocks from hosts not previously blocking the HEAD requests.

CloudFlare's suggested solution would be Private Access Tokens [*], but
that looks hard to implement (who would agree to sign those tokens?)
and leaves other CDNs.

> The CDN rejecting requests or flagging the service as temporarily
> unavailable when there is a communication failure with its upstream
> server is much the same behaviour that you would expect to see from
> the usual kinds of protection that you'd apply to a web server (some
> kind of filter/proxy/firewall) even without a CDN in place.

My point was different. If the upstream is actually down, the page
can't be served even to "valid" users, and the 503 error from
CloudFlare should fail the URL check. On the other hand, if the 503
error is due to the check tripping a bot detector, it could be
reasonable to give that page a free pass. How can we distinguish those
two situations? Could CloudFlare ask for a CAPTCHA first, then realise
that the upstream is down and return another 503?
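
To make the question concrete, here is a rough sketch (in Python, not
the actual CRAN check code) of the kind of "free pass" rule under
discussion. Everything in it is an assumption for illustration: the
function name classify(), the set of "suspect" statuses, and the idea
of recognising the CDN by a "Server: cloudflare" response header. It
also shows the weakness: a 503 from a CDN lands in the same bucket
whether the upstream is really down or a bot detector fired.

```python
# Hypothetical triage of one URL-check response; names and rules are
# illustrative assumptions, not the behaviour of any existing checker.

# Statuses a bot detector might plausibly return (assumption).
SUSPECT_STATUSES = {403, 429, 503}


def classify(status, headers):
    """Return 'ok', 'broken' or 'inconclusive' for a final HTTP response.

    Assumes redirects have already been followed by the HTTP client.
    """
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): str(v).lower() for k, v in headers.items()}

    # Assumption: the CDN identifies itself in the Server header.
    behind_cdn = "cloudflare" in h.get("server", "")

    if 200 <= status < 300:
        return "ok"
    if behind_cdn and status in SUSPECT_STATUSES:
        # A bot block and a genuine upstream failure look the same here,
        # so giving this a free pass risks a false negative.
        return "inconclusive"
    return "broken"
```

With this rule classify(403, {"Server": "cloudflare"}) gives
"inconclusive", while a 404 from the same host or a 503 from a host
with no CDN header still gives "broken".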

Yes, this is a sensitivity vs specificity question, and we can trade
some false positives (that we get now) for some false negatives
(letting a legitimate error status from a CDN pass the test) to make
life easier for package maintainers. Your suggestions are a step in the
right direction, but there has to be a way to make them less fragile.

-- 
Best regards,
Ivan

[*] https://blog.cloudflare.com/eliminating-captchas-on-iphones-and-macs-using-new-standard/