Phil Holmes writes:

> I'm not aware of any changes being made, unless it's the regularity of
> the upgrades and people downloading the PDFs more often.

It turns out that it's the notorious AhrefsBot.  I fixed .htaccess, but
we also need the patch below.  Please apply.

Greetings,
Jan

From 08c4b0e80428db285ba3865d4ea795fbdf2d17ff Mon Sep 17 00:00:00 2001
From: Jan Nieuwenhuizen <[email protected]>
Date: Thu, 8 Aug 2013 08:34:12 +0200
Subject: [PATCH] [Web]: Deny rogue crawler AhrefsBot.  Fixes web load.

The AhrefsBot is crawling files every second, including
all binaries.  This increases the load in an unacceptable way.
---
 Documentation/web/server/lilypond.org.htaccess | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/web/server/lilypond.org.htaccess b/Documentation/web/server/lilypond.org.htaccess
index b4780a9..5e7dfae 100644
--- a/Documentation/web/server/lilypond.org.htaccess
+++ b/Documentation/web/server/lilypond.org.htaccess
@@ -23,6 +23,10 @@ RewriteEngine On
 RewriteCond %{HTTP_USER_AGENT} httrack [NC]
 RewriteRule ^.*/source/.*$ /please-respect-robots.txt.html [L]
 
+# Deny rogue crawler
+RewriteCond %{HTTP_USER_AGENT} ^(.*)AhrefsBot(.*) [NC]
+RewriteRule .* - [F,L]
+
 # Permanent top level entry points -- ./doc
 RedirectMatch ^/music-glossary /glossary
 RedirectMatch ^/tutorial /learning
-- 
1.8.1.2
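
For anyone who wants to check the new condition without touching the
server, here is a minimal sketch in Python.  It assumes Python's re
module behaves like Apache's PCRE for this simple pattern, and that the
[NC] flag corresponds to re.IGNORECASE.  The helper is_denied and the
sample User-Agent strings are hypothetical, for illustration only.

```python
import re

# Pattern copied from the RewriteCond in the patch; [NC] is modelled
# with re.IGNORECASE (an assumption about equivalence, see above).
PATCH_PATTERN = re.compile(r"^(.*)AhrefsBot(.*)", re.IGNORECASE)

# Unanchored form: the surrounding (.*) groups do not change what matches.
SIMPLE_PATTERN = re.compile(r"AhrefsBot", re.IGNORECASE)


def is_denied(user_agent, pattern=PATCH_PATTERN):
    """Return True if the condition matches, i.e. the rule would answer 403."""
    return pattern.search(user_agent) is not None


# Hypothetical User-Agent strings, not taken from the server logs.
samples = [
    "Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)",
    "mozilla/5.0 (compatible; ahrefsbot)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/23.0",
]

for ua in samples:
    print(is_denied(ua), is_denied(ua, SIMPLE_PATTERN), ua)
```

Both patterns give the same answer for every sample, so the shorter
form would probably work in the RewriteCond as well; the patch keeps
the longer one.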

-- 
Jan Nieuwenhuizen <[email protected]> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar®  http://AvatarAcademy.nl
_______________________________________________
lilypond-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/lilypond-devel
