I wrote:
>> Traffic increases quickly (though still a fraction of
>> Toolserver's) and half of it is from geohack, it's normal to
>> see some growth pains. You seem to be right that 500 errors
>> are increasing: according to
>> http://tools.wmflabs.org/awstats/ they were 4.1 % of the
>> "valid" requests in July and they have been 4.9 % so far in
>> August.
>> (The millions of 404 and 403 errors per month are even more
>> mysterious though.)
> Probably related as Geohack seems to trigger accesses to
> /~dispenser/temp/clickheat/js/clickheat.js (in today's log:
> 66639) and /~geohack/siteicon.png (53241). Other common
> misses are /apple-touch-icon.png (752),
> /apple-touch-icon-precomposed.png (1165) and /robots.txt
> (736).

Regarding robots.txt, I've started
https://gerrit.wikimedia.org/r/77916. Toolserver's robots.txt
is:

| User-agent: msnbot
| Disallow: /
| User-agent: *
| Disallow: /~magnus/geo/geohack.php
| Disallow: /~daniel/WikiSense
| Disallow: /~geohack/
| Disallow: /~enwp10/
| Disallow: /~cbm/cgi-bin/

(WikiSense is CatScan IIRC.) Excluding Geohack is probably a
good idea. Do other tool authors have tools they do not want
to be crawled by search engine bots?

Tim
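
For tool authors who want to see what a rule set like the one
quoted above would actually exclude, here is a minimal sketch
using Python's standard urllib.robotparser. The helper names
(build_parser, is_crawlable) and the "Googlebot" user agent are
only illustrative assumptions, not anything Tool Labs uses; the
rules themselves are copied from the Toolserver file.

```python
from urllib.robotparser import RobotFileParser

# Toolserver's robots.txt rules as quoted in this thread.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /~magnus/geo/geohack.php
Disallow: /~daniel/WikiSense
Disallow: /~geohack/
Disallow: /~enwp10/
Disallow: /~cbm/cgi-bin/
"""


def build_parser(text):
    """Hypothetical helper: parse robots.txt text without fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_crawlable(parser, user_agent, path):
    """Hypothetical helper: may this user agent fetch this path?"""
    return parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)

# Paths taken from the log excerpts in this thread; the user
# agents are example values only.
checks = [
    ("Googlebot", "/~geohack/siteicon.png"),
    ("Googlebot", "/~dispenser/temp/clickheat/js/clickheat.js"),
    ("Googlebot", "/robots.txt"),
    ("msnbot", "/robots.txt"),
]
for agent, path in checks:
    verdict = "allowed" if is_crawlable(parser, agent, path) else "blocked"
    print(f"{agent:10} {path:45} {verdict}")
```

Under these rules anything below /~geohack/ is blocked for every
crawler, while the clickheat.js path stays crawlable unless it
gets its own Disallow line.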
