On Wed, 19 Jul 2006, Patrick R. Michaud wrote:
> On Wed, Jul 19, 2006 at 09:34:44PM +0200, [EMAIL PROTECTED] wrote:
>> On Wed, 19 Jul 2006, Patrick R. Michaud wrote:
>>> On Wed, Jul 19, 2006 at 11:36:53AM -0500, JB wrote:
>>>> PM,
>>>> Can I please get a copy of your robots.txt file?
>>>
>>> Also, for anyone who is interested, here are the relevant
>>> sections of my root .htaccess file, which deny certain
>>> user agents at the webserver level instead of waiting
>>> for PmWiki to do it:
>>>
>>> # HTTrack and MSIECrawler are just plain annoying
>>> RewriteEngine On
>>> RewriteCond %{HTTP_USER_AGENT} HTTrack [OR]
>>> RewriteCond %{HTTP_USER_AGENT} MSIECrawler
>>> RewriteRule ^wiki/ - [F,L]
>>>
>>> # block ?action= requests for these spiders
>>> RewriteCond %{QUERY_STRING} action=[^rb]
>>> RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
>>> RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
>>> RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
>>> RewriteCond %{HTTP_USER_AGENT} Teoma [OR]
>>> RewriteCond %{HTTP_USER_AGENT} ia_archive
>>> RewriteRule .* - [F,L]
>>
>> The obvious solution: Add this to some PmWiki page? Perhaps something
>> about administrative tasks? Or something related to robots.txt?
>
> It probably belongs in Cookbook.ControllingWebRobots, which also needs
> to be rewritten to be up-to-date with PmWiki 2.1. There also needs
> to be a link in the administrative tasks section, or at least a
> FAQ question.
I'm going through old posts. Should I place the above on
Cookbook.ControllingWebRobots? (I wonder whether there's any problem with
placing it in an official place - I assume there are no security risks.)
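For completeness, the polite, advisory counterpart to those .htaccess
rules would live in robots.txt. This is only a sketch: classic
robots.txt can't match query strings like ?action=, so all it can
really express is keeping a crawler out of a path entirely, and
compliance is voluntary (which is exactly why badly behaved agents
like HTTrack get blocked at the webserver level instead):

    # robots.txt - advisory only; well-behaved crawlers honor this
    # Keep HTTrack out of everything (it tends to ignore this anyway)
    User-agent: HTTrack
    Disallow: /

    User-agent: MSIECrawler
    Disallow: /

    # Everyone else may index normally
    User-agent: *
    Disallow: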
/C
--
Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr
_______________________________________________
pmwiki-users mailing list
[email protected]
http://www.pmichaud.com/mailman/listinfo/pmwiki-users