ekompute wrote:
> Hi, can anyone help me with my robot.txt.

The file name is 'robots.txt', not 'robot.txt'.

> My contents for the page read as follows:
>
> User-agent: *
> Disallow: /Help
> Disallow: /MediaWiki
> Disallow: /Template
> Disallow: /skins/
>
> But it is blocking pages like:
>
> - http://www.dummipedia.org/Special:Protectedpages
> - http://dummipedia.org/Special:Allpages

Special pages try to protect themselves automatically: they emit
'<meta name="robots" content="noindex,nofollow" />' in their HTML, so search
engines skip them regardless of what robots.txt says. That is deliberate; a
crawler traversing Special:Allpages would likely produce too much load anyway.

> and external pages like:
>
> - http://www.stumbleupon.com/
> - http://www.searchtheweb.com/

Those are external links, and MediaWiki tags them rel="nofollow" because
$wgNoFollowLinks is true by default. Set $wgNoFollowLinks = false; in
LocalSettings.php to turn that off, or except individual domains with
$wgNoFollowDomainExceptions. See:

http://www.mediawiki.org/wiki/Manual:$wgNoFollowLinks
http://www.mediawiki.org/wiki/Manual:$wgNoFollowDomainExceptions
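If you do want search engines to follow some or all of those external links, a
minimal LocalSettings.php sketch could look like this (the domain names are
placeholders, not a recommendation):

  # LocalSettings.php (sketch -- pick one approach; domains are placeholders)

  # Either: stop adding rel="nofollow" to every external link
  $wgNoFollowLinks = false;

  # Or: keep nofollow on by default, but except a few trusted domains
  # $wgNoFollowLinks = true;
  # $wgNoFollowDomainExceptions = array( 'stumbleupon.com', 'searchtheweb.com' );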
> As you can see, my robot.txt did not block these pages. Also, should I block
> the print version to prevent what Google calls "duplicate content"? If so,
> how?

Disallow /index.php in robots.txt; that keeps crawlers out of the printable,
edit, history and other action views, as in the sketch below.
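A minimal robots.txt sketch, assuming the wiki serves articles through short
URLs (e.g. /wiki/Page_title) so that /index.php only carries the edit, history
and printable views:

  # robots.txt (sketch) -- articles assumed to live under /wiki/..., so
  # blocking /index.php hides only edit, history and printable=yes views
  User-agent: *
  Disallow: /index.php
  Disallow: /Help
  Disallow: /MediaWiki
  Disallow: /Template
  Disallow: /skins/

If your articles are still served as /index.php?title=..., this rule would
block the whole wiki, so check how your article URLs look first.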
> Response will be very much appreciated.
>
> PM Poon

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l