Here is some more info. http://www.robotstxt.org/wc/robots.html
You can have some fun with the Crawl-delay: setting.

On 5/24/07, Chris Louden <[email protected]> wrote:
For those that follow the rules, just add a robots.txt to the root of your web directory.

User-agent: *
Disallow: /

The above tells all bots that you do not want to be crawled. Google, Yahoo, etc. will respect your wish and stop. Unfriendly bots will still continue to crawl your site.

http://chrislouden.com/robots.txt

You can also get more creative:
http://thcnet.net/robots.txt

On 5/24/07, Roger Rustad <[email protected]> wrote:
> What is the best way to not let google (or any other search engine)
> crawl your Apache website? I'm assuming that it's something
> .htaccess-related?
> _______________________________________________
> 909linux mailing list
> [email protected]
> http://909linux.org/cgi-bin/mailman/listinfo/909linux
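
For anyone curious how a rule-following bot reads those two lines, here is a rough sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot", the example.com URL, and the 10-second Crawl-delay value are made up for illustration and are not from anyone's real robots.txt. Also note that Crawl-delay is a non-standard extension, so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block everything and ask bots to wait
# 10 seconds between requests (the delay value is an assumption).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A polite crawler asks before fetching; "ExampleBot" is a made-up name.
allowed = rules.can_fetch("ExampleBot", "http://example.com/index.html")

# crawl_delay() returns the Crawl-delay value that applies to this agent,
# or None if the file does not set one.
delay = rules.crawl_delay("ExampleBot")

print("allowed:", allowed)
print("delay:", delay)
```

This only shows what a well-behaved bot would conclude. Nothing here stops a bot that never looks at robots.txt in the first place.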
