AlunFoto wrote:
>Michael,
>
>Assuming that your blog is automatically harvested by webcrawlers and
>then fed to a translation service, you may be able to limit this sort
>of thing by using a robots.txt file. Take a look at this site for how
>the crawlers work:
>
>http://www.robotstxt.org/
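For what it's worth, a minimal robots.txt along those lines might look something like this (the bot name is just an illustration, not a known offender, and the paths are placeholders for your own blog's layout):

```
# Allow well-behaved search engines everywhere
User-agent: Googlebot
Disallow:

# Keep a specific crawler out entirely (example name)
User-agent: SomeTranslationBot
Disallow: /

# Everyone else: stay out of the feeds
User-agent: *
Disallow: /feed/
```

Rules are matched per User-agent block, and an empty Disallow means "nothing is off limits." But as noted below, this only works on crawlers that choose to honor it.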
Of course, obeying the contents of a robots.txt file is all on the honor system: web spiders are free to ignore it if they want. Mainstream players like Google and Yahoo are scrupulous about complying. Others... who knows?

If your blog is hosted on an Apache server, you can do what I do and upload a .htaccess file that blocks specified IP addresses or ranges from even connecting to the server. My .htaccess file is only about 100 lines long, but it keeps 99% of spamming *attempts* off my blog (and Spam Karma keeps the rest from getting posted).

http://www.javascriptkit.com/howto/htaccess.shtml
http://www.htaccesstools.com/

-- 
PDML Pentax-Discuss Mail List
[email protected]
http://pdml.net/mailman/listinfo/pdml_pdml.net
to UNSUBSCRIBE from the PDML, please visit the link directly above and follow the directions.
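As a rough sketch of the .htaccess approach described above, using the classic Apache 2.2 mod_authz_host syntax (the IP addresses here are documentation placeholders, not real spammers; Apache 2.4 uses `Require all granted` / `Require not ip` instead):

```apache
# Block specific spammer addresses at the server level,
# before they can even request a page.
Order Allow,Deny
Allow from all

# A single host
Deny from 192.0.2.15

# A whole range: trailing-dot prefix or CIDR both work
Deny from 198.51.100.
Deny from 203.0.113.0/24
```

With `Order Allow,Deny`, the Allow directives are evaluated first and the Deny directives second, so everyone gets in except the listed addresses, which receive a 403 without ever touching the blog software.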

