AlunFoto wrote:
>Michael,
>
>Assuming that your blog is automatically harvested by webcrawlers and
>then fed to a translation service, you may be able to limit this sort
>of thing by using a robots.txt file. Take a look at this site for how
>the crawlers work:
>
>http://www.robotstxt.org/
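For what it's worth, a minimal robots.txt along those lines might look something like this (the bot name is just an illustration, not a known offender, and the paths are placeholders for your own blog's layout):

```
# Allow well-behaved search engines everywhere
User-agent: Googlebot
Disallow:

# Keep a specific crawler out entirely (example name)
User-agent: SomeTranslationBot
Disallow: /

# Everyone else: stay out of the feeds
User-agent: *
Disallow: /feed/
```

Rules are matched per User-agent block, and an empty Disallow means "nothing is off limits." But as noted below, this only works on crawlers that choose to honor it.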
Of course, obeying the contents of a robots.txt file is all on the honor system: web spiders are free to ignore it if they want. Mainstream players like Google and Yahoo are scrupulous about complying. Others... who knows?

If your blog is hosted on an Apache server, you can do what I do and upload a .htaccess file that blocks specified IP addresses or ranges from even connecting to the server. My .htaccess file is only about 100 lines long, but it keeps 99% of spamming *attempts* off my blog (and Spam Karma keeps the rest from getting posted).

http://www.javascriptkit.com/howto/htaccess.shtml
http://www.htaccesstools.com/

-- 
PDML Pentax-Discuss Mail List
[email protected]
http://pdml.net/mailman/listinfo/pdml_pdml.net
to UNSUBSCRIBE from the PDML, please visit the link directly above and follow the directions.
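As a rough sketch of the .htaccess approach described above, using the classic Apache 2.2 mod_authz_host syntax (the IP addresses here are documentation placeholders, not real spammers; Apache 2.4 uses `Require all granted` / `Require not ip` instead):

```apache
# Block specific spammer addresses at the server level,
# before they can even request a page.
Order Allow,Deny
Allow from all

# A single host
Deny from 192.0.2.15

# A whole range: trailing-dot prefix or CIDR both work
Deny from 198.51.100.
Deny from 203.0.113.0/24
```

With `Order Allow,Deny`, the Allow directives are evaluated first and the Deny directives second, so everyone gets in except the listed addresses, which receive a 403 without ever touching the blog software.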

