VBCoder wrote:

Hi,
Every place I have read about robots.txt rules states that they are supposed
to be case insensitive.

The spec says "A case insensitive substring match of the name without version information is recommended." This is up to the robots, not you. You probably are getting hit by robots that don't do it.
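
As a rough illustration of what that recommendation asks of a robot, here is a minimal Python sketch. The function name record_applies and the robot name "ExampleBot" are made up for the example; this is an assumption about how a well-behaved robot might do the match, not code from any particular crawler.

```python
def record_applies(field_value: str, robot_name: str) -> bool:
    """Decide whether a robots.txt User-agent line applies to this robot.

    field_value: the text after "User-agent:" in robots.txt.
    robot_name:  the name the robot uses for itself, possibly with a
                 "/version" suffix (hypothetical example: "ExampleBot/1.2").
    """
    token = field_value.strip().lower()
    if token == "*":
        # The wildcard record applies to every robot.
        return True
    if not token:
        return False
    # Drop version information, then compare without regard to case.
    name = robot_name.split("/", 1)[0].strip().lower()
    # Substring match: "example" in robots.txt would also match "ExampleBot".
    return token in name


print(record_applies("examplebot", "ExampleBot/1.2"))
print(record_applies("EXAMPLEBOT", "examplebot"))
print(record_applies("otherbot", "ExampleBot/1.2"))
```

A robot that compares the strings exactly instead would ignore an all-lowercase line, which is the behavior described in the question.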


You seem to be suggesting that this is wrong.  I have
added lines that include the exact case of the offender, but this does not
seem to stop them.  The mixed case lines are an experiment, the all lower
case lines should be enough to stop them from what I have read.  Are you
suggesting that robots.txt needs to be case sensitive?

This should only be necessary as a work-around for robots that aren't following the above recommendation. Do you have user-agent names for robots that seem to download and not follow the directive? It would be interesting to see if they're using a third-party library to interpret robots.txt.


The domain name heartnart.com does a redirect to www.coseco.com/heartnart .
I would think that the case should not matter to a search engine as it
doesn't to the web in general.  I mix them so that it can be more easily
read by humans.  Are you suggesting that a search engine would think that
www.coseco.com/heartnart is a different place than www.coseco.com/HeartnArt?
I am more confused than before.

Generally speaking, Windows-based web servers are the only ones that ignore case in URL paths, and they make up a relatively small portion of the servers out there.
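
To make the distinction concrete, here is a small Python sketch of how a crawler might decide whether two URLs name the same place when it cannot know how the server treats case. The helper name same_resource is hypothetical, and the "http://" prefix is added only so the URLs parse; host names are compared without regard to case, while the path is compared exactly.

```python
from urllib.parse import urlsplit


def same_resource(url_a: str, url_b: str) -> bool:
    """Compare two URLs the way a cautious crawler might.

    Scheme and host name are case-insensitive; the path is kept as-is,
    because most servers treat /heartnart and /HeartnArt as different.
    """
    a, b = urlsplit(url_a), urlsplit(url_b)
    # urlsplit lowercases the scheme; .hostname is the lowercased host.
    return (a.scheme, a.hostname, a.path) == (b.scheme, b.hostname, b.path)


# Host name case does not matter.
print(same_resource("http://WWW.Coseco.com/heartnart",
                    "http://www.coseco.com/heartnart"))
# Path case does matter unless the server happens to ignore it.
print(same_resource("http://www.coseco.com/heartnart",
                    "http://www.coseco.com/HeartnArt"))
```

The same reasoning applies to paths listed in robots.txt: a robot will generally compare them exactly, so they should match the case the server actually serves.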


Nick

--
Nick Arnett
Phone/fax: (408) 904-7198
[EMAIL PROTECTED]

