On Sun, 19 Dec 2004, J and T wrote:

> I understand what you're saying and I completely agree with you if I had not 
> read something different at the w3c.org and that Yahoo! indexed the example 
> site below. (please notice Subject "Possible" problem with RobotRules?)
> 
> According to this document:
> 
> http://www.w3.org/TR/1998/REC-html40-19980424/appendix/notes.html#h-B.4.1.1
> 
> B.4.1 Search robots
> The robots.txt file
> 
> It states:
> 
> Some tips: URI's are case-sensitive, and "/robots.txt" string must be all 
> lower-case. Blank lines are not permitted.
> 
> "Blank lines are not permitted." is stated here and I wouldn't have asked 
> this question if the W3C was not the one stating this. I personally believe 
> the W3C is in error, but there are a lot of people who believe the W3C is 
> God here.

The W3C's error is noted in the errata for the old version of 
HTML 4 that you cited, and it's corrected in the latest HTML 4 
Recommendation.

http://www.w3.org/MarkUp/html40-updates/REC-html40-19980424-errata.html

  The specification reads, "Blank lines are not permitted." Blank lines 
  are permitted in the robots.txt file, just not within a single "record". 
  Note that the specification doesn't define record.

http://www.w3.org/TR/html4/appendix/notes.html#h-B.4.1.1

  Some tips: URI's are case-sensitive, and "/robots.txt" string must be 
  all lower-case. Blank lines are not permitted within a single record in the 
  "robots.txt" file.

> Isn't the W3C the authority on this 
> stuff?

Not on robots.txt.  The W3C's section on robots.txt is buried in an 
appendix to the HTML 4 Recommendation and preceded by "The following 
notes are informative, not normative."
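
For what it's worth, the corrected wording quoted above (blank lines 
separate records, but may not appear inside one) can be sketched in a few 
lines of Python. This is only an illustration: the function name 
split_records and the sample file are made up here, and it follows the 
original robots.txt convention that records are separated by blank lines, 
not anything the W3C defines.

```python
def split_records(text):
    """Split robots.txt text into records, using blank lines as separators.

    Hypothetical helper for illustration only. Each record is a list of
    (field, value) pairs with the field name lower-cased.
    """
    records, current = [], []
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = []
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # comment-only line: not a record boundary
        field, _, value = line.partition(":")
        current.append((field.strip().lower(), value.strip()))
    if current:
        records.append(current)
    return records


# Made-up sample: two records separated by one blank line.
SAMPLE = """\
User-agent: a
Disallow: /x/

User-agent: *
Disallow:
"""

records = split_records(SAMPLE)
```

With this sample, the blank line between the two groups yields two 
records. A blank line placed between a User-agent line and its Disallow 
line would instead split them into separate records, which is the case 
the corrected tip warns against.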

-- 
Liam Quinn
