I don't think the explicit names would be required; most robots simply
read the title tag, or infer it from the first portion of clear text,
the content meta tag, or other document attributes.  In any case, this
method would become quite burdensome for very complicated sites, and I
suspect the file would become stale rather quickly.

I do like the Interval attribute, that makes perfect sense to me.
There's a lot we could do with the same basic concept.  For instance, we
could add a touch date to the file to indicate when the site was last
updated, so that even if the interval has passed robots would not need
to scan the site if they had already done so after the touch date.  Keep
in mind that if robot developers surmise that the touch dates are being
artificially manipulated to keep them out, they'll ignore them.
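To make the idea concrete, here is a minimal sketch of the decision a robot might make, assuming hypothetical Interval and touch-date attributes in robots.txt (neither is part of the actual standard):

```python
from datetime import datetime, timedelta

def should_recrawl(last_crawl, touch_date, interval):
    """Decide whether a robot needs to revisit a site.

    Recrawl only when the polling interval has elapsed AND the site
    was touched (updated) after our last visit; if we already crawled
    after the touch date, there is nothing new to fetch.  The Interval
    and touch-date attributes are hypothetical robots.txt extensions
    discussed on this list, not real standard fields.
    """
    interval_elapsed = datetime.now() - last_crawl >= interval
    updated_since_visit = touch_date > last_crawl
    return interval_elapsed and updated_since_visit
```

For example, a site touched on Jan 5 would be recrawled by a robot that last visited Jan 1, but skipped by one that last visited Jan 10, regardless of the interval.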

Anybody else interested in the Session attribute?

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On
Behalf Of Fred Atkinson
Sent: Sunday, January 11, 2004 4:38 PM
To: Robots
Subject: [Robots] Another approach


    Another idea that has occurred to me is to simply code the
information to be indexed in the robots.txt file.  Then, the robot could
simply suck the information out of the file and be done.

Example:

User-agent: Scooter
Interval: 30d
Disallow: /
Name: Fred's Site
Index: /index.html
Name: My Article
Index: /article/index.html
Name: My Article's FAQs
Index: /article/faq.html

    This would tell them to include this information in their search
database and move on.
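    A robot could consume the proposed Name:/Index: records with
something like the sketch below.  This is purely illustrative of the
format suggested above; neither attribute exists in the real robots.txt
standard, and the function name is my own:

```python
def parse_index_records(robots_txt):
    """Extract (name, path) pairs from the hypothetical
    Name:/Index: records proposed in this thread.

    Each Name: line is paired with the Index: line that follows it;
    unrelated lines (User-agent, Disallow, etc.) are ignored.
    """
    records = []
    pending_name = None
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("name:"):
            pending_name = line.split(":", 1)[1].strip()
        elif line.lower().startswith("index:") and pending_name is not None:
            records.append((pending_name, line.split(":", 1)[1].strip()))
            pending_name = None
    return records
```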

    Other ideas?



                                                                Fred

_______________________________________________
Robots mailing list
[EMAIL PROTECTED] http://www.mccmedia.com/mailman/listinfo/robots
