Re: [Robots] Robots.txt Evolution?

2004-01-11 Thread Fred Atkinson
I'm inclined to agree that a second file would probably get overlooked by bots. I would imagine it was difficult trying to get those who run them to respect the first one. I was unaware of the 'Allow' command. Is there a URL that documents it? Also, the use of wildcards when giving

[Robots] This email address is no longer in use.

2004-01-11 Thread paul
This email address is no longer in use. If you need to contact me, please call (07973) 172650 ___ Robots mailing list [EMAIL PROTECTED] http://www.mccmedia.com/mailman/listinfo/robots

Re: [Robots] Robots.txt Evolution?

2004-01-11 Thread Walter Underwood
--On Sunday, January 11, 2004 11:44 AM -0500 Fred Atkinson [EMAIL PROTECTED] wrote:
> I was unaware of the 'Allow' command. Is there a URL that documents it?
The Allow directive is non-standard. Don't use it.
wunder
--
Walter Underwood
Principal Architect
Verity Ultraseek
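[Editor's note: Allow was not part of the original 1994 robots.txt convention, which is Walter's point, but many parsers do implement it, including Python's standard-library robotparser. A minimal sketch of how it behaves there (the rules and bot name are made up for illustration; in Python's parser the first matching rule wins, so the Allow line must precede the broader Disallow):

```python
from urllib import robotparser

# Hypothetical rules for illustration. The Allow line carves one page
# out of an otherwise disallowed directory.
rules = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# First matching rule wins: the Allow line matches the public page,
# everything else under /private/ falls through to the Disallow line.
print(rp.can_fetch("AnyBot", "/private/public-page.html"))  # True
print(rp.can_fetch("AnyBot", "/private/secret.html"))       # False
```

Because matching is order-dependent here, a crawler that ignores Allow entirely still degrades safely: it just treats the whole directory as disallowed.]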

[Robots] Another approach

2004-01-11 Thread Fred Atkinson
Another idea that has occurred to me is to simply code the information to be indexed in the robots.txt file. Then, the robot could simply suck the information out of the file and be done. Example:
User-agent: Scooter
Interval: 30d
Disallow: /
Name: Fred's Site
Index: /index.html
Name: My
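[Editor's note: a minimal sketch of how a robot might consume Fred's proposed extension. Interval, Name, and Index are his non-standard directives; the pairing of each Name with the following Index line is an assumption made here for illustration:

```python
def parse_extended_robots(text):
    """Parse 'Key: value' lines, collecting Name/Index pairs as pages."""
    record = {"pages": []}
    pending_name = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "name":
            pending_name = value          # assumed to label the next Index line
        elif key == "index":
            record["pages"].append({"name": pending_name, "path": value})
        else:
            record[key] = value           # user-agent, interval, disallow, ...
    return record

sample = """\
User-agent: Scooter
Interval: 30d
Disallow: /
Name: Fred's Site
Index: /index.html
"""
print(parse_extended_robots(sample))
```

The robot would then fetch only the listed Index paths at the stated Interval, never crawling the disallowed tree at all.]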

RE: [Robots] Another approach

2004-01-11 Thread Matthew Meadows
I don't think the explicit names would be required; most robots simply read the title tag, or infer it from the first portion of clear text, the content meta tag, or other document attributes. Anyway, this method would become quite burdensome for very complicated sites. I also suspect the file

Re: [Robots] Another approach

2004-01-11 Thread Sean 'Captain Napalm' Conner
It was thus said that the Great Walter Underwood once stated:
> --On Sunday, January 11, 2004 8:13 PM -0500 Sean 'Captain Napalm' Conner [EMAIL PROTECTED] wrote:
>> And there you go. Using the different directives makes it backwards compatible with the original robots.txt (where an older