[Robots] Re: Correct URL, shlash at the end ?

2001-11-23 Thread Andrew Daviel
On Thu, 22 Nov 2001, George Phillips wrote: Yes, they are extremely useful. But they're just rules that take the stuff you used to get the current page and some relative stuff to construct new stuff -- all done by the browser. The web server only understands pure, unadulterated,

[Robots] Re: Correct URL, shlash at the end ?

2001-11-22 Thread Otis Gospodnetic
The above is just for consideration if the robots.txt is ever updated so the robots could be informed of this little detail. There was a push in '96 or '97 to update the robots.txt standard and I wrote a proposal back then (http://www.conman.org/people/spc/robots2.html) and

[Robots] Re: Correct URL, shlash at the end ?

2001-11-22 Thread thomas.kay
Crazy thought... This is where the robots.txt file could be used to hold that information for the robot agents that need to know the operational order of the / default names used on that service. User-agent: * Slash: default.htm, default.html, index.htm, index.html, welcome.html,
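For illustration only: a robot could read such a line roughly as sketched below. The `Slash:` field is the hypothetical extension proposed in this message, not part of the robots.txt standard, and the function name `parse_slash_defaults` is invented here. The sketch assumes a record starts with a single `User-agent:` line and ignores the other rules a real parser would handle.

```python
def parse_slash_defaults(robots_txt, agent="*"):
    """Return the default file names listed on a hypothetical 'Slash:' line.

    'Slash:' is the extension proposed in this thread, not a standard
    robots.txt field. Only records whose User-agent value equals `agent`
    are considered.
    """
    defaults = []
    applies = False
    for raw in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field == "user-agent":
            # Simplification: the most recent User-agent line decides.
            applies = (value == agent)
        elif field == "slash" and applies:
            # Comma-separated names, kept in the order the server tries them.
            defaults.extend(n.strip() for n in value.split(",") if n.strip())
    return defaults
```

A robot could then treat `http://host/dir/` and `http://host/dir/<name>` as candidates for the same document, for each name returned, in order.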

[Robots] Re: Correct URL, shlash at the end ?

2001-11-21 Thread Thomas Witt
You may have more than just two scans on the resource, as URLs such as http://www.abc.de/xyz/index.html will also return the same document. Calculate a checksum for each URL retrieved, and compare for identical checksums. If you find that one page is identical to another, the second can
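A minimal sketch of the checksum comparison described here, assuming the robot already holds each response body in memory. The function name `find_duplicates` and the choice of SHA-256 are illustrative assumptions; the message does not name a particular checksum.

```python
import hashlib


def find_duplicates(pages):
    """Map each duplicate URL to the first URL seen with an identical body.

    `pages` maps URL -> response body (bytes). SHA-256 is an assumed
    choice of checksum; any sufficiently strong digest would serve.
    """
    first_seen = {}   # digest -> first URL that produced it
    duplicates = {}   # later URL -> earlier URL with the same body
    for url, body in pages.items():
        digest = hashlib.sha256(body).hexdigest()
        if digest in first_seen:
            duplicates[url] = first_seen[digest]
        else:
            first_seen[digest] = url
    return duplicates
```

Byte-identical bodies are the only case this catches; pages that differ by a timestamp or session token would need some normalization first.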

[Robots] Re: Correct URL, shlash at the end ?

2001-11-21 Thread Klaus Johannes Rusch
In [EMAIL PROTECTED], Matthias Jaekle [EMAIL PROTECTED] writes: I read about adding a slash at the end of the URLs, if there is no absolute path present. But what about paths ending in subdirectories (xyz). A link to http://www.abc.de/xyz/ might be more correct than the link to
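As a rough sketch of the first rule mentioned (add a slash when the URL has no path at all), using Python's standard `urllib.parse`. The function name `add_root_slash` is hypothetical. The subdirectory case is deliberately left alone, since a client cannot tell from the URL whether `xyz` is a directory; only the server's response (typically a redirect) settles that.

```python
from urllib.parse import urlsplit, urlunsplit


def add_root_slash(url):
    """Add '/' as the path when a URL names a host but has an empty path.

    URLs that already have a path, such as http://www.abc.de/xyz,
    are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)
```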

[Robots] Re: Correct URL, shlash at the end ?

2001-11-21 Thread thomas.kay
I guess it depends on what you are asking to have returned. (And this brings up another robots.txt question, below.) http://www.abc.de/xyz Asking for the directory. (where the service is allowed redirection to a temporary default file list or another default file as a reply if the service