It was thus said that the Great [EMAIL PROTECTED] once stated:
> 
> It would be great to know how to ask that http service for the list of 
> "default or index file" names so the agents could verify what file name was 
> indeed associated with the "/" slash.  We could then put the file name on the 
> URL to completely qualify that URL path.   Anyone? 

  No.  In some pathological cases there *isn't* a file associated with what
looks like a directory.  Case in point---both:

                     http://boston.conman.org/2001/11/10

                                     and

                    http://boston.conman.org/2001/11/10/

  return the same document (or rather, they should---there appears to be a
bug in the code 8-)
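  For a robot, the practical upshot is that the two spellings should be
treated as one resource.  A minimal sketch (my own illustration, not any
standard crawler's code) that canonicalizes away the trailing slash so a
crawler fetches the page only once:

```python
# Hypothetical helper: collapse "/2001/11/10" and "/2001/11/10/" into one
# canonical form so a crawler does not fetch the same document twice.
# Stripping the trailing slash (except for the bare root "/") is just one
# possible policy.
from urllib.parse import urlsplit, urlunsplit

def canonical(url: str) -> str:
    parts = urlsplit(url)
    path = parts.path
    # Keep the root path "/" intact; otherwise drop trailing slashes.
    if len(path) > 1 and path.endswith('/'):
        path = path.rstrip('/')
    return urlunsplit((parts.scheme, parts.netloc, path,
                       parts.query, parts.fragment))
```

  Whether that policy is safe depends on the server, of course---some
sites really do serve different documents for the two forms.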

  A URL like

                  http://boston.conman.org/2001/index.html

  is invalid and does not exist.  In fact, the ``directory'' 2001/ does not
exist either.  Once you get past the hostname, any page of the form
[0-9]+/[0-9][0-9]/[0-9][0-9] is more or less a database query (it's a bit
more convoluted than that, but that's not important right now).  So for
this portion of the webspace, the notion of a default index file simply
doesn't apply.
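  The idea can be sketched like this (purely illustrative---the real code
is, as noted, more convoluted): date-shaped paths go straight to a database
query, never to the filesystem.

```python
# Hypothetical router: paths shaped like YEAR/MM/DD are dispatched as
# database queries; nothing on disk corresponds to them, so there is no
# index file for a robot to discover.
import re

DATE_PATH = re.compile(r'^/([0-9]+)/([0-9]{2})/([0-9]{2})/?$')

def route(path: str):
    m = DATE_PATH.match(path)
    if m:
        year, month, day = m.groups()
        return ('query', int(year), int(month), int(day))
    # Anything else would be handled some other way (also illustrative).
    return ('file', path)
```

  Note that the trailing-slash and slashless forms both match, which is
why they return the same document.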

> Crazy thought...
> 
> This is where the robots.txt file could be used to hold that information for 
> the robot agents that need to know the operational order of the "/" defaults 
> names used on that service.
> 
> User-agent: *
> Slash: default.htm, default.html, index.htm, index.html, welcome.html, 
> sitemap.html
> 
>  The above is just for consideration if the robots.txt is ever updated so the 
> robots could be informed of this little detail.   
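  For what it's worth, parsing such a field would be trivial if it ever
existed.  A sketch (the "Slash:" field is the poster's proposal, not part
of any robots.txt standard):

```python
# Hypothetical parser for the proposed (never standardized) "Slash:"
# field: collect the comma-separated default-document names from a
# robots.txt record.
def parse_slash_field(robots_txt: str):
    names = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(':')
        if field.strip().lower() == 'slash':
            names.extend(n.strip() for n in value.split(',') if n.strip())
    return names
```

  The hard part isn't the parsing; it's getting any robot to honor it,
which brings me to the earlier attempt below.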

  There was a push in '96 or '97 to update the robots.txt standard, and I
wrote a proposal back then (http://www.conman.org/people/spc/robots2.html).
While I still get the occasional email about it, to my knowledge no robot
has implemented it (some portions perhaps, but not everything).  I only
mention this because it has been attempted before.

  -spc (For an interesting discussion, tell me how a robot should handle
        a site like http://bible.conman.org/ or http://boston.conman.org
        where you really have multiple views into ostensibly a single
        document)


--
This message was sent by the Internet robots and spiders discussion list 
([EMAIL PROTECTED]).  For list server commands, send "help" in the body of a message 
to "[EMAIL PROTECTED]".
