According to Duncan Brannen:
> A question about server aliases.  We have 2 domain names, as it were:
> 
> st-and.ac.uk and st-andrews.ac.uk.  Since I don't want two copies of
> everything, I've set up a server alias
> 
> server_aliases: www.st-and.ac.uk:80=www.st-andrews.ac.uk:80
> 
> but I also have
> 
> limit_urls_to:  http://www.st-andrews.ac.uk/
> 
> and I have some URLs not being indexed.  Is it the case that limit_urls_to
> takes place before the server alias converts it, and st-and.ac.uk URLs are
> rejected?

Yes, that's correct.

> Looking at the docs for limit_urls_to
> "The match will be performed after the relative references have been
> converted to a valid URL"
> 
> Does this not include server aliases?

No.  There are a few different stages in URL processing.  The first
is to make relative URLs into absolute ones, and clean up the path
component.  Then, the URL is checked for validity, against limit_urls_to,
exclude_urls, bad_extensions, valid_extensions, and bad_querystr.
After that, the URL is normalized, and this is where server_aliases
are processed, as well as remove_default_doc and allow_virtual_hosts.
The URL is then checked against limit_normalized.  So, you could put the
canonical URL in limit_normalized, or all valid aliases in limit_urls_to,
as you have done.
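
In config terms, the two alternatives could look roughly like the sketch
below.  Only the server_aliases line and the first limit_urls_to pattern
come from your message.  The looser pattern in the second variant is an
assumption on my part: it relies on limit_urls_to doing substring matching,
so that "http://www.st-and" lets both host names through the first check.

```
# Variant 1: list every alias in limit_urls_to, so the alias form
# passes the validity check before server_aliases rewrites it.
server_aliases: www.st-and.ac.uk:80=www.st-andrews.ac.uk:80
limit_urls_to:  http://www.st-andrews.ac.uk/ http://www.st-and.ac.uk/

# Variant 2 (assumed pattern): let both names through the early check,
# then restrict on the canonical name once the URL has been normalized.
server_aliases:   www.st-and.ac.uk:80=www.st-andrews.ac.uk:80
limit_urls_to:    http://www.st-and
limit_normalized: http://www.st-andrews.ac.uk/
```

Use one variant or the other, not both in the same file.  In either case
limit_urls_to still has to accept the alias form, because that check runs
before the alias is rewritten.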

The limit_normalized attribute was added to handle cases where there are
too many aliases to list them all, and allow_virtual_hosts is false, so
all aliases are automatically canonicalized to the first name encountered
for a given IP address.

> Probably answering my own questions here, since adding st-and.ac.uk to
> limit_urls_to seems to fix it, but it might help someone else in the
> future :)

I guess it's always good to get independent confirmation of your
observations too, and to know things are working as they should.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
