On Tue, 6 Mar 2001, Malcolm Austen wrote:
+ For my one-hop scenario I would (probably imprecisely) specify it as ...
+
+ A page that, by the present ruleset, would be rejected will be
+ accepted and processed as if it had an <index,nofollow> header.
I thought the logic through on the bus home ... here is a better
attempt at a specification.
Present sequence (presumably, I have not read the code 8-):
- Find a URL in a link
- check it against 'limit_urls_to'
  - if it fails
    - reject it
- check it against 'exclude_urls'
...
I would suggest instead:
- Find a URL in a link
- check it against 'limit_urls_to'
  - if it fails
    - if 'take_one_step_outside' option is off
      - reject it
    - if 'take_one_step_outside' option is on
      - mark the page to be processed as <index,nofollow>
- check it against 'exclude_urls'
...
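
To make the suggested sequence concrete, here is a rough sketch in Python.
It is not taken from the ht://Dig source (which I have not read). The
function name, the Decision type and the plain substring matching are my
own assumptions for illustration, and 'take_one_step_outside' is only the
proposed option, not an existing one. The further checks hidden behind the
"..." are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decision:
    """Outcome for one URL found in a link (hypothetical type, illustration only)."""
    index: bool   # fetch and index the page?
    follow: bool  # follow the links found on that page?


def classify_url(
    url: str,
    limit_urls_to: List[str],
    exclude_urls: List[str],
    take_one_step_outside: bool = False,
) -> Optional[Decision]:
    """Return a Decision for the URL, or None if the URL is rejected.

    Assumption: both lists are treated as plain substrings of the URL.
    """
    # Step 1: check the URL against 'limit_urls_to'.
    inside = any(pattern in url for pattern in limit_urls_to)

    if not inside and not take_one_step_outside:
        # Option off: same as the present sequence, reject it.
        return None

    # Step 2: check it against 'exclude_urls', inside the limits or not.
    if any(pattern in url for pattern in exclude_urls):
        return None

    # Inside the limits: index and follow as usual.
    # Outside the limits with the option on: treat as <index,nofollow>.
    return Decision(index=True, follow=inside)
```

With the option on, a page outside 'limit_urls_to' is indexed but its
links are never queued, which is what keeps the excursion to one hop.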
regards,
Malcolm.
[EMAIL PROTECTED] http://users.ox.ac.uk/~malcolm/