On Tue, 6 Mar 2001, Malcolm Austen wrote:

+ For my one-hop scenario I would (probably imprecisely) specify it as ...
+ 
+       A page that, by the present ruleset, would be rejected will be
+       accepted and processed as if it had an <index,nofollow> header.

I tried to think through the logic on the bus home ... here is a better
try at a specification.

Present sequence (presumably; I have not read the code 8-):

 - Find a URL in a link
   - check it against 'limit_urls_to'
     - if it fails
       - reject it
     - check it against 'exclude_urls'
       ...

I would suggest instead:

 - Find a URL in a link
   - check it against 'limit_urls_to'
     - if it fails
       - if 'take_one_step_outside' option is off
         - reject it
       - if 'take_one_step_outside' option is on
         - mark the page to be processed as <index,nofollow>
     - check it against 'exclude_urls'
       ...
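
For illustration only, the suggested sequence could be sketched roughly as follows. This is not taken from the ht://Dig source: the function and type names are hypothetical, 'take_one_step_outside' is only the option proposed above, and plain substring matching against 'limit_urls_to' and 'exclude_urls' is an assumption made to keep the sketch short. The checks elided by "..." are left out.

```python
from enum import Enum


class Decision(Enum):
    """Hypothetical outcomes for a URL found in a link."""
    REJECT = "reject"
    INDEX_FOLLOW = "index,follow"
    INDEX_NOFOLLOW = "index,nofollow"


def matches_any(url, patterns):
    # Assumption: a URL matches when it contains any pattern as a substring.
    return any(pattern in url for pattern in patterns)


def classify_link(url, limit_urls_to, exclude_urls, take_one_step_outside=False):
    """Decide what to do with a URL found in a link (sketch only)."""
    nofollow = False

    # Check against 'limit_urls_to'.
    if not matches_any(url, limit_urls_to):
        if not take_one_step_outside:
            # Option off: reject, as in the present sequence.
            return Decision.REJECT
        # Option on: keep the page but do not follow its links.
        nofollow = True

    # Check against 'exclude_urls'; this applies to one-hop pages too.
    if matches_any(url, exclude_urls):
        return Decision.REJECT

    # Any further checks (the "..." above) are not modelled here.
    return Decision.INDEX_NOFOLLOW if nofollow else Decision.INDEX_FOLLOW


if __name__ == "__main__":
    limit = ["http://www.example.org/"]
    exclude = ["/cgi-bin/"]
    for candidate in ("http://www.example.org/a.html",
                      "http://other.example.com/b.html",
                      "http://other.example.com/cgi-bin/x"):
        print(candidate, classify_link(candidate, limit, exclude, True).value)
```

With the option off, the sketch behaves like the present sequence: a URL that fails 'limit_urls_to' is rejected outright. With it on, such a URL is still subject to 'exclude_urls', and if it survives it is indexed but its links are not followed.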

regards,
        Malcolm.

 [EMAIL PROTECTED]     http://users.ox.ac.uk/~malcolm/


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
