According to Soriana Villanueva:
> Greetings from a newbie...
> 
> Is there such an attribute as current_url?  I am indexing a site that has
> both English and French pages but only want to index the English pages.  The
> pages are organized as follows:  The French pages have an "f" at the end of
> their file names.
> 
> http://www.domain.org/directory/index.html
> http://www.domain.org/directory/indexf.html
> 
> http://www.domain.org/directory/sample.html
> http://www.domain.org/directory/samplef.html
> 
> http://www.domain.org/directory/leaf.html
> http://www.domain.org/directory/leaff.html
> 
> I thought of using "exclude_urls: f.html"  but this would exclude the
> English page named leaf.html.  This is why I was thinking of something like
> "exclude_urls: $(current_url)f.html"  Is this possible or perhaps there's an
> even better solution to this?

My suggestion would be the same as Joe's.  There's no easy way I can think
of to do this otherwise.

Even if there were a current_url attribute set internally for each URL
that's parsed, this still wouldn't do the trick for two reasons:

1) Right now, exclude_urls is parsed once at the start of the run, to
build the list of patterns against which all URLs are checked.  At that
point, current_url would have no value.  Reparsing exclude_urls for each
URL, and rebuilding the set of patterns each time, would be terribly
inefficient.

2) Even if we did reparse exclude_urls for each URL, and set current_url
to the URL being parsed, this still wouldn't do what you want, because
the resulting exclude_urls pattern would apply only to links found in the
current document.  So, a link to leaff.html found in leaf.html would be
excluded, but a link to leaff.html found in indexf.html would not be.
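
To make the two cases concrete, here is a rough Python sketch.  It is only
a model, not ht://Dig code: it assumes exclusion patterns are matched as
plain substrings of the URL, and the helper names (is_excluded,
per_document_pattern, links_kept) are made up for illustration.  It also
assumes the imagined "$(current_url)f.html" would be expanded from the
current URL with its ".html" extension stripped, which is a guess at what
you intended.

```python
# Hypothetical model only: current_url is not a real attribute, and none of
# these functions exist in ht://Dig. Patterns are treated as substrings.

def is_excluded(url, patterns):
    """Return True if any exclusion pattern occurs as a substring of url."""
    return any(p in url for p in patterns)

def per_document_pattern(current_url):
    """Expand the imagined "$(current_url)f.html" for one document,
    assuming the ".html" extension is stripped from current_url first."""
    stem = current_url[:-len(".html")]
    return stem + "f.html"

def links_kept(current_url, links):
    """Links from one document that survive a pattern rebuilt for it."""
    pattern = per_document_pattern(current_url)
    return [link for link in links if not is_excluded(link, [pattern])]

base = "http://www.domain.org/directory/"

# A static pattern, built once at start-up, also rejects the English page.
print(is_excluded(base + "leaf.html", ["f.html"]))        # True

# Per-document pattern: the French page is rejected when linked from its
# English counterpart...
print(links_kept(base + "leaf.html", [base + "leaff.html"]))    # []

# ...but not when the same link turns up in some other document.
print(links_kept(base + "indexf.html", [base + "leaff.html"]))
```

The last call keeps leaff.html, because the pattern rebuilt for
indexf.html has nothing to do with leaf.html.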

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

