According to Soriana Villanueva:
> Greetings from a newbie...
>
> Is there such an attribute as current_url? I am indexing a site that has
> both English and French pages but only want to index the English pages. The
> pages are organized as follows: The French pages have an "f" at the end of
> their file names.
>
> http://www.domain.org/directory/index.html
> http://www.domain.org/directory/indexf.html
>
> http://www.domain.org/directory/sample.html
> http://www.domain.org/directory/samplef.html
>
> http://www.domain.org/directory/leaf.html
> http://www.domain.org/directory/leaff.html
>
> I thought of using "exclude_urls: f.html" but this would exclude the
> English page named leaf.html. This is why I was thinking of something like
> "exclude_urls: $(current_url)f.html" Is this possible or perhaps there's an
> even better solution to this?

My suggestion would be the same as Joe's. There's no easy way I can think
of to do this otherwise. Even if there were a current_url attribute set
internally for each URL that's parsed, this still wouldn't do the trick,
for two reasons:

1) Right now, exclude_urls is parsed once at the start of the run, to build
a list of patterns against which all URLs are checked. At that point,
current_url would have no value. Reparsing exclude_urls for each URL, and
rebuilding the set of patterns each time, would be terribly inefficient.

2) Even if we did reparse exclude_urls for each URL, and set current_url to
the URL being parsed, this still wouldn't do what you want, because the
current exclude_urls pattern would only apply to links found in the current
document. So, if you found a link to leaff.html in leaf.html, it would be
excluded, but if you found a link to leaff.html in indexf.html, it would
not be excluded.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
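
To make the two points above concrete, here is a rough Python model of the behaviour described. It is only a sketch, not ht://Dig code: the function names are invented for illustration, exclude_urls is modelled as a plain substring test against a pattern list built once, and the expansion of the hypothetical $(current_url) (the referring document's URL with ".html" stripped) is purely an assumption, since no such attribute exists.

```python
# Illustrative model only; none of these names come from ht://Dig itself.

BASE = "http://www.domain.org/directory/"
URLS = [BASE + name for name in (
    "index.html", "indexf.html",
    "sample.html", "samplef.html",
    "leaf.html", "leaff.html",
)]


def build_exclude_patterns(value):
    """Point 1: the attribute value is split into patterns once, up front."""
    return value.split()


def is_excluded(url, patterns):
    """A URL is rejected if it contains any pattern as a substring."""
    return any(pattern in url for pattern in patterns)


# "exclude_urls: f.html" rejects the French pages but also the English
# page leaf.html, because "leaf.html" happens to end in "f.html".
patterns = build_exclude_patterns("f.html")
rejected = [url for url in URLS if is_excluded(url, patterns)]


def hypothetical_pattern(current_url):
    """ASSUMPTION: $(current_url) expands to the referring document's URL
    without its ".html" suffix, so the pattern names that page's French twin."""
    return current_url[: -len(".html")] + "f.html"


def link_excluded(referrer, link):
    """Point 2: a per-document pattern only covers links found in that document."""
    return hypothetical_pattern(referrer) in link


for url in rejected:
    print("rejected:", url)
print(link_excluded(BASE + "leaf.html", BASE + "leaff.html"))
print(link_excluded(BASE + "indexf.html", BASE + "leaff.html"))
```

In this model the same link to leaff.html is rejected when it is found in leaf.html but accepted when it is found in indexf.html, which is the inconsistency described in point 2.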

