According to M. Schulz:
> o.k., i build an index with ht://dig for e.g. 2 sites:
>
> http://www.abc.org
> and
> http://www.def.org
>
> At the second site there's an url e.g.
>
> http://www.def.org/test/index.html
>
> Question: Is it possible to remove exactly only that
> url from the index?

That depends. If you want to remove a URL from the index without reindexing the site, there currently isn't an easy way to do it. A kludgy way, in 3.1.x, would be to find out the document ID for this URL, and then insert a record into db.wordlist telling htmerge to delete it. E.g., if its DocID is 123, then add the record:

-123

to db.wordlist, and rerun htmerge.

On the other hand, if you want to exclude this URL from future reindexing runs, you should add it to exclude_urls. However, be aware that the name "index.html" is usually stripped off of URLs, and I believe this is done before checking against exclude_urls. If you put that URL in exclude_urls without the index.html part, it would tell htdig to exclude everything under the test/ subdirectory, which may not be what you want. You may have better luck with meta tags right in the document. See

http://www.htdig.org/attrs.html#exclude_urls
http://www.htdig.org/attrs.html#remove_default_doc
and
http://www.htdig.org/FAQ.html#q4.15

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
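
The exclude_urls pitfall described above can be illustrated with a small toy model. This is not htdig's code. It assumes, as the reply suggests, that a trailing "index.html" is stripped before the URL is compared against exclude_urls, and that each exclude_urls entry is matched as a plain substring. The function names `normalize` and `is_excluded` are made up for this sketch.

```python
def normalize(url, default_docs=("index.html",)):
    """Strip a trailing default document name (toy stand-in for remove_default_doc)."""
    for doc in default_docs:
        if url.endswith("/" + doc):
            return url[: -len(doc)]
    return url


def is_excluded(url, exclude_patterns):
    """Reject the URL if any pattern is a substring of the normalized URL."""
    normalized = normalize(url)
    return any(pattern in normalized for pattern in exclude_patterns)


# Pattern written with index.html: it never matches, because the
# document's URL has already been shortened to .../test/
with_index = ["http://www.def.org/test/index.html"]
# Pattern written without index.html: it matches the document,
# but also every other URL under test/
without_index = ["http://www.def.org/test/"]

print(is_excluded("http://www.def.org/test/index.html", with_index))     # False
print(is_excluded("http://www.def.org/test/index.html", without_index))  # True
print(is_excluded("http://www.def.org/test/other.html", without_index))  # True
```

Under these assumptions neither spelling of the pattern excludes exactly one document, which is why a meta tag in the document itself may be the more precise tool.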

