I've been thinking a little more about this problem, and since it seems to consist of two parts, I wonder if it can be solved by splitting the dig into two parts, and then merging the databases.
If you use:

    limit_urls_to: DO_TOPIC \
                   DO_ROOT \
                   DO_COMMUNITY

in one config, then my understanding of your problem is that the only 'GOOD' URL that you will exclude is http://example.org/index.html

If you then have:

    limit_urls_to: ${start_url}
    Max_docs: 1

(or something similar) in a second config, then you should be able to get the missing document into a second database and merge it into the first.

The only problem that I can see is that on many systems you may not get a good index this way, since the obvious start point is not accessible in the main dig. This may be overcome by feeding a URL list generated by the 'short dig' (config 2) into the 'full dig' (config 1).

Mike

> On Mon, 10 Jan 2005, Dan Langille wrote:
>
> > How can I use that on limit_urls_to? I've been trying this:
> >
> > limit_urls_to: ${start_url}*DO_TOPIC|DO_ROOT|DO_COMMUNITY*
> >
> > There are additional restrictions, but once I get a starting point, I
> > think it'll all fall into place.
> >
> > A few examples of what we want to do:
> >
> > http://example.org/index.html OK
> > http://example.org/index.html?ID=4 BAD
> > http://example.org/index.html?ID=4&DO_TOPIC OK
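For what it's worth, here is a rough Python sketch (not ht://Dig code) of how I would expect the two limit_urls_to settings to treat the three example URLs. It assumes limit_urls_to is a plain substring test in which a URL passes if it contains at least one of the patterns, and it ignores any case handling. The function name `allowed` and the variable names are made up for illustration.

```python
# Sketch only: models limit_urls_to as "URL contains at least one pattern".
# The matching rule and all names here are assumptions, not ht://Dig internals.

def allowed(url, patterns):
    """Return True if any pattern occurs as a substring of url."""
    return any(p in url for p in patterns)

START_URL = "http://example.org/index.html"

URLS = [
    "http://example.org/index.html",                # wanted (OK)
    "http://example.org/index.html?ID=4",           # not wanted (BAD)
    "http://example.org/index.html?ID=4&DO_TOPIC",  # wanted (OK)
]

FULL_DIG = ["DO_TOPIC", "DO_ROOT", "DO_COMMUNITY"]  # config 1 patterns
SHORT_DIG = [START_URL]                             # config 2 pattern

# URLs each config would accept under the substring assumption.
full = [u for u in URLS if allowed(u, FULL_DIG)]
short = [u for u in URLS if allowed(u, SHORT_DIG)]

for u in URLS:
    in_full = "full" if u in full else "-"
    in_short = "short" if u in short else "-"
    print(u, in_full, in_short)
```

Under that assumption the full dig accepts only the DO_TOPIC URL, while ${start_url} on its own matches all three URLs, which is why the short dig also needs to be capped at a single document (or something similar) so that it picks up only the start page.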
_______________________________________________
ht://Dig general mailing list: <htdig-general@lists.sourceforge.net>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general