On 12 Jan 2005 at 0:11, Jim wrote:

> On Mon, 10 Jan 2005, Dan Langille wrote:
>
> > Each URL must contain one of the following (actually, there are more
> > values in this list, but they have been eliminated to simplify things):
> >
> >   DO_TOPIC
> >   DO_ROOT
> >   DO_COMMUNITY
> >
> > How can I use that on limit_urls_to? I've been trying this:
> >
> >   limit_urls_to: ${start_url}*DO_TOPIC|DO_ROOT|DO_COMMUNITY*
> >
> > There are additional restrictions, but once I get a starting point, I
> > think it'll all fall into place.
> >
> > A few examples of what we want to do:
> >
> >   http://example.org/index.html                 OK
> >   http://example.org/index.html?ID=4            BAD
> >   http://example.org/index.html?ID=4&DO_TOPIC   OK
>
> I don't think that you are going to be able to do what you want with
> limit_urls_to. The attribute contains a list of patterns, one of which
> must be matched. Once you add a pattern that satisfies the first URL
> above, the other two are also satisfied since they contain the first.
>
> I am not sure how you would completely solve this type of problem short of
> somehow using the external parser/converter mechanism as a filter.
> Depending on specifics, you might be able to handle some restrictions
> through the bad_querystr attribute, but that would not be sufficient for
> the example above. There are also restrict and exclude attributes, but
> those are applied at search time. The only other thing I can think of is
> perhaps using url_rewrite_rules to rewrite URLs that you don't want to
> something that limit_normalized then drops (never tried this and
> don't even know if it is actually feasible).

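Jim's point about "one of which must be matched" can be sketched in a few lines of Python. This is only a rough model: it assumes plain substring tests as a stand-in for ht://Dig's own pattern matching, and the helper name url_allowed is made up for illustration.

```python
# Rough model of "a URL passes if any one pattern matches".
# Assumption: plain substring tests stand in for ht://Dig's matching.

def url_allowed(url, patterns):
    """Return True if at least one pattern occurs in the URL."""
    return any(p in url for p in patterns)

# The three example URLs from the original question.
urls = [
    "http://example.org/index.html",                # wanted: OK
    "http://example.org/index.html?ID=4",           # wanted: BAD
    "http://example.org/index.html?ID=4&DO_TOPIC",  # wanted: OK
]

# A pattern broad enough to accept the first URL...
patterns = ["http://example.org/index.html"]

# ...also accepts the other two, because each of them contains it.
results = [url_allowed(u, patterns) for u in urls]
print(results)
```

Under this model the second URL cannot be rejected by adding more patterns, since extra patterns can only let more URLs through.
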
Here is what I stumbled across:

  limit_urls_to: [\&ID2=(DO\_TOPIC|DO\_ROOT|DO\_COMMUNITY|DO\_DISCUSSIONPOST\_LIST)] \
                 file_download.php

That seems to do what I need. It's a start.

-- 
Dan Langille : http://www.langille.org/
BSDCan - The Technical BSD Conference - http://www.bsdcan.org/

_______________________________________________
ht://Dig general mailing list: <htdig-general@lists.sourceforge.net>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
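
For anyone wanting to check which URLs the limit_urls_to line above would let through, here is a small Python sketch. It rests on several assumptions: that the bracketed entry is treated as a regular expression, that Python's re module behaves closely enough to ht://Dig's regex engine for this pattern, and that file_download.php acts as a plain substring pattern. The sample URLs and the name url_allowed are hypothetical.

```python
import re

# Assumption: the bracketed config entry is a regular expression. The
# backslashes before '&' and '_' are dropped, since neither character
# is special in Python's re syntax.
ID2_PATTERN = re.compile(
    r"&ID2=(DO_TOPIC|DO_ROOT|DO_COMMUNITY|DO_DISCUSSIONPOST_LIST)"
)

# Assumption: the second entry on the config line is a literal substring.
LITERAL_PATTERN = "file_download.php"

def url_allowed(url):
    """Accept a URL if either pattern from the config line matches it."""
    return ID2_PATTERN.search(url) is not None or LITERAL_PATTERN in url

# Hypothetical URLs, not taken from the real site.
samples = {
    "http://example.org/index.html?ID=4&ID2=DO_TOPIC": True,
    "http://example.org/index.html?ID=4&ID2=DO_OTHER": False,
    "http://example.org/index.html?ID=4": False,
    "http://example.org/file_download.php?ID=7": True,
}

for url, expected in samples.items():
    print(url, url_allowed(url), expected)
```

The regex is not anchored at the end, so a value that merely starts with one of the listed names (for example DO_ROOT followed by extra characters) would also pass in this model.
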