On 12 Jan 2005 at 0:11, Jim wrote:

> On Mon, 10 Jan 2005, Dan Langille wrote:
> 
> > Each URL must contain one of the following (actually, there are more
> > values in this list, but they have been eliminated to simplify things):
> >
> >  DO_TOPIC
> >  DO_ROOT
> >  DO_COMMUNITY
> >
> > How can I use that on limit_urls_to?  I've been trying this:
> >
> > limit_urls_to:  ${start_url}*DO_TOPIC|DO_ROOT|DO_COMMUNITY*
> >
> > There are additional restrictions, but once I get a starting point, I
> > think it'll all fall into place.
> >
> > A few examples of what we want to do:
> >
> >  http://example.org/index.html OK
> >  http://example.org/index.html?ID=4  BAD
> >  http://example.org/index.html?ID=4&DO_TOPIC OK
> 
> I don't think that you are going to be able to do what you want with
> limit_urls_to. The attribute contains a list of patterns, one of which
> must be matched. Once you add a pattern that satisfies the first URL
> above, the other two are also satisfied since they contain the first.
> 
> I am not sure how you would completely solve this type of problem short of
> somehow using the external parser/converter mechanism as a filter.
> Depending on specifics, you might be able to handle some restrictions
> through the bad_querystr attribute, but that would not be sufficient for
> the example above. There are also restrict and exclude attributes, but
> those are applied at search time. The only other thing I can think of is
> perhaps using url_rewrite_rules to rewrite URLs that you don't want to
> something that limit_normalized then drops (never tried this and
> don't even know if it is actually feasible).
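
The point about limit_urls_to can be illustrated with a small Python sketch. It assumes, as the reply above describes, that a plain pattern is accepted whenever it appears anywhere in the URL; passes_limit is a hypothetical helper for illustration, not part of ht://Dig.

```python
# Simplified model (an assumption, not ht://Dig's actual code): a URL is
# accepted when it contains at least one of the configured patterns.
def passes_limit(url, patterns):
    return any(pattern in url for pattern in patterns)

# The three example URLs from the original question.
urls = [
    "http://example.org/index.html",
    "http://example.org/index.html?ID=4",
    "http://example.org/index.html?ID=4&DO_TOPIC",
]

# A pattern that admits the first URL...
patterns = ["http://example.org/index.html"]

# ...also admits the other two, because each contains it as a substring.
results = {url: passes_limit(url, patterns) for url in urls}
for url, accepted in results.items():
    print(f"{url} -> {'accepted' if accepted else 'rejected'}")
```

Under this model the second URL, which should be rejected, cannot be told apart from the first by a plain pattern alone.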

Here is what I stumbled across: 

limit_urls_to: \
    [\&ID2=(DO\_TOPIC|DO\_ROOT|DO\_COMMUNITY|DO\_DISCUSSIONPOST\_LIST)] \
    file_download.php

That seems to do what I need. It's a start.
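
A rough way to sanity-check the pattern outside of ht://Dig is to try it against a few URLs in Python. This is only a sketch: it assumes the bracketed entry is treated as a regular expression, that file_download.php is a plain substring pattern, and that a URL passes if either one matches. The sample URLs are made up, and Python's re module is not the regex engine ht://Dig uses, although for a pattern this simple the two should agree.

```python
import re

# The bracketed regex from the config, without the backslashes in front of
# "&" and "_", which are not needed in Python.
ID2_PATTERN = re.compile(
    r"&ID2=(DO_TOPIC|DO_ROOT|DO_COMMUNITY|DO_DISCUSSIONPOST_LIST)"
)
# The second entry in the config, assumed to be a plain substring pattern.
LITERAL_PATTERN = "file_download.php"

def passes_limit(url):
    """Accept the URL if the regex or the literal pattern matches."""
    return bool(ID2_PATTERN.search(url)) or LITERAL_PATTERN in url

# Hypothetical URLs, for illustration only.
samples = [
    "http://example.org/index.php?ID=4&ID2=DO_TOPIC",
    "http://example.org/index.php?ID=4&ID2=DO_OTHER",
    "http://example.org/index.php?ID=4",
    "http://example.org/file_download.php?ID=7",
]

results = {url: passes_limit(url) for url in samples}
for url, accepted in results.items():
    print(f"{url} -> {'accepted' if accepted else 'rejected'}")
```

In this model only the first and last sample URLs are accepted; the ones without a listed ID2 value are dropped.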
-- 
Dan Langille : http://www.langille.org/
BSDCan - The Technical BSD Conference - http://www.bsdcan.org/



_______________________________________________
ht://Dig general mailing list: <htdig-general@lists.sourceforge.net>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
