Re: AW: AW: AW: [htdig] add parameter to the url while indexing

Gilles Detillieux Thu, 25 Apr 2002 07:01:50 -0700

According to Thieme, Winfried:
> It works fine, thank you!!!
> But what does the \$ at the end of the first regular expression mean?
> 
> -> (.*\\?.*)&param=value\$ \\1 \


The \$ gets the $ past the first stage of attribute parsing (the variable
substitution phase), so the regular expression ends up with a plain $
at the end.  A $ at the end of an expression matches the end of the
string being tested, so the whole expression removes "&param=value"
only when it occurs at the end of the URL (i.e. where a previous
iteration of the second expression would have added it).
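
To see the effect outside of htdig, here is a rough sketch in Python.  It
is only an approximation: it uses Python's re module rather than htdig's
own regex code, it assumes the rules are applied one after another in the
order listed, and the names RULES and rewrite are made up for the
illustration.  The patterns are written the way the regex engine would see
them after attribute parsing.

```python
import re

# Hypothetical stand-ins for the two url_rewrite_rules, written as the
# regex engine would see them once attribute parsing has turned
# \$ into $, \\? into \? and \\1 into \1.
RULES = [
    # Rule 1: strip "&param=value" only when it sits at the end of the URL.
    (re.compile(r"(.*\?.*)&param=value$"), r"\1"),
    # Rule 2: append "&param=value" to any URL that has a query string.
    # \g<0> is Python's spelling of "the whole match" (\\0 in the rule).
    (re.compile(r"(.*)\?(.*)"), r"\g<0>&param=value"),
]


def rewrite(url):
    """Apply each rule in turn (assumed order: as listed in the config)."""
    for pattern, replacement in RULES:
        url = pattern.sub(replacement, url, count=1)
    return url


once = rewrite("abc.com/test?test=something")
twice = rewrite(once)
print(once)
print(twice)
```

Under those assumptions, feeding an already rewritten URL back through the
rules gives the same URL again, because the first rule strips the trailing
parameter before the second one puts it back.  A URL with no "?" in it is
left alone by both rules.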

...
> > According to Thieme, Winfried:
> > > I already tried the url_rewrite_rules, but I got a strange behavior.
> > > 
> > > E.g. the rule (.*)\\?(.*) \\0\\&param=value should append
> > > my parameter to every url with an already existing parameter.
> > > But the spider indexes infinitely in a recursive manner:
> > > 
> > >   url: abc.com/test?test=something
> > >   
> > >   -> abc.com/test?test=something&param=value
> > >       -> abc.com/test?test=something&param=value&param=value
> > >   -> abc.com/test?test=something&param=value&param=value&param=value
> > >   -> ...
> > 
> > It seems that somehow the same URL is being fed back into the queue,
> > and so the rewriting keeps adding another parameter to the same URL,
> > making it a different URL.  You might be able to add another rule to
> > get rid of the parameter before it's added back on, so it's never
> > added more than once.
> > E.g.:
> > 
> > url_rewrite_rules:  (.*\\?.*)&param=value\$ \\1 \
> >                     (.*)\\?(.*) \\0\\&param=value


-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
