/sth/../sth/../sth/ also works. Thanks for the quick response.

Why is this filter necessary? The comment says it is there to break loops.

Could someone please tell me what can go wrong if I choose to remove this filter?

On 9/7/07, Damian Florczyk <[EMAIL PROTECTED]> wrote:
> "Smith Norton" <[EMAIL PROTECTED]> wrote:
>
> > This is a very basic question and unfortunately I am not able to
> > figure this out.
> >
> > In the regex-urlfilter.txt, I find this line present:-
> >
> > # skip URLs with slash-delimited segment that repeats 3+ times, to break loops
> > -.*(/.+?)/.*?\1/.*?\1/
> >
> > What type of URLs does it block? What does 'segment' mean here? Could
> > someone please provide an example of a URL that this particular regex
> > will select and prevent from being crawled.
> For example:
>
> /sth/../sth/../sth/
>
> --
> Damian Florczyk aka thunder
> Gentoo Developer, Gentoo/NetBSD Development Lead
>
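
For anyone who wants to try the rule outside the crawler, here is a minimal sketch in Python. The pattern is copied from the quoted regex-urlfilter.txt line with the leading "-" removed (the "-" only marks the rule as an exclusion). The helper name is_blocked and the example.com URLs are made up for illustration, and the sketch assumes Python's re module treats this pattern the same way the crawler's regex engine does.

```python
import re

# Pattern from the quoted rule, without the leading "-" exclusion marker.
# (/.+?) captures a slash-delimited segment such as "/sth"; each \1
# requires that same segment to appear again later in the URL.
LOOP_PATTERN = re.compile(r".*(/.+?)/.*?\1/.*?\1/")


def is_blocked(url: str) -> bool:
    """Hypothetical helper: True if the loop rule would reject this URL."""
    return LOOP_PATTERN.search(url) is not None


if __name__ == "__main__":
    # Illustrative URLs only; example.com is a placeholder host.
    samples = [
        "http://example.com/sth/../sth/../sth/",  # segment "/sth" three times
        "http://example.com/a/b/a/c/a/",          # segment "/a" three times
        "http://example.com/a/b/c/",              # no repeated segment
    ]
    for url in samples:
        print(url, "blocked" if is_blocked(url) else "allowed")
```

The second sample shows that the repeats do not have to be adjacent, so a legitimate URL that happens to reuse a directory name three times would be rejected as well.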
