Hello

In the nutch-1.2/conf/regex-urlfilter.txt file I see the following lines:

# skip URLs with slash-delimited segment that repeats 3+ times, to break loops
-.*(/[^/]+)/[^/]+\1/[^/]+\1/

However, Nutch still fetches URLs like this one:
http://www.example.com/text/dev/faq/dev/content/2305/dev/content/246/
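
Here is a minimal sketch for checking the rule outside Nutch. It assumes Python's re module treats this pattern the same way as the Java regex engine Nutch uses; the is_skipped helper and the second URL are made up for illustration. As written, the rule seems to require the shape /X/a/X/b/X/, with exactly one segment between repeats. In the URL above, "/dev" repeats three times, but two segments (content/2305) sit between the second and third occurrence.

```python
import re

# Rule copied from regex-urlfilter.txt, without the leading "-"
# (the "-" only marks the rule as an exclusion).
LOOP_RULE = re.compile(r".*(/[^/]+)/[^/]+\1/[^/]+\1/")


def is_skipped(url: str) -> bool:
    """Return True if the loop rule matches somewhere in the URL."""
    return LOOP_RULE.search(url) is not None


# URL from this post: /dev/faq/dev/content/2305/dev/...
fetched = "http://www.example.com/text/dev/faq/dev/content/2305/dev/content/246/"

# Hypothetical URL with exactly one segment between repeats: /dev/faq/dev/content/dev/
looping = "http://www.example.com/dev/faq/dev/content/dev/"

print(is_skipped(fetched))  # False: the rule does not match, so the URL is not skipped
print(is_skipped(looping))  # True: the rule matches, so the URL would be skipped
```

If that reading is right, the rule only catches loops where the repeated segment alternates with single segments.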

Thanks.
Alex.
