You can use a suffix filter if the URLs have no query strings.
Dennis
Jens Martin Schubert wrote:
Hello,
is it possible to crawl e.g. http://www.domain.com
but skip all URLs under
http://www.domain.com/subpage/ ?
I tried to achieve this with crawl-urlfilter.txt/regex-urlfilter.txt,
but it doesn't work:
-ftp.tu-clausthal.de
-^http://([a-z0-9]*\.)asta.tu-clausthal.de/de/mobil/
+^http://([a-z0-9]*\.)asta.tu-clausthal.de
+^http://([a-z0-9]*\.)*tu-clausthal.de/
# skip everything else
-.
Skipping ftp.tu-clausthal.de works perfectly,
but http://www.asta.tu-clausthal.de/de/mobil/ is still indexed, and it
takes a long time to crawl.
regards,
Jens Martin Schubert
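
The rules quoted above can be checked offline before starting a long crawl. The following is only a minimal sketch in Python, not Nutch code. It assumes the filter tries the rules in file order, lets the first matching rule decide, and looks for the pattern anywhere in the URL unless it is anchored with ^. The helper names (parse_rules, first_matching_rule) and the test URLs other than the one from the message are made up for illustration, and Python's regex dialect may differ from Java's in corner cases.

```python
import re

# Rules copied verbatim from the message above: a leading '+' accepts,
# a leading '-' rejects, and '#' starts a comment line.
RULES_TEXT = r"""
-ftp.tu-clausthal.de
-^http://([a-z0-9]*\.)asta.tu-clausthal.de/de/mobil/
+^http://([a-z0-9]*\.)asta.tu-clausthal.de
+^http://([a-z0-9]*\.)*tu-clausthal.de/
# skip everything else
-.
"""


def parse_rules(text):
    """Return a list of (accept, compiled_pattern) pairs in file order."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line[0] not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules


def first_matching_rule(url, rules):
    """Return (index, accept) for the first rule whose pattern is found in url."""
    for index, (accept, pattern) in enumerate(rules):
        # Assumed semantics: unanchored search, first match wins.
        if pattern.search(url):
            return index, accept
    # Assumption for this sketch: a URL that matches no rule is rejected.
    return None, False


RULES = parse_rules(RULES_TEXT)

# Hypothetical test URLs; only the first one appears in the message.
TEST_URLS = [
    "http://www.asta.tu-clausthal.de/de/mobil/",
    "http://www.asta.tu-clausthal.de/",
    "http://asta.tu-clausthal.de/de/mobil/",
    "ftp://ftp.tu-clausthal.de/pub/",
    "http://www.example.com/",
]

for url in TEST_URLS:
    index, accept = first_matching_rule(url, RULES)
    verdict = "accept" if accept else "reject"
    print(f"{verdict:6}  rule #{index}  {url}")
```

Under these assumptions the www URL from the message is rejected by the second rule, while the same path on the bare host asta.tu-clausthal.de is accepted by the fourth rule, because the host group in the second and third rules has no trailing * and therefore requires exactly one label before "asta".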