Can you share your regex conf file? You should add an accept-all rule at the end of the file.
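
In case it helps to see why a single exclude rule blocks everything, here is a rough Python sketch of the filter logic. It is not Nutch code. It assumes the regex URL filter reads the rules top to bottom, lets the first matching rule decide, and rejects any URL that matches no rule at all. The `load_rules` and `accepts` helpers and the sample URLs are made up for illustration; only the exclude rule comes from your mail.

```python
import re


def load_rules(text):
    """Parse rule lines of the form '+regex' or '-regex'.

    Blank lines and lines starting with '#' are skipped. A line is an
    accept rule only if it starts with '+'; anything else is treated as
    an exclude rule in this sketch.
    """
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose regex is found in url.

    Assumption: a URL that matches no rule is rejected.
    """
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False


# The file containing only the exclude rule from the original mail.
ONLY_EXCLUDE = r"""
-^http://9ballpool.co.uk/forums/.*\?.*highlight=
"""

# The same file with an accept-all rule appended as the last line.
WITH_ACCEPT_ALL = ONLY_EXCLUDE + r"""
# accept anything else
+.
"""

# Hypothetical URLs, used only to exercise the rules.
plain = "http://9ballpool.co.uk/forums/viewtopic.php?t=1"
highlighted = "http://9ballpool.co.uk/forums/viewtopic.php?t=1&highlight=cue"

for name, text in (("only exclude", ONLY_EXCLUDE), ("with accept-all", WITH_ACCEPT_ALL)):
    rules = load_rules(text)
    print(name, accepts(rules, plain), accepts(rules, highlighted))
```

With only the exclude rule, the plain URL matches nothing and falls through to the default rejection, which would look like every URL being excluded. With `+.` as the last line, everything that was not excluded earlier is accepted.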

2015-01-02 13:11 GMT+02:00 Kevin Porter <[email protected]>:
> Hi,
>
>
> I added a regex to conf/regex-urlfilter.txt because I want to stop it
> crawling all pages with "highlight=" in the query part of the URL. This is
> the regex:
>
> -^http://9ballpool.co.uk/forums/.*\?.*highlight=
>
> Now Nutch crawls nothing; it's as if every URL matches that regex and is
> being excluded. Why?
>
>
> --
> http://themapps.com



-- 
Talat UYARER
Website: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
