Hey, thanks for trying to help, but I've realised something else went wrong
and it was just a coincidence that I had added that regex at that time. It
had me scratching my head for a while, but the regex works fine :)
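
For anyone who finds this thread later, here is a rough sketch of why the
rule on its own should not exclude everything. It is only a simplified model
under stated assumptions: it uses Python's re module rather than the Java
regexes Nutch uses, it assumes rules are tried in file order with the first
match deciding, and it assumes a URL that matches no rule is dropped (which
is why the accept-all rule at the end of the file matters). The accepts()
helper and the two example URLs are made up for illustration.

```python
import re

# Simplified, hypothetical model of conf/regex-urlfilter.txt:
# each rule is (sign, pattern), listed in file order.
RULES = [
    ("-", r"^http://9ballpool.co.uk/forums/.*\?.*highlight="),
    ("+", r"."),  # accept-all rule at the end of the file
]


def accepts(url, rules=RULES):
    """Return True if the first rule that matches the URL is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is excluded.
    return False


# Hypothetical example URLs, one without and one with "highlight=".
print(accepts("http://9ballpool.co.uk/forums/viewtopic.php?t=1"))
print(accepts("http://9ballpool.co.uk/forums/viewtopic.php?t=1&highlight=cue"))
```

In this model only the URL with "highlight=" in its query part is rejected.
If the accept-all rule is left out, the first URL matches nothing and is
dropped too, which would look like every URL being excluded.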

On 2 January 2015 at 12:07, Talat Uyarer <[email protected]> wrote:

> Can you share your regex conf file? You should add an accept-all rule at
> the end of the file.
>
> 2015-01-02 13:11 GMT+02:00 Kevin Porter <[email protected]>:
> > Hi,
> >
> >
> > I added a regex to conf/regex-urlfilter.txt because I want to stop it
> > crawling all pages with "highlight=" in the query part of the URL. This is
> > the regex:
> >
> > -^http://9ballpool.co.uk/forums/.*\?.*highlight=
> >
> > Now nutch crawls nothing, it's like every URL is matching that regex and so
> > being excluded. Why?
> >
> >
> > --
> > http://themapps.com
>
>
>
> --
> Talat UYARER
> Websitesi: http://talat.uyarer.com
> Twitter: http://twitter.com/talatuyarer
> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>



-- 
http://themapps.com
