In my regex-urlfilters.txt I have the default filters that come with Nutch.
If I have +. as the very last line of the file, crawling works fine.

If I change that line to anything else, I get "Total urls rejected by
filters: 1" and no URLs are fetched.

I've tried a bunch of different entries in the last line:

+html
+*html
+*html$
+.*(html)$
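
One way to narrow this down is to test each candidate against a seed URL outside of Nutch. The sketch below is only an approximation: it uses Python's re module rather than the Java regex engine that Nutch runs on, it assumes the leading + is stripped and the remainder is matched anywhere in the URL, and the example.com URLs are hypothetical stand-ins for the real seed.

```python
import re

# Candidate last-line rules from the post. The leading '+' is the
# accept marker; the rest is treated as the regular expression.
RULES = ["+.", "+html", "+*html", "+*html$", "+.*(html)$"]

# Hypothetical seed URLs, not taken from the original post.
URLS = ["http://example.com/", "http://example.com/index.html"]


def check(rule, url):
    """Return True/False for a match, or None if the pattern does not compile."""
    try:
        pattern = re.compile(rule[1:])  # drop the '+' prefix
    except re.error:
        # A pattern starting with '*' has nothing to repeat.
        return None
    # Assumption: unanchored search, so a match anywhere in the URL counts.
    return pattern.search(url) is not None


if __name__ == "__main__":
    for rule in RULES:
        print(rule, [check(rule, url) for url in URLS])
```

If the seed URL does not itself contain or end in html, rules like these would reject it before anything is fetched, which would be consistent with the "rejected by filters: 1" message. The two entries that start with * may not be valid regular expressions at all.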

What am I missing?

Thanks.

Sol
