You can test the regex without running the fetcher:
cat file-with-test-urls | nutch net/nutch/net/RegexURLFilter
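
If you want a quick sanity check of the patterns themselves outside Nutch, here is a rough Python sketch. It assumes the filter applies rules top to bottom and lets the first pattern found in the URL decide; that is my understanding of the behaviour, not a copy of Nutch's code. The function name and the example.com URLs are made up for illustration, and Python's regex engine may differ from the one Nutch uses in corner cases.

```python
import re

# The three exclusion rules from the quoted message, split into (sign, pattern).
RULES = [
    ("-", r"http:\/\/.*\/.*\/.*\/.*\/.*"),
    ("-", r".*\.\..*"),
    ("-", r"http:\/\/.*\/.*(print|friend|email|emailto|register|signin|login"
          r"|logon|signmein|menus|Print|Friend|Email|Emailto|Register|Signin"
          r"|Login|Logon|Signmein|Menus).*"),
]


def first_match(url, rules=RULES):
    """Return the sign of the first rule whose pattern is found in url.

    Returns None when no rule matches. This is an assumed model of the
    filter (first match wins), not Nutch's actual implementation.
    """
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign
    return None


if __name__ == "__main__":
    # Hypothetical test URLs; replace with the ones that are still fetched.
    for url in [
        "http://example.com/a/b/c/d",
        "http://example.com/a/../b",
        "http://example.com/login.html",
        "http://example.com/index.html",
    ]:
        print(first_match(url), url)
```

If a URL comes back as "-" here but is still fetched, the patterns are probably fine and the cause is elsewhere, for example an earlier rule in the file matching first.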

Good luck

[EMAIL PROTECTED] wrote:

Hi

Can anyone help me understand why URLs are still being fetched
which (I think) match the following regex entries?

-http:\/\/.*\/.*\/.*\/.*\/.*
[NEWLINE] (E-mail client may distort)
-.*\.\..*
[NEWLINE] (E-mail client may distort)
-http:\/\/.*\/.*(print|friend|email|emailto|register|signin|login|logon|signmein|menus|Print|Friend|Email|Emailto|Register|Signin|Login|Logon|Signmein|Menus).*
[NEWLINE] (E-mail client may distort)

Can anyone please help me?

Thanks
