Hi,
I found I can use crawl-urlfilter.txt to define the
domain limitation by
"
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
"
But, I found when I didn't use bin/nutch crawl...,
crawl-urlfilter.txt won't help me to filter out the
domain I don't want.
Can I use regex-urlfiter.txt to define the domain as
crawl-urlfiter.txt does?
thanks,
Michael Ji
__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com
-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers