Hi,

I found that I can use crawl-urlfilter.txt to limit the crawl to a single domain with:

  # accept hosts in MY.DOMAIN.NAME
  +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/

However, when I don't run bin/nutch crawl..., crawl-urlfilter.txt
does not filter out the domains I don't want.

Can I use regex-urlfilter.txt to restrict the domain the same way
crawl-urlfilter.txt does?
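
To make the intended behaviour concrete, here is a minimal Python sketch of what I expect such a rule to accept and reject. It is not Nutch code. It assumes that rules are tried in file order, that the first matching rule decides, that "+" accepts and "-" rejects, and that a URL matching no rule is dropped. The domain example.org is a hypothetical stand-in for MY.DOMAIN.NAME, the catch-all "-." rule is added only for illustration, and the dots in the domain are escaped here, unlike in the rule as written above.

```python
import re

# Hypothetical stand-in for MY.DOMAIN.NAME (not from the original rule).
DOMAIN = "example.org"

# Rules in file order as (sign, pattern). Assumption: "+" accepts, "-" rejects.
RULES = [
    # Same shape as the crawl-urlfilter.txt rule, with the domain's dots escaped.
    ("+", re.compile(r"^http://([a-z0-9]*\.)*" + re.escape(DOMAIN) + "/")),
    # Illustrative catch-all: reject everything else.
    ("-", re.compile(r".")),
]


def accepts(url):
    """Return True if the first matching rule is an accept rule."""
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    # Assumption: a URL that matches no rule is filtered out.
    return False


if __name__ == "__main__":
    for url in (
        "http://www.example.org/index.html",
        "http://example.org/",
        "http://other.com/page.html",
        "https://www.example.org/",
    ):
        print(url, accepts(url))
```

Under these assumptions only the first two URLs pass; the https one is rejected because the rule is anchored on http://.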
thanks,
Michael Ji
