Hi, I have a list of about 5000 URLs that I need to crawl and fetch using Nutch. I want a very deep crawl of each site, including its subdomains, but I don't want to follow external links. If I set db.ignore.external.links to true, I don't get the subdomains, so I can't use that. If I list each domain in regex-urlfilter, I can exclude the external links and still get the subdomains, but it does not seem right to put so many URLs in the filter. Am I missing some configuration, or am I using Nutch wrongly?
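To make the regex-urlfilter approach concrete, here is a minimal sketch of how the per-domain rules could be generated from the seed list rather than written by hand. It is an illustration under assumptions, not a recommended Nutch setup: the function name build_filter_rules is hypothetical, a leading "www." is stripped so that sibling subdomains are accepted, and each seed host is otherwise taken as-is (it is not reduced to its registered domain). The output uses the usual regex-urlfilter line format, where "+" accepts, "-" rejects, and the first matching rule wins.

```python
from urllib.parse import urlparse


def build_filter_rules(seed_urls):
    """Build regex-urlfilter style lines that accept each seed host and its subdomains."""
    hosts = set()
    for url in seed_urls:
        host = urlparse(url.strip()).hostname
        if not host:
            continue  # skip blank or malformed lines
        # Assumption: treat "www.example.com" as "example.com" so that
        # other subdomains such as "blog.example.com" are also accepted.
        if host.startswith("www."):
            host = host[4:]
        hosts.add(host)

    rules = []
    for host in sorted(hosts):
        # Hostnames contain only letters, digits, hyphens and dots,
        # so escaping the dots is enough to make the host a literal.
        escaped = host.replace(".", "\\.")
        # Accept the host itself and any subdomain of it.
        # Note: URLs with an explicit port are not matched by this pattern.
        rules.append("+^https?://([a-z0-9-]+\\.)*" + escaped + "(/|$)")
    # Reject everything that did not match one of the accept rules above.
    rules.append("-.")
    return rules


if __name__ == "__main__":
    # Hypothetical seed URLs, for illustration only.
    for line in build_filter_rules(["http://www.example.com/", "https://news.example.org/"]):
        print(line)
```

The generated lines could then be written into the regex-urlfilter file in place of the default final accept rule. With 5000 seeds this still yields 5000 regular expressions, so it automates the approach but does not address the concern about filter size.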
I would appreciate any help. Thanks in advance.

Thanks and regards,
Sonal

