Thanks Markus. I will give this a try. I did refilter the crawldb. One more question:
I'm not good with regex. If I wanted to crawl

http://my.domain.name/dir/subdirA/subdirA1/
http://my.domain.name/dir/subdirB/subdirB1/
http://my.domain.name/dir/subdirB/subdirB2/
http://my.domain.name/dir/subdirC/subdirC1/

but not

http://my.domain.name/dir/subdirA/
http://my.domain.name/dir/subdirB/
http://my.domain.name/dir/subdirC/

can I do that by modifying your suggestion, or would I need to exclude each URL individually?

I appreciate your help.

Best Regards,
ADS

--
View this message in context: http://lucene.472066.n3.nabble.com/Prevent-crawl-of-parent-URL-tp4080032p4080111.html
Sent from the Nutch - User mailing list archive at Nabble.com.
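
To make the desired behaviour concrete, here is a minimal sketch in Python. It is an illustration only, not a tested Nutch rule and not Markus's suggestion: the pattern and the helper name `is_wanted` are assumptions. The idea is to accept a URL only when it sits at least two directory levels below /dir/, so the parent directories are rejected without listing each one.

```python
import re

# Hypothetical pattern (an assumption, not taken from the thread):
# require two non-empty path segments after /dir/, each followed by a slash.
WANTED = re.compile(r"^http://my\.domain\.name/dir/[^/]+/[^/]+/")


def is_wanted(url: str) -> bool:
    """Return True if the URL is at least two levels below /dir/."""
    return WANTED.search(url) is not None


urls = [
    "http://my.domain.name/dir/subdirA/subdirA1/",
    "http://my.domain.name/dir/subdirB/subdirB1/",
    "http://my.domain.name/dir/subdirB/subdirB2/",
    "http://my.domain.name/dir/subdirC/subdirC1/",
    "http://my.domain.name/dir/subdirA/",
    "http://my.domain.name/dir/subdirB/",
    "http://my.domain.name/dir/subdirC/",
]

for url in urls:
    # "+" marks a URL the pattern accepts, "-" one it rejects.
    print("+" if is_wanted(url) else "-", url)
```

If a pattern like this works, it could presumably go on a single `+` line in regex-urlfilter.txt in place of one exclusion per parent URL, though that would need checking against the rest of the filter rules and their order.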

