I would like to crawl everything in http://my.domain.name/dir/subdir but nothing in its parent, http://my.domain.name/dir/

In regex-urlfilter.txt I have the following:

    # skip URLs
    -^http://my.domain.name/dir/

    # accept URLs
    +^http://my.domain.name/dir/subdir/*

but Nutch still crawls the skip URLs. Any suggestions on how to correct this behavior?
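
To make the rule ordering concrete, here is a minimal Python sketch of how I understand these two rules to be evaluated. It assumes the regex URL filter applies the first rule whose pattern is found in the URL and rejects URLs that match no rule, as the comments in the default regex-urlfilter.txt describe. The function name first_match_filter and the two sample page URLs are hypothetical, and this is a simulation, not Nutch code.

```python
import re


def first_match_filter(rules, url):
    """Return True if url is accepted, False if it is rejected.

    rules is a list of strings such as "+regex" or "-regex".
    Assumption: the first rule whose regex is found in the URL decides,
    and a URL that matches no rule is rejected.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False


# The two rules in the order they appear in my regex-urlfilter.txt.
posted = [
    r"-^http://my.domain.name/dir/",
    r"+^http://my.domain.name/dir/subdir/*",
]

# The same two rules with the accept rule moved first.
reordered = [posted[1], posted[0]]

# Hypothetical sample URLs, one in the parent and one in the subdirectory.
parent_page = "http://my.domain.name/dir/other.html"
subdir_page = "http://my.domain.name/dir/subdir/page.html"

for name, rules in (("posted", posted), ("reordered", reordered)):
    print(name,
          "parent accepted:", first_match_filter(rules, parent_page),
          "subdir accepted:", first_match_filter(rules, subdir_page))
```

Under that assumption, the order as posted would reject the subdirectory as well, while putting the accept rule first rejects the parent and accepts the subdirectory. It does not by itself explain why the parent pages are still being crawled.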

