-----Original message-----
> From: stone2dbone <[email protected]>
> Sent: Wednesday 24th July 2013 18:25
> To: [email protected]
> Subject: RE: Prevent crawl of parent URL
>
> Thanks Markus. I will give this a try. I did refilter the crawldb. One more
> question:
>
> I'm not good with regex. If I wanted to crawl
>
> http://my.domain.name/dir/subdirA/subdirA1/
> http://my.domain.name/dir/subdirB/subdirB1/
> http://my.domain.name/dir/subdirB/subdirB2/
> http://my.domain.name/dir/subdirC/subdirC1/
>
> but not
>
> http://my.domain.name/dir/subdirA/
> http://my.domain.name/dir/subdirB/
> http://my.domain.name/dir/subdirC/
>
> Can I do that by modifying your suggestion or would I need to exclude each
> URL individually?
If you're not good with regex, I'd exclude each one individually instead of writing a hard-to-comprehend regex.
>
> I appreciate your help.
>
> Best Regards,
> ADS
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Prevent-crawl-of-parent-URL-tp4080032p4080111.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
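
To illustrate the per-URL approach, here is a minimal sketch rather than a tested Nutch configuration. The four rule lines follow the regex-urlfilter.txt convention as I understand it: `+` accepts, `-` rejects, and the first matching rule wins. The small Python checker around them is purely hypothetical and only emulates that behaviour so the rules can be tried against the URLs from the question; the names `RULES`, `parse_rules` and `accepts` are my own and not part of Nutch.

```python
import re

# Rules in regex-urlfilter.txt style (assumed semantics: '+' accepts,
# '-' rejects, first matching rule wins). Each exclusion ends in '$' so
# only the parent directory URL itself is rejected, not anything below it.
RULES = r"""
-^http://my\.domain\.name/dir/subdirA/$
-^http://my\.domain\.name/dir/subdirB/$
-^http://my\.domain\.name/dir/subdirC/$
+^http://my\.domain\.name/dir/
"""

def parse_rules(text):
    """Turn rule lines into (accept, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] not in "+-":
            continue  # skip blank lines and anything that is not a rule
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules

def accepts(url, rules):
    """Return the verdict of the first matching rule; reject if none match."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False

if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in (
        "http://my.domain.name/dir/subdirA/subdirA1/",
        "http://my.domain.name/dir/subdirB/subdirB2/",
        "http://my.domain.name/dir/subdirA/",
        "http://my.domain.name/dir/subdirC/",
    ):
        print("accept" if accepts(url, rules) else "reject", url)
```

The exclusions have to come before the general `+` line, since the first match decides. Keep in mind that a rejected URL is not fetched at all, so the sub-subdirectories would need to be reachable some other way, for example from the seed list.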

