-----Original message-----
> From: stone2dbone <[email protected]>
> Sent: Wednesday 24th July 2013 18:25
> To: [email protected]
> Subject: RE: Prevent crawl of parent URL
> 
> Thanks Markus. I will give this a try.  I did refilter the crawldb. One more
> question:
> 
> I'm not good with regex. If I wanted to crawl
> 
> http://my.domain.name/dir/subdirA/subdirA1/
> http://my.domain.name/dir/subdirB/subdirB1/
> http://my.domain.name/dir/subdirB/subdirB2/
> http://my.domain.name/dir/subdirC/subdirC1/
> 
> but not
> 
> http://my.domain.name/dir/subdirA/
> http://my.domain.name/dir/subdirB/
> http://my.domain.name/dir/subdirC/
> 
> Can I do that by modifying your suggestion or would I need to exclude each
> URL individually?

If you're not comfortable with regex, I'd exclude each one individually
rather than write a hard-to-comprehend regex.
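
A minimal sketch of the "exclude each one individually" approach, written
as a small Python simulation so the rules can be checked before they go
into regex-urlfilter.txt. The rule strings follow that file's convention
(a leading '-' rejects, a leading '+' accepts, first matching rule wins).
The final accept rule and the accepts() helper are my assumptions for
illustration, not something taken from your configuration.

```python
import re

# Rules in the style of Nutch's regex-urlfilter.txt.
# One explicit exclude per parent directory, then a catch-all accept
# (the catch-all is an assumption; adapt it to your existing rules).
RULES = [
    r"-^http://my\.domain\.name/dir/subdirA/$",
    r"-^http://my\.domain\.name/dir/subdirB/$",
    r"-^http://my\.domain\.name/dir/subdirC/$",
    r"+^http://my\.domain\.name/dir/",
]


def accepts(url, rules=RULES):
    """Simplified stand-in for the URL filter: the first matching rule decides."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched, so the URL is rejected


if __name__ == "__main__":
    for url in [
        "http://my.domain.name/dir/subdirA/subdirA1/",
        "http://my.domain.name/dir/subdirB/subdirB2/",
        "http://my.domain.name/dir/subdirA/",
        "http://my.domain.name/dir/subdirC/",
    ]:
        print("keep" if accepts(url) else "skip", url)
```

The trailing `$` is what keeps each exclude from also matching the
deeper subdirectories. One caveat, stated as an assumption about your
site: if the parent pages are the only pages linking to the
subdirectories, excluding the parents may keep the crawler from
discovering them, in which case you would probably need to list the
subdirectory URLs as seeds.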

> 
> I appreciate your help.
> 
> Best Regards,
> ADS
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Prevent-crawl-of-parent-URL-tp4080032p4080111.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
