I'm having a hard time trying to avoid crawling a particular URL. In regex-urlfilter.txt I added the following to ignore it:

-^http://([a-z0-9]*\.)*bhejacry.com/forums/

This URL is not in the list in my urls directory. I also have 'db.ignore.external.links' set to 'true'. However, I still see the following during the crawl:

fetching http://www.bhejacry.com/forums/memberlist.php?mode=viewprofile&u=2774
fetching http://www.bhejacry.com/forums/memberlist.php?mode=viewprofile&u=96

How do I ignore these URLs?

--
View this message in context: http://www.nabble.com/Ignoring-a-url-in-the-crawl-tp19729031p19729031.html
Sent from the Nutch - User mailing list archive at Nabble.com.
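
One way to narrow this down is to check, outside of Nutch, whether the exclusion pattern matches the fetched URLs at all. The sketch below is only an approximation of how a regex URL filter file is evaluated: it assumes rules are tried top to bottom, that a pattern may match anywhere in the URL, and that the first matching rule decides. The trailing "+." catch-all rule and the helper name accepts() are assumptions made for illustration, not taken from the configuration above.

```python
import re

# Rules in the order they would appear in the filter file.
# The first rule is the one quoted in the post; the "+." catch-all
# is an assumed final rule, added only for this illustration.
RULES = [
    r"-^http://([a-z0-9]*\.)*bhejacry.com/forums/",
    r"+.",
]

# The two URLs reported in the fetch log.
URLS = [
    "http://www.bhejacry.com/forums/memberlist.php?mode=viewprofile&u=2774",
    "http://www.bhejacry.com/forums/memberlist.php?mode=viewprofile&u=96",
]


def accepts(url, rules=RULES):
    """Hypothetical helper: apply rules in order, first match wins.

    A leading '+' accepts the URL, a leading '-' rejects it.
    A URL that matches no rule is rejected.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):  # partial match, not a full match
            return sign == "+"
    return False


if __name__ == "__main__":
    for url in URLS:
        print(url, "->", "accepted" if accepts(url) else "rejected")
```

Under these assumptions both URLs are rejected, so the pattern itself does match them. If they are still fetched, it may be worth checking that no earlier '+' rule in the file matches them first, and that this is the filter file the crawl is actually reading.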
