Hi, I am using Nutch 1.2. In my crawl-urlfilter.txt I specify the URLs to be skipped. I have added some patterns that should be skipped, but they are not working.
e.g.

-^http://([a-z0-9]*\.)*domain.com
+^http://([a-z0-9]*\.)*domain.com/([0-9-a-z])*.html
-^http://([a-z0-9]*\.)*domain.com/([a-z/])*
-^http://([a-z0-9]*\.)*domain.com/top-ads.php

I want only URLs matching the second pattern to be included in the crawl, and URLs matching the other patterns to be excluded, but it is crawling all of them. Please suggest where the issue might be.

Thanks,
Pawan
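
For checking the four rules above outside of Nutch, here is a minimal sketch in Python. It rests on assumptions that should be verified against the Nutch version in use: rules are tried top to bottom, the first pattern found anywhere in the URL decides, and a URL that matches no rule is rejected. The helper name `first_match` and the sample URLs are hypothetical, and Python's regex dialect is only a stand-in for Java's.

```python
import re

# Rules in file order as (accept, pattern); patterns copied unchanged from the post.
RULES = [
    (False, r"^http://([a-z0-9]*\.)*domain.com"),
    (True,  r"^http://([a-z0-9]*\.)*domain.com/([0-9-a-z])*.html"),
    (False, r"^http://([a-z0-9]*\.)*domain.com/([a-z/])*"),
    (False, r"^http://([a-z0-9]*\.)*domain.com/top-ads.php"),
]

def first_match(url, rules=RULES):
    """Return (accepted, rule_index) for the first rule found in url.

    Assumption: first match wins and unmatched URLs are rejected,
    in which case rule_index is None.
    """
    for index, (accept, pattern) in enumerate(rules):
        if re.search(pattern, url):
            return accept, index
    return False, None

# Same rules with the "+" rule moved to the top, for comparison.
REORDERED = [RULES[1], RULES[0], RULES[2], RULES[3]]

if __name__ == "__main__":
    samples = [
        "http://www.domain.com/page1.html",
        "http://www.domain.com/top-ads.php",
        "http://www.domain.com/a/b",
    ]
    for url in samples:
        print(url, first_match(url), first_match(url, REORDERED))
```

Under these assumptions the first rule matches every domain.com URL, so the later rules are never reached. With the `+` rule moved to the top, only the .html sample is accepted. The sketch only models rule order; it does not check whether Nutch is actually reading this file.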