I know the urlfilter will let you restrict the crawl to your own domains only (no crawling of outside links).
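For a list of around 200 sites, writing the filter rules by hand gets tedious, so here is a rough sketch of generating them from the seed list. It assumes the rules end up in Nutch's regex URL filter file (usually conf/regex-urlfilter.txt), where a line starting with + accepts URLs matching the regex and a line starting with - rejects them. The helper name build_url_filter and the example hosts are made up for illustration, so check the output against your own setup before relying on it.

```python
import re
from urllib.parse import urlparse


def build_url_filter(seed_urls):
    """Return regex-urlfilter style rules that accept only the seed hosts.

    Hypothetical helper, not part of Nutch: it only builds the text lines.
    """
    hosts = []
    for url in seed_urls:
        # Seeds may be written without a scheme (e.g. "www.url1.com").
        parsed = urlparse(url if "://" in url else "http://" + url)
        host = parsed.hostname
        if host and host not in hosts:
            hosts.append(host)

    # One accept rule per host, with the host name escaped for use in a regex.
    rules = ["+^https?://" + re.escape(host) + "/" for host in hosts]
    # Final catch-all: reject every URL that no accept rule matched.
    rules.append("-.")
    return rules


if __name__ == "__main__":
    # Example seeds taken from the question; replace with your own list.
    for line in build_url_filter(["www.url1.com", "http://www.url2.com/"]):
        print(line)
```

The order matters in this sketch: the accept rules come first and the reject-everything rule comes last, so a URL on any other host falls through to the final line and is dropped.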
-----Original Message-----
From: Chris [mailto:[email protected]]
Sent: Thursday, November 04, 2010 10:43 PM
To: [email protected]
Subject: Updates of websites

Hello,

I read a bit of the documentation but I never installed Nutch or so.

First of all, I am wondering whether what I want is possible with Nutch. I have a bunch of websites .. like 200 or so and I'd like to monitor them - see whether someone adds new content etc.

With bin/nutch inject crawl/crawldb seed it is possible to add my list of URLs as I read.

Two things:

Can I tell Nutch, not to follow outgoing links?

Is it possible to see a website / statistics / whatever like:

Today, 23rd October 2010
Website: www.url1.com
added new content: www.url1.com/new_content

And I'd like to have this daily. Is there a way of doing it with Nutch?

Thanks already

Best regards
Chris

