Yes, that looks good: there is a whitelist for enterprise searches. That sounds like exactly one of the parts I need.

How about the other part? Is there a way of doing a diff between two versions? Do you know?

On 05.11.2010 13:49, Eric Martin wrote:
I know urlfilter will allow you to restrict the crawl to a domain only (no crawling of outside links).

-----Original Message-----
From: Chris [mailto:[email protected]]
Sent: Thursday, November 04, 2010 10:43 PM
To: [email protected]
Subject: Updates of websites

Hello,

I have read a bit of the documentation, but I have never installed Nutch.

First of all, I am wondering whether what I want is possible with Nutch. I have a bunch of websites, around 200, and I'd like to monitor them to see whether someone adds new content etc. As I read, I can add my list of URLs with

bin/nutch inject crawl/crawldb seed

Two things:

Can I tell Nutch not to follow outgoing links?

Is it possible to see a website / statistics / whatever like:

Today, 23rd October 2010
Website: www.url1.com added new content: www.url1.com/new_content

And I'd like to have this daily. Is there a way of doing it with Nutch?

Thanks already

Best regards
Chris
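
On the diff question: one way to approximate the daily report outside Nutch is to keep the list of URLs known after each crawl and compare today's list with yesterday's. The Python sketch below assumes those URLs have already been exported as plain lists (how to get them out of the crawl is not shown here). The names new_urls_by_site and format_report are made up for illustration and are not part of Nutch.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlsplit


def new_urls_by_site(old_urls, new_urls):
    """Group URLs present in new_urls but not in old_urls by host name."""
    added = defaultdict(list)
    for url in sorted(set(new_urls) - set(old_urls)):
        # urlsplit only finds the host when a scheme is present, so add one if missing.
        host = urlsplit(url if "://" in url else "http://" + url).netloc
        added[host].append(url)
    return dict(added)


def format_report(added, day):
    """Render a header line plus one block per site that gained URLs."""
    lines = [f"Report for {day.isoformat()}"]
    for host in sorted(added):
        lines.append(f"Website: {host} added new content:")
        lines.extend(f"  {url}" for url in added[host])
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical snapshots; in practice these would come from two crawl runs.
    yesterday = ["www.url1.com/", "www.url2.com/"]
    today = ["www.url1.com/", "www.url1.com/new_content", "www.url2.com/"]
    print(format_report(new_urls_by_site(yesterday, today), date(2010, 10, 23)))
```

Note that this only detects URLs that are new since the previous snapshot. Spotting changed content on an already known page would need a per-page checksum or similar in each snapshot, which this sketch does not cover.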

