I know the URL filter lets you restrict the crawl to specific domains,
so that links leading outside those domains are not followed.
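
For a seed list of around 200 sites, writing the filter rules by hand is tedious, so here is a rough sketch of generating them. It assumes the filter file uses the usual Nutch regex filter syntax (one rule per line, `+` to accept, `-` to reject, first match wins); the function name `build_urlfilter_rules` and the example hosts are made up for illustration, and the exact file to put the output in (for example conf/regex-urlfilter.txt) depends on your Nutch version.

```python
import re
from urllib.parse import urlparse


def build_urlfilter_rules(seed_urls):
    """Return filter lines that accept only the hosts found in seed_urls."""
    hosts = []
    for url in seed_urls:
        # Seed lists often omit the scheme; urlparse needs one to find the host.
        if "://" not in url:
            url = "http://" + url
        host = urlparse(url).hostname
        if host and host not in hosts:
            hosts.append(host)

    # One accept rule per host, escaped so that dots match literally.
    rules = ["+^https?://" + re.escape(host) + "/" for host in hosts]
    # Final catch-all: reject every URL that matched none of the rules above.
    rules.append("-.")
    return rules


if __name__ == "__main__":
    seeds = ["http://www.url1.com/", "www.url2.com"]
    print("\n".join(build_urlfilter_rules(seeds)))
```

The rules are keyed on the exact host, so www.url1.com and url1.com would count as different sites; loosen the pattern if you want subdomains included. I believe the default configuration also has a db.ignore.external.links property that drops outlinks to other hosts without any filter rules, but check nutch-default.xml in your version before relying on it.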

-----Original Message-----
From: Chris [mailto:[email protected]] 
Sent: Thursday, November 04, 2010 10:43 PM
To: [email protected]
Subject: Updates of websites

Hello,

I have read a bit of the documentation, but I have never installed Nutch.
First of all, I am wondering whether what I want is possible with Nutch.

I have a bunch of websites, about 200, and I'd like to monitor
them to see whether someone adds new content.

From what I have read, I can add my list of URLs with
bin/nutch inject crawl/crawldb seed

Two things: Can I tell Nutch not to follow outgoing links?
And is it possible to get a web page, statistics, or some other report like this:
Today, 23rd October 2010
Website: www.url1.com added new content: www.url1.com/new_content

And I'd like to have this daily.

Is there a way of doing it with Nutch?

Thanks in advance
Best regards
Chris
