Two Instances of Nutch

vanderkerkoff Thu, 29 May 2008 07:25:23 -0700

Hello everyone

I'm really new to this. I followed this page almost to the letter to set up
an intranet crawl:
http://lucene.apache.org/nutch/tutorial8.html


I've added sites to it and set up a cron job to rescan every night, and
everyone is really happy with it.

So happy in fact that someone has asked me if they could use the server to
create an index of another site, one which will not include the results from
the other indexes.

So, in my twisted logic, I created a new file in the urls folder called
bongo, exactly like the other file in there, called FAT, that handles my
initial crawl.

In bongo I've added the URL that I want to crawl, say
http://bongolovelongtime.com

I'm stuck now though.

If I edit conf/nutch-site.xml and add
+^http://([a-z0-9]*\.)*bongolovelongtime.com/

Won't the other crawl, the FAT one, now crawl the bongolove site as well?
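
To make the worry concrete, here is a minimal Python sketch (not Nutch code) of
how a list of accept patterns behaves when two crawls share it versus when each
crawl has its own list. The FAT pattern and the fat.example domain are
hypothetical stand-ins, and the helper accepts() is my own assumption that a URL
is let through when any pattern in the list matches. It says nothing about how
Nutch itself reads its seed files or configuration.

```python
import re

# Pattern quoted above for the new site.
BONGO_PATTERN = r"^http://([a-z0-9]*\.)*bongolovelongtime.com/"
# Hypothetical stand-in for whatever pattern the FAT crawl already uses.
FAT_PATTERN = r"^http://([a-z0-9]*\.)*fat.example/"


def accepts(patterns, url):
    """Return True if any accept pattern matches the URL (assumed semantics)."""
    return any(re.search(p, url) for p in patterns)


shared = [FAT_PATTERN, BONGO_PATTERN]   # one filter list used by both crawls
fat_only = [FAT_PATTERN]                # separate list for the FAT crawl
bongo_only = [BONGO_PATTERN]            # separate list for the bongo crawl

url = "http://www.bongolovelongtime.com/index.html"
print(accepts(shared, url))      # the shared list lets the bongo URL through
print(accepts(fat_only, url))    # a FAT-only list does not
print(accepts(bongo_only, url))  # a bongo-only list does
```

With a shared list, a bongo URL passes the filter in either crawl, which is the
overlap I'm worried about. Separate lists per crawl would avoid that, if there
is a way to give each crawl its own configuration.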

Any help greatly appreciated.


-- 
View this message in context: 
http://www.nabble.com/Two-Instances-of-Nutch-tp17536235p17536235.html
Sent from the Nutch - User mailing list archive at Nabble.com.
