Hello everyone,

I'm really new to this and followed this page almost to the letter to set up an intranet crawl: http://lucene.apache.org/nutch/tutorial8.html
I've added sites to it and set up a cron job to rescan every night, and everyone is really happy with it. So happy, in fact, that someone has asked me whether they could use the server to create an index of another site, one which will not include the results from the other indexes.

So, in my twisted logic, I created a new file in the urls folder called bongo, exactly like the other file in there, called FAT, that handles my initial crawl. In bongo I've added the URL that I want to crawl, say http://bongolovelongtime.com

I'm stuck now, though. If I edit conf/crawl-urlfilter.txt (the file where the tutorial puts the domain pattern) and add

+^http://([a-z0-9]*\.)*bongolovelongtime.com/

won't the other crawl, the FAT one, now crawl the bongolove site as well?

Any help greatly appreciated.
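To make the worry concrete, here is a rough Python sketch of how I understand the filter file to behave: rules are tried top to bottom, the first pattern found in the URL decides, "+" accepts and "-" rejects, and a URL that matches nothing is ignored. This is only my reading of the comments in the filter file, not Nutch's actual code. The intranet.example.com pattern is a made-up stand-in for whatever the FAT crawl really uses, and `accepts` is a hypothetical helper name.

```python
import re

# Hypothetical contents of one shared filter file.
# The first pattern is a placeholder for the existing FAT intranet rule;
# the second is the new pattern from this post; the last rejects everything else.
RULES = [
    ("+", r"^http://([a-z0-9]*\.)*intranet.example.com/"),
    ("+", r"^http://([a-z0-9]*\.)*bongolovelongtime.com/"),
    ("-", r"."),
]


def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # Assumed behaviour: a URL that matches no rule is ignored.
    return False


if __name__ == "__main__":
    for url in (
        "http://intranet.example.com/docs/",
        "http://www.bongolovelongtime.com/",
        "http://other.example.org/",
    ):
        print(url, accepts(url))
```

If that reading is right, any crawl that loads this one file would let URLs from both sites through whenever it comes across them, which is exactly why I'm unsure how to keep the two indexes apart.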
