Hello all,

I am going to stand up Solr and Nutch and use them to index ~120 web sites.
These sites are in the U of Richmond public web space and contain fewer than
100,000 pages if crawled completely.

I've not done this before. We have been using the Google Mini appliance and
will be decommissioning it soon.

I'm looking for any advice I can get. 

Is Nutch 2.2.1 compatible with Solr 4.5.1? This will be on Amazon Linux, and to
start I will install both on the same EC2 instance. I may move Nutch to a
separate instance for performance reasons. The Mini indexes these sites in less
than 2 hours, so I'm guessing Nutch will do the same on a single server instance.

Our needs are pretty simple. We just need to be able to extract title and body.

Thanks in advance for your help.


Eric Palmer
Web Services
University of Richmond
