Don't pin me down on the details, but as far as I know there are several other significant differences between Nutch and Solr. At a high level, this is my understanding:

Solr is really meant for enterprise search, where the data is more structured and it makes sense to enforce a schema on at least part of the index so that you can sort reliably (e.g. by date or part number). I also believe it gives you more control over caches, cache warming, etc.
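
To make the sorting point concrete, here is a rough sketch of building a Solr query URL that sorts on a schema-defined field. The host, the `title` and `release_date` field names, and the helper function are my own assumptions for illustration; only the `q`, `sort` and `rows` request parameters are standard Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, sort_field, descending=True, rows=10):
    """Build a Solr /select URL that sorts results on one field.

    The sort field is assumed to be declared in the Solr schema with a
    sortable type (e.g. a date), which is what makes the ordering reliable.
    """
    direction = "desc" if descending else "asc"
    params = {
        "q": query,
        "sort": f"{sort_field} {direction}",
        "rows": rows,
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field names, for illustration only.
url = build_select_url("http://localhost:8983/solr", "title:widget", "release_date")
print(url)
```

Without a schema there is no guarantee that every document carries a comparable `release_date` value, which is why this kind of sort fits structured data better than arbitrary web pages.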

Nutch has a rich set of plugins geared toward analyzing unstructured data, including office documents. There is a patch for Solr to achieve this as well, but as far as I can see it is not yet in the main line.

So yes, at a high level: if you want to use Solr for general web indexing, you'll need a spider/crawler from somewhere else.
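
As a rough sketch of what the glue between an external crawler and Solr could look like: the crawler hands over fetched pages, and you turn them into a Solr XML `<add>` message to send to the update handler. The `id`, `title` and `content` field names and the helper function are assumptions on my part (they would have to match your schema), and the actual fetching and HTTP POST are left out.

```python
import xml.etree.ElementTree as ET


def pages_to_solr_add_xml(pages):
    """Convert crawled pages (dicts of field name -> text) into a Solr
    XML <add> message with one <doc> per page."""
    add = ET.Element("add")
    for page in pages:
        doc = ET.SubElement(add, "doc")
        for name, value in page.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(add, encoding="unicode")


# Hypothetical crawler output; the field names must exist in the Solr schema.
crawled = [
    {"id": "http://example.com/", "title": "Example", "content": "Hello"},
]
payload = pages_to_solr_add_xml(crawled)
print(payload)
```

The resulting string would then be posted to Solr's update handler, followed by a commit, by whatever tool is doing the crawling.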

On Oct 22, 2008, at 4:50 AM, John Martyniak wrote:

Is the main difference between Nutch and Solr that Solr doesn't have a spider? So in order to use it, you would have to spider the web yourself or with some other tool?

-John

