RE: Nutch and Solr Redundancy

Markus Jelsma Wed, 20 Jun 2012 15:46:23 -0700

 
-----Original message-----
> From:Lewis John Mcgibbney <[email protected]>
> Sent: Wed 20-Jun-2012 22:23
> To: [email protected]
> Subject: Re: Nutch and Solr Redundancy
> 
> Hi Oakage,
> 
> On Wed, Jun 20, 2012 at 9:08 PM, Oakage <[email protected]> wrote:
> > Okay I've just started researching nutch and understand that nutch indexes its
> > crawl and Solr indexes the documents it is given.
> 
> Not quite. Nutch crawls and sends documents to Solr for indexing.
> Nutch DOES NOT create/manage/maintain its own index.
> 
> > So my questions are:
> >
> > 1. When nutch sends its crawled data to Solr, does Solr reindex it or use
> > nutch's index?
>


Indexing is incremental: Nutch just sends SolrDocuments or delete commands to
Solr. This means that each indexing job only adds new documents, or updates or
deletes existing ones, in Solr.
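
To illustrate what "incremental" means here, below is a rough Python sketch. It is only a toy model and not Nutch or Solr code: a plain dict keyed by document id stands in for the Solr index, and the function name `apply_indexing_job`, the field names and the example URLs are all made up for illustration. Applying adds before deletes is also just an assumption of the sketch.

```python
# Toy model: a dict keyed by document id stands in for the Solr index.
# The field names ("id", "title") and URLs are illustrative assumptions.

def apply_indexing_job(index, adds=(), deletes=()):
    """Apply one indexing job: upsert added documents, then remove deleted ids."""
    for doc in adds:
        # A new id inserts a document; an existing id overwrites it.
        index[doc["id"]] = doc
    for doc_id in deletes:
        # Deleting an id that is not present is a no-op.
        index.pop(doc_id, None)
    return index


index = {}

# First job: two new documents are added.
apply_indexing_job(index, adds=[
    {"id": "http://example.org/a", "title": "A"},
    {"id": "http://example.org/b", "title": "B"},
])

# Second job: one document is updated, one is deleted.
# Nothing else in the index is touched or rebuilt.
apply_indexing_job(
    index,
    adds=[{"id": "http://example.org/a", "title": "A (updated)"}],
    deletes=["http://example.org/b"],
)
```

After the second job the toy index holds only the updated document for `http://example.org/a`. The point is that each job changes only the documents it mentions, so there is no separate Nutch index for Solr to reuse or rebuild.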

> This is a question for Solr user lists...
> 
> > 2. If nutch's index is sufficient then how would I process this data without
> > Solr so my nutch wouldn't be dependent on Solr? (I know this is a very broad
> > question, but a little snippet that would point me in the right direction would
> > be great.)
> 
> As I explained above, Nutch versions after 1.2 don't maintain an index
> structure...
> 
> >
> > Thanks
> >
> > -Oak
> >
> > --
> > View this message in context: 
> > http://lucene.472066.n3.nabble.com/Nutch-and-Solr-Redundancy-tp3990598.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
> 
> 
> 
> -- 
> Lewis
>
