-----Original message-----
> From:Lewis John Mcgibbney <[email protected]>
> Sent: Wed 20-Jun-2012 22:23
> To: [email protected]
> Subject: Re: Nutch and Solr Redundancy
>
> Hi Oakage,
>
> On Wed, Jun 20, 2012 at 9:08 PM, Oakage <[email protected]> wrote:
> > Okay I've just started researching about nutch and knows that nutch index
> > its
> > crawl and Solr index the document it is given.
>
> Not quite. Nutch crawls and sends documents to Solr for indexing.
> Nutch DOES NOT create/manage/maintain it's own index.
>
> > So my questions are:
> >
> > 1. When nutch sends it's crawled data to Solr, does Solr reindex or uses
> > nutch's index?
>

Indexing is incremental: Nutch just sends SolrDocuments or delete commands to Solr. This means that each indexing job only adds new documents, or updates or deletes existing ones, in Solr.

> This is a question for Solr user lists...
>
> > 2. If nutch's index is sufficient then how would I process this data without
> > Solr so my nutch wouldn't be dependent on Solr(I know this is a very broad
> > question, but a little snippet that would head me the right direction would
> > be great
>
> As I explain above, above Nutch 1.2 Nutch doesn't maintain an index
> structure...
>
> >
> > Thanks
> >
> > -Oak
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Nutch-and-Solr-Redundancy-tp3990598.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
> >
>
> --
> Lewis
>
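
To make the incremental behaviour described in the reply above a bit more concrete, here is a rough sketch in Python. It is only a toy model and not Nutch or Solr code: the dict stands in for a Solr index keyed by document id, and the function name apply_commands, the ("add", ...)/("delete", ...) tuples and the example.com URLs are all made up for illustration. It assumes documents are keyed by URL and that adding a document with an existing id replaces the old one.

```python
# Toy in-memory stand-in for a Solr index; NOT the Nutch or Solr API.
# apply_commands and the command tuples are hypothetical names for illustration.
def apply_commands(index, commands):
    """Apply add/delete commands to a dict keyed by document id."""
    for op, payload in commands:
        if op == "add":
            # A new id inserts a document; an existing id replaces the old one.
            index[payload["id"]] = payload
        elif op == "delete":
            # Deleting an id that is not present is a no-op.
            index.pop(payload, None)
        else:
            raise ValueError(f"unknown command: {op}")
    return index


index = {}

# First indexing job: two newly crawled pages.
apply_commands(index, [
    ("add", {"id": "http://example.com/a", "title": "A"}),
    ("add", {"id": "http://example.com/b", "title": "B"}),
])

# Second job: page a changed, page b is gone, page c is new.
apply_commands(index, [
    ("add", {"id": "http://example.com/a", "title": "A (updated)"}),
    ("delete", "http://example.com/b"),
    ("add", {"id": "http://example.com/c", "title": "C"}),
])

print(sorted(index))
```

The point of the sketch is that the second job never rebuilds anything: it only touches the three ids it mentions, and whatever else is already in the index stays as it was.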

