Hi

I'll have to wait until Julien commits his work on pluggable indexing
backends. That should not take long now.

Cheers

-----Original message-----
> From:Sourajit Basak <[email protected]>
> Sent: Thu 24-Jan-2013 05:36
> To: [email protected]
> Subject: Re: conditional indexing
> 
> Thanks for the pointer on ordering.
> 
> Based upon the presence of certain fields, Nutch can decide whether to send
> a doc to Solr. However, I guess this means changes to the main code line
> instead of being driven by a plugin. Let me look at the source.
> 
> I am eagerly waiting for the patch from Markus.
> 
> Best,
> Sourajit
> 
> On Thu, Jan 24, 2013 at 7:08 AM, feng lu <[email protected]> wrote:
> 
> > Hi Sourajit
> >
> > >>>>
> > We have an implementation of Indexing filter that runs side-by-side the
> > indexer-basic plugin. How is the order determined ?
> > <<<<
> > First, make sure your indexing filter plugin is set correctly in the
> > plugin.includes property in the nutch-site.xml configuration file. The
> > indexing filter order is determined by the indexingfilter.order property,
> > like this:
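> > // Space-separated class names; filters are applied in the order listed.
> > // "YourIndexingFilter" is a placeholder: use your filter's full class name.
> > String class1 = "YourIndexingFilter";
> > String class2 = "org.apache.nutch.indexer.basic.BasicIndexingFilter";
> > conf.set(IndexingFilters.INDEXINGFILTER_ORDER, class1 + " " + class2);
> > IndexingFilters filters = new IndexingFilters(conf);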
> >
> > >>>>>
> > Also, how do I do conditional indexing i.e. stop certain urls from being
> > indexed ? I think I can apply a filter but that approach will not work
> > since we index based on the page contents.
> > <<<<<
> > For now you can skip certain URLs by returning null from your indexing
> > filter; see the IndexingFilter API comment. However, an indexing filter
> > currently cannot delete an existing document from the back-end search
> > engine. Markus will fix this in
> > https://issues.apache.org/jira/browse/NUTCH-1449
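
To illustrate the "return null to skip a document" idea above, here is a
minimal, self-contained sketch. It does not use the Nutch API: DocFilter,
runFilters, the "price" and "host" fields and their values are all
hypothetical stand-ins, chosen only to show how an ordered chain of filters
can drop a document based on its content.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConditionalIndexingSketch {

    // Hypothetical stand-in for an indexing filter (NOT the Nutch interface):
    // returns the document, possibly modified, or null to discard it.
    interface DocFilter {
        Map<String, String> filter(Map<String, String> doc);
    }

    // Content-based condition: discard documents that lack a "price" field.
    static final DocFilter REQUIRE_PRICE =
            doc -> doc.containsKey("price") ? doc : null;

    // Adds a field, loosely playing the role of a "basic" filter.
    static final DocFilter ADD_HOST = doc -> {
        doc.put("host", "example.org");
        return doc;
    };

    // Applies the filters in list order and stops at the first null result.
    static Map<String, String> runFilters(List<DocFilter> filters,
                                          Map<String, String> doc) {
        for (DocFilter f : filters) {
            doc = f.filter(doc);
            if (doc == null) {
                return null; // discarded: the caller should not index it
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<DocFilter> chain = List.of(REQUIRE_PRICE, ADD_HOST);

        Map<String, String> withPrice = new HashMap<>();
        withPrice.put("price", "10");
        Map<String, String> withoutPrice = new HashMap<>();
        withoutPrice.put("title", "About us");

        System.out.println(runFilters(chain, withPrice) != null);    // true
        System.out.println(runFilters(chain, withoutPrice) != null); // false
    }
}
```

In this sketch the order of the list decides which filter sees the document
first, so a discarding filter placed first saves the later filters any work.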
> >
> >
> > On Wed, Jan 23, 2013 at 7:24 PM, Sourajit Basak <[email protected]> wrote:
> >
> > > Markus - Can you please share your patch ?
> > >
> > > On Wed, Jan 23, 2013 at 1:52 PM, Tejas Patil <[email protected]> wrote:
> > >
> > > > Hi Sourajit,
> > > > See indexingfilter.order in nutch-default.xml
> > > >
> > > > Thanks,
> > > > Tejas Patil
> > > >
> > > > On Wed, Jan 23, 2013 at 12:16 AM, Sourajit Basak <[email protected]> wrote:
> > > >
> > > > > We have an implementation of Indexing filter that runs side-by-side
> > > > > the indexer-basic plugin. How is the order determined ?
> > > > > Also, how do I do conditional indexing i.e. stop certain urls from
> > > > > being indexed ? I think I can apply a filter but that approach will
> > > > > not work since we index based on the page contents.
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Don't Grow Old, Grow Up... :-)
> >
> 
