Hi,

I'll have to wait until Julien commits his work on pluggable indexing back ends. That should not take long now.
Cheers

-----Original message-----
> From: Sourajit Basak <[email protected]>
> Sent: Thu 24-Jan-2013 05:36
> To: [email protected]
> Subject: Re: conditional indexing
>
> Thanks for the pointer on ordering.
>
> Based upon the presence of certain fields, Nutch can decide whether to send
> a doc to Solr. However, I guess it means changes to the main code line
> instead of being driven by a plugin. Let me see the source.
>
> I am eagerly waiting for the patch from Markus.
>
> Best,
> Sourajit
>
> On Thu, Jan 24, 2013 at 7:08 AM, feng lu <[email protected]> wrote:
>
> > Hi Sourajit
> >
> > >>>>
> > We have an implementation of Indexing filter that runs side-by-side the
> > indexer-basic plugin. How is the order determined ?
> > <<<<
> > First, make sure your indexing filter plugin is set correctly in the
> > plugin.includes property in the nutch-site.xml configuration file. The
> > indexing filter order is determined by the indexingfilter.order property,
> > like this:
> > String class1 = "YouIndexingFilter";
> > String class2 = "org.apache.nutch.indexer.basic.BasicIndexingFilter";
> > conf.set(IndexingFilters.INDEXINGFILTER_ORDER, class1 + " " + class2);
> > IndexingFilters filters = new IndexingFilters(conf);
> >
> > >>>>>
> > Also, how do I do conditional indexing i.e. stop certain urls from being
> > indexed ? I think I can apply a filter but that approach will not work
> > since we index based on the page contents.
> > <<<<<
> > For now you can filter out certain URLs by returning a null value; see
> > the IndexingFilter API comment. But an indexing filter currently cannot
> > delete an existing document in the back-end search engine.
> > Markus will fix this in
> > https://issues.apache.org/jira/browse/NUTCH-1449
> >
> >
> > On Wed, Jan 23, 2013 at 7:24 PM, Sourajit Basak <[email protected]> wrote:
> >
> > > Markus - Can you please share your patch ?
> > >
> > > On Wed, Jan 23, 2013 at 1:52 PM, Tejas Patil <[email protected]> wrote:
> > >
> > > > Hi Sourajit,
> > > > See indexingfilter.order in nutch-default.xml
> > > >
> > > > Thanks,
> > > > Tejas Patil
> > > >
> > > > On Wed, Jan 23, 2013 at 12:16 AM, Sourajit Basak
> > > > <[email protected]> wrote:
> > > >
> > > > > We have an implementation of Indexing filter that runs side-by-side
> > > > > the indexer-basic plugin. How is the order determined ?
> > > > > Also, how do I do conditional indexing i.e. stop certain urls from
> > > > > being indexed ? I think I can apply a filter but that approach will
> > > > > not work since we index based on the page contents.
> >
> > --
> > Don't Grow Old, Grow Up... :-)
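
Note on the thread above: the idea of ordered filters where one of them drops a document by returning null can be sketched in plain Java. This is only an illustration under stated assumptions. DocFilter, runFilters, requireField and addField are hypothetical stand-ins, not the Nutch API; the real org.apache.nutch.indexer.IndexingFilter works on Nutch's own document type and takes more arguments. The sketch also assumes the chain stops at the first null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these names are Nutch classes.
class ConditionalIndexingSketch {

    /** Simplified filter: a document is just a map of field name to value. */
    interface DocFilter {
        // Returns the (possibly modified) document, or null to skip indexing it.
        Map<String, String> filter(Map<String, String> doc);
    }

    /** Applies the filters in list order and stops as soon as one returns null. */
    static Map<String, String> runFilters(List<DocFilter> ordered, Map<String, String> doc) {
        for (DocFilter f : ordered) {
            doc = f.filter(doc);
            if (doc == null) {
                return null; // dropped: nothing would be sent to the index
            }
        }
        return doc;
    }

    /** Content-based condition: drop documents that lack the given field. */
    static DocFilter requireField(String name) {
        return doc -> doc.containsKey(name) ? doc : null;
    }

    /** Stand-in for a later filter in the chain that adds a field. */
    static DocFilter addField(String name, String value) {
        return doc -> {
            Map<String, String> copy = new HashMap<>(doc);
            copy.put(name, value);
            return copy;
        };
    }
}
```

Putting the conditional filter first in the list plays the role that indexingfilter.order plays in the configuration: later filters never see a document that was already dropped. As noted in the thread, returning null only skips the document and does not delete a copy that is already in the back end.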

