For examples you can look at CrawlDbReader/CrawlDatum and Generator.

Regards,
Markus

-----Original message-----
> From: Roannel Fernández Hernández <[email protected]>
> Sent: Wednesday 23rd August 2017 21:31
> To: [email protected]
> Subject: Re: [MASSMAIL]RE: Exchange documents in indexing job
>
> Hi.
>
> Thanks for your tips. I like the idea of JEXL expressions. I'm going to
> create the ticket and I'll start working on it.
>
> Thanks a lot.
>
> ----- Original Message -----
> > From: "Markus Jelsma" <[email protected]>
> > To: [email protected]
> > Sent: Wednesday, August 23, 2017 2:05:21 PM
> > Subject: [MASSMAIL]RE: Exchange documents in indexing job
> >
> > I think the MIME-type filter is a fine method for this; the only drawback
> > is that you need to run the indexer twice.
> >
> > A better solution, though, would be to support JEXL expressions in
> > IndexWriters and IndexerMapReduce to allow global filtering and
> > per-IndexWriter filtering. This would not be very hard to patch in.
> >
> > -----Original message-----
> > > From: Yossi Tamari <[email protected]>
> > > Sent: Wednesday 23rd August 2017 19:40
> > > To: [email protected]
> > > Subject: RE: Exchange documents in indexing job
> > >
> > > I don't see a good way to do it in configuration, but it should be very
> > > easy to override the write method in the two plugins to have it check the
> > > mime type and decide whether to call super.write or not.
> > > (One terrible way to do it with configuration only would be to configure
> > > only one of the indexers and use mimetype-filter to filter the matching
> > > type, and then reconfigure for the other indexer and change
> > > mimetype-filter.txt to the other mime type and index again...)
> > >
> > > -----Original Message-----
> > > From: Roannel Fernández Hernández [mailto:[email protected]]
> > > Sent: 23 August 2017 18:05
> > > To: [email protected]
> > > Subject: Exchange documents in indexing job
> > >
> > > Hi folks:
> > >
> > > Is there some way in Nutch to send some documents to a particular index
> > > writer according to particular values of fields?
> > >
> > > Let me explain myself better. I have a document with a field called
> > > "mimetype" and I want to send to Solr only the documents with value
> > > "text/plain" for this field and send to RabbitMQ the documents with value
> > > "text/html". How can I do that?
> > >
> > > Regards
> > >
> > > The @universidad_uci is Fidel. We young people will not fail.
> > > #HastaSiempreComandante
> > > #HastalaVictoriaSiempre
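
For anyone finding this thread later, the per-writer filtering idea discussed above (check the mime type, then decide whether to pass the document on to the writer) can be sketched roughly as follows. This is only an illustration under assumptions: Doc, Writer, CollectingWriter, FilteringWriter and mimeTypeIs are hypothetical stand-ins written for this example, not Nutch classes, and a plain java.util.function.Predicate takes the place of a JEXL expression.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class MimeTypeRouting {

    // Hypothetical stand-in for an indexed document: a bag of named field values.
    static class Doc {
        private final Map<String, Object> fields = new HashMap<>();

        Doc add(String name, Object value) {
            fields.put(name, value);
            return this;
        }

        Object getFieldValue(String name) {
            return fields.get(name);
        }
    }

    // Hypothetical stand-in for an index writer; only the write step is modelled.
    interface Writer {
        void write(Doc doc);
    }

    // Keeps documents in memory so the routing result can be inspected.
    static class CollectingWriter implements Writer {
        final List<Doc> written = new ArrayList<>();

        @Override
        public void write(Doc doc) {
            written.add(doc);
        }
    }

    // Wraps a writer and forwards a document only when the predicate accepts it.
    static class FilteringWriter implements Writer {
        private final Writer delegate;
        private final Predicate<Doc> accept;

        FilteringWriter(Writer delegate, Predicate<Doc> accept) {
            this.delegate = delegate;
            this.accept = accept;
        }

        @Override
        public void write(Doc doc) {
            if (accept.test(doc)) {
                delegate.write(doc);
            }
        }
    }

    // Predicate that accepts documents whose "mimetype" field equals the given value.
    static Predicate<Doc> mimeTypeIs(String expected) {
        return doc -> expected.equals(doc.getFieldValue("mimetype"));
    }

    public static void main(String[] args) {
        CollectingWriter solr = new CollectingWriter();
        CollectingWriter rabbit = new CollectingWriter();

        // Every document is offered to every writer; each wrapper decides for itself.
        List<Writer> writers = List.of(
                new FilteringWriter(solr, mimeTypeIs("text/plain")),
                new FilteringWriter(rabbit, mimeTypeIs("text/html")));

        List<Doc> docs = List.of(
                new Doc().add("mimetype", "text/plain"),
                new Doc().add("mimetype", "text/html"),
                new Doc().add("mimetype", "application/pdf"));

        for (Doc doc : docs) {
            for (Writer writer : writers) {
                writer.write(doc);
            }
        }

        System.out.println("solr=" + solr.written.size() + " rabbit=" + rabbit.written.size());
    }
}
```

The wrapper leaves the underlying writers unchanged, so the same indexing pass can feed both targets. In a real plugin the same check would presumably sit in the overridden write method before calling super.write, as suggested in the thread.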

