For examples you can look at CrawlDbReader/CrawlDatum and Generator.
Regards,
Markus
-Original message-
> From:Roannel Fernández Hernández
> Sent: Wednesday 23rd August 2017 21:31
> To: user@nutch.apache.org
> Subject: Re: [MASSMAIL]RE: Exchange documents in indexing job
>
> Hi.
>
Hi.
Thanks for your tips. I like the idea of JEXL expressions. I'm going to create
the ticket and get to work on it.
Thanks a lot.
- Original Message -
> From: "Markus Jelsma"
> To: user@nutch.apache.org
> Sent: Wednesday, August 23, 2017 2:05:21 PM
> Subject: [MASSMAIL]RE: Exchange
I think a MIME-type filter is a fine method for this; the only drawback is that
you need to run the indexer twice.
A better solution, though, would be to support JEXL expressions in IndexWriters
and IndexerMapReduce to allow global filtering and per-IndexWriter filtering.
This would not be very hard t
I don't see a good way to do it in configuration, but it should be very easy to
override the write method in the two plugins to have it check the mime type and
decide whether to call super.write or not.
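
A minimal sketch of that override idea, for illustration only. It does not use
the real Nutch classes: Doc, BaseWriter and MimeFilteredWriter are hypothetical
stand-ins so the example compiles on its own. In an actual plugin the subclass
would extend the existing index writer and read the field from the Nutch
document instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a document: just a bag of named fields.
class Doc {
    private final Map<String, Object> fields = new HashMap<>();

    Doc add(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    Object getFieldValue(String name) {
        return fields.get(name);
    }
}

// Hypothetical stand-in for an existing index writer plugin class.
class BaseWriter {
    // Records what was written; a real writer would send to its backend.
    final List<Doc> written = new ArrayList<>();

    public void write(Doc doc) {
        written.add(doc);
    }
}

// Subclass that forwards only documents whose "mimetype" field matches.
class MimeFilteredWriter extends BaseWriter {
    private final String acceptedMimeType;

    MimeFilteredWriter(String acceptedMimeType) {
        this.acceptedMimeType = acceptedMimeType;
    }

    @Override
    public void write(Doc doc) {
        Object mime = doc.getFieldValue("mimetype");
        // equals() is called on the non-null configured value, so a document
        // without a "mimetype" field is simply skipped.
        if (acceptedMimeType.equals(mime)) {
            super.write(doc);
        }
    }
}
```

With one such subclass per backend, each configured for its own MIME type, a
single indexing run would pass every document to both writers and each writer
would keep only its own documents.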
(One terrible way to do it with configuration only would be to configure only
one of the inde
Hi folks:
Is there some way in Nutch to send certain documents to a particular index writer
according to the values of particular fields?
Let me explain. I have a document with a field called "mimetype" and I
want to send to Solr only the documents with value "text/plain" for this field
an