Eelco Lempsink wrote:
Of course, for high volumes of data, first indexing everything and afterwards removing it doesn't sound like a good option in my case, where only a small part of the fetched data needs to be indexed.

Has anyone solved this problem (elegantly)? I mainly wonder if it's feasible to do it only using plugins, since I suspect I must implement my own Indexer.

Indexing filter plugins may also return a null doc to signal that a page should not be indexed. The standard Indexer would have to be modified to handle this gracefully, but the change is trivial:

Indexer.java:239

   try {
     // run indexing filters
     doc = this.filters.filter(doc, parse, (Text)key, fetchDatum, inlinks);
   } catch (IndexingException e) {
     if (LOG.isWarnEnabled()) { LOG.warn("Error indexing "+key+": "+e); }
     return;
   }
+   if (doc == null) return;
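
To make the idea concrete, here is a minimal, self-contained sketch of the same pattern. It deliberately does not use the real Nutch classes: DocFilter, PrefixFilter, the Map-based document and the example URLs are all hypothetical stand-ins, chosen only so the snippet runs on its own. A real plugin would implement Nutch's indexing filter extension point instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT the Nutch classes.
class NullDocFilterSketch {

    /** Hypothetical filter contract: return null to say "do not index this". */
    interface DocFilter {
        Map<String, String> filter(Map<String, String> doc, String url);
    }

    /** Hypothetical example filter: keeps only URLs under a given prefix. */
    static class PrefixFilter implements DocFilter {
        private final String prefix;

        PrefixFilter(String prefix) {
            this.prefix = prefix;
        }

        public Map<String, String> filter(Map<String, String> doc, String url) {
            return url.startsWith(prefix) ? doc : null;
        }
    }

    /** Runs the filter chain and stops as soon as one filter returns null. */
    static Map<String, String> runFilters(List<DocFilter> filters,
                                          Map<String, String> doc, String url) {
        for (DocFilter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }

    /** Stand-in for the indexing step; returns the URLs that were indexed. */
    static List<String> index(List<String> urls, List<DocFilter> filters) {
        List<String> indexed = new ArrayList<>();
        for (String url : urls) {
            Map<String, String> doc = new HashMap<>();
            doc.put("url", url);
            doc = runFilters(filters, doc, url);
            // Same guard as the one-line patch above: skip rejected documents
            // so they are never written to the index.
            if (doc == null) {
                continue;
            }
            indexed.add(doc.get("url"));
        }
        return indexed;
    }

    public static void main(String[] args) {
        List<DocFilter> filters = new ArrayList<>();
        filters.add(new PrefixFilter("http://example.org/docs/"));
        List<String> urls = Arrays.asList(
                "http://example.org/docs/a",
                "http://example.org/other/b");
        System.out.println(index(urls, filters));
    }
}
```

The point of the design is that rejected documents are dropped before they reach the index, so nothing has to be removed afterwards.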

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

