Hi all,

recently I've been working with Solr to enable named entity recognition on indexed documents, which I did with UIMA, so I wonder whether that could be an interesting use case for Stanbol as well.

For this purpose I've developed a custom UpdateHandler [1] for Solr which enriches documents with Apache UIMA as they are being indexed, based on the following use case:

1. the user sends documents to Solr
2. each document received by Solr is sent to a UIMA analysis pipeline just before it gets indexed
3. the UIMA pipeline extracts enrichments, i.e. named entities
4. the enrichments are written to Solr fields on the basis of a mapping configuration
5. the enriched Solr document is written to the index

In my opinion this could also be done with the Stanbol Enhancer. Such an integration could run on top of the already developed contrib module [2][3] or as a separate one written from scratch; obviously both options have advantages and drawbacks that we can discuss (later?).

What do you think?

Cheers,
Tommaso

[1] : http://wiki.apache.org/solr/SolrPlugins#UpdateHandler
[2] : http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/
[3] : http://wiki.apache.org/solr/SolrUIMA
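
P.S. To make the five steps above a bit more concrete, here is a rough, library-free sketch of the flow in Python. It is only an illustration: it does not use the Solr UpdateHandler API, UIMA, or Stanbol, and every name in it (toy_analyzer, enrich_and_index, the "entity" annotation type, the "entities" field) is hypothetical. The analyzer is a toy stand-in that treats runs of capitalized words as entities; a real setup would call the UIMA pipeline or the Stanbol Enhancer at that point.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical shapes, chosen only for this sketch:
# an annotation is (annotation_type, covered_text),
# a document maps field names to lists of values.
Annotation = Tuple[str, str]
Document = Dict[str, List[str]]


def toy_analyzer(text: str) -> List[Annotation]:
    """Stand-in for step 3: runs of capitalized words count as 'entity'."""
    pattern = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
    return [("entity", m.group(0)) for m in re.finditer(pattern, text)]


def enrich_and_index(
    doc: Document,
    text_field: str,
    analyze: Callable[[str], List[Annotation]],
    field_mapping: Dict[str, str],
    index: List[Document],
) -> Document:
    """Steps 2-5: analyze the text, map annotations to fields, then index."""
    # Work on a copy so the document sent by the user is left untouched.
    enriched = {name: list(values) for name, values in doc.items()}
    for text in doc.get(text_field, []):
        for ann_type, value in analyze(text):
            # Step 4: the mapping decides which field an annotation type
            # is written to; unmapped types are ignored.
            target = field_mapping.get(ann_type)
            if target is not None:
                enriched.setdefault(target, []).append(value)
    # Step 5: the enriched document is what ends up in the index.
    index.append(enriched)
    return enriched


if __name__ == "__main__":
    index: List[Document] = []
    doc = {"id": ["1"], "text": ["Apache Solr indexes documents sent by Tommaso"]}
    enrich_and_index(doc, "text", toy_analyzer, {"entity": "entities"}, index)
    print(index[0]["entities"])
```

Keeping the analyzer behind a plain callable is deliberate here: swapping UIMA for the Stanbol Enhancer would then only change what is passed as analyze, while the mapping and indexing steps stay the same.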
