Re: Solr now supports UIMA

Tommaso Teofili Mon, 04 Apr 2011 09:24:41 -0700

Hi all,

as you can see from the wiki page Jörn linked, the integration makes it
possible to call a UIMA pipeline from inside a Solr instance.
This is done via a dedicated component that can be added to the chain of
UpdateRequestProcessors, which are responsible for processing documents when
they arrive at Solr to be indexed.


So, if you add the UIMAUpdateRequestProcessor to the Solr update chain, the
flow goes like this:
1. Document is sent to Solr to be indexed
2. Solr reads the (configurable) fields which contain text to be sent to
UIMA inside the CAS
3. Solr sends the CAS to an AnalysisEngine (configurable)
4. Once the UIMA pipeline has ended, Solr writes UIMA annotations' feature
values to the Solr document's fields (the mapping is configurable)
5. Solr sends the enriched document to the next processor of the Solr update
chain (which eventually leads to the actual write to the index)
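
To make the five steps a bit more concrete, here is a minimal sketch in
plain Java. It does not use the real Solr or UIMA classes: Doc, Annotation,
Processor, EnrichingProcessor and toyEngine are all hypothetical stand-ins
invented for illustration, and the "engine" just tags capitalized tokens.
It only shows the shape of the flow (read configured fields, run an engine,
map results to fields, hand over to the next processor).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class UimaUpdateChainSketch {

    // Hypothetical stand-in for a Solr input document: field name -> values.
    static class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        List<String> get(String name) {
            return fields.getOrDefault(name, Collections.<String>emptyList());
        }
    }

    // Hypothetical stand-in for an annotation: a type and one feature value.
    static class Annotation {
        final String type;
        final String featureValue;

        Annotation(String type, String featureValue) {
            this.type = type;
            this.featureValue = featureValue;
        }
    }

    // One link of the (simplified) update chain.
    interface Processor {
        void process(Doc doc);
    }

    // Plays the role the mail describes for the UIMA processor (assumption:
    // the real component is configured differently and works on a CAS).
    static class EnrichingProcessor implements Processor {
        private final List<String> textFields;                    // step 2
        private final Function<String, List<Annotation>> engine;  // step 3
        private final Map<String, String> typeToField;            // step 4
        private final Processor next;                             // step 5

        EnrichingProcessor(List<String> textFields,
                           Function<String, List<Annotation>> engine,
                           Map<String, String> typeToField,
                           Processor next) {
            this.textFields = textFields;
            this.engine = engine;
            this.typeToField = typeToField;
            this.next = next;
        }

        @Override
        public void process(Doc doc) {
            // Step 2: gather the text of the configured fields.
            StringBuilder text = new StringBuilder();
            for (String field : textFields) {
                for (String value : doc.get(field)) {
                    if (text.length() > 0) text.append(' ');
                    text.append(value);
                }
            }
            // Step 3: run the (configurable) analysis on that text.
            List<Annotation> annotations = engine.apply(text.toString());
            // Step 4: write feature values to the mapped document fields.
            for (Annotation a : annotations) {
                String target = typeToField.get(a.type);
                if (target != null) doc.add(target, a.featureValue);
            }
            // Step 5: pass the enriched document down the chain.
            if (next != null) next.process(doc);
        }
    }

    // Toy "analysis engine": emits one annotation per capitalized token.
    static List<Annotation> toyEngine(String text) {
        List<Annotation> out = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                out.add(new Annotation("Capitalized", token));
            }
        }
        return out;
    }

    static Doc runDemo() {
        Doc doc = new Doc();                       // step 1: incoming document
        doc.add("title", "Solr meets UIMA");
        doc.add("body", "annotations become fields");
        List<Doc> index = new ArrayList<>();       // pretend index writer
        Processor chain = new EnrichingProcessor(
                Arrays.asList("title", "body"),
                UimaUpdateChainSketch::toyEngine,
                Collections.singletonMap("Capitalized", "capitalized"),
                index::add);
        chain.process(doc);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(runDemo().fields);
    }
}
```

In this toy run the document ends up with an extra "capitalized" field
holding "Solr" and "UIMA", next to its untouched "title" and "body" fields.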

The aggregate Analysis Engine shipped with Solr uses some components from the
UIMA Sandbox, aka UIMA Addons (WhitespaceTokenizer, HMMTagger,
AlchemyAPIAnnotator, OpenCalaisAnnotator), to demonstrate some basic
enrichment capabilities. That can obviously be changed/extended as one wishes.

The current implementation runs UIMA pipelines in the simplest way an
external application can [1], but it would be good to add support for CPEs
and UIMA-AS, so anyone interested in helping with that is welcome.

Obviously, any feedback is welcome too :)

Tommaso

[1] :
http://uima.apache.org/d/uimaj-2.3.1/tutorials_and_users_guides.html#ugr.tug.application.using_aes

2011/4/4 Jörn Kottmann:

> Hi all,
>
> as some of you might already know, the new Solr 3.1 now has support for UIMA:
> http://wiki.apache.org/solr/SolrUIMA
>
> Jörn
>
