Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The "SolrUIMA" page has been changed by TommasoTeofili.
http://wiki.apache.org/solr/SolrUIMA?action=diff&rev1=10&rev2=11

--------------------------------------------------

  === Installation (deprecated as patch has been committed) ===
- 1. download the [[https://issues.apache.org/jira/secure/attachment/12467988/SOLR-2129-version-6.patch|SOLR-2129.patch]] file
- 2. go to your Lucene/Solr dev home and run 'patch -p0 < SOLR-2129.patch'
- 3. download the [[https://issues.apache.org/jira/secure/attachment/12455557/lib-jars.zip|lib-jars.zip]] file
- 4. go to dev/solr/contrib/uima and create a dir named lib
- 5. unpack the jars contained in lib-jars.zip inside the dev/solr/contrib/uima/lib directory
- 6. from the command line go to dev/solr/contrib/uima and run 'ant dist'
+ 1. Go to dev/solr/contrib/uima and run 'ant clean dist'
- 7. get the package apache-solr-uima-4.0-SNAPSHOT.jar from dev/solr/contrib/uima/build together with the jars under the dev/solr/contrib/uima/lib directory and paste everything inside one of the lib directories of your Solr instance (defined inside the solrconfig.xml).
+ 2. get the package apache-solr-uima-4.0-SNAPSHOT.jar from dev/solr/contrib/uima/build together with the jars under the dev/solr/contrib/uima/lib directory and paste everything inside one of the lib directories of your Solr instance (defined inside the solrconfig.xml).
- 8. modify your Solr instance config files as described in the dev/solr/contrib/uima/README.txt
+ 3. modify your Solr instance config files as described in the [[https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt|solr/contrib/uima/README.txt]]
- 9. run your Solr instance and enjoy UIMA enriching documents being indexed
+ 4. run your Solr instance and enjoy UIMA enriching documents being indexed

  === Configuration ===
@@ -51, +46 @@

  see [[https://issues.apache.org/jira/browse/SOLR-2129|SOLR-2129]]

- ==== UIMA components used ====
+ === UIMA components used ===
  UIMA supports the use of existing analysis engines (see [[http://uima.apache.org/sandbox.html|here]] and [[http://uima.apache.org/external-resources.html|here]]) as well as the creation of custom components.
  The current contrib/uima module uses a predefined set of components:
@@ -72, +67 @@

  }}}
  The first node represents an aggregate Analysis Engine which includes the Whitespace Tokenizer and HMM Tagger (recognizing sentences), the second node uses the Open Calais Annotator to extract named entities, and the following nodes use different Alchemy API Annotator services to detect keywords, language, document category, discovered concepts and named entities.
- ===== Using other UIMA components =====
+ ==== Using other UIMA components ====
  To use different UIMA components inside the contrib/uima module you need to:
   1. import the component jar
   2. change the descriptor inside config/uimaConfig/analysisEngine element of solrconfig.xml
   3. optionally adjust Analysis Engine configuration
   4. change the types and features' mapping inside config/uimaConfig/fieldMapping element of solrconfig.xml
- ====== Import the component jar ======
+ ===== Import the component jar =====
  If you're using Ant you only need to put the component jar inside the solr/contrib/uima/lib directory.
- If you're using Maven you need to declare the component you want to use inside the <dependencies> element in the generated pom.xml
+ If you're using Maven you need to declare the component you want to use inside the <dependencies> element in the generated pom.xml.
- ====== Change the descriptor ======
+ For example if you want to use UIMA Dictionary Annotator 2.3.1-SNAPSHOT you can either get it from [[https://repository.apache.org/content/repositories/snapshots/org/apache/uima/DictionaryAnnotator/2.3.1-SNAPSHOT/|snapshot repo]] and paste it in solr/contrib/uima/lib and run 'ant clean dist' or paste the following in the generated pom.xml (as child of the <dependencies> tag) and run 'mvn clean package'.
+ {{{
+ <dependency>
+   <groupId>org.apache.uima</groupId>
+   <artifactId>DictionaryAnnotator</artifactId>
+   <version>2.3.1-SNAPSHOT</version>
+ </dependency>
+ }}}
- ====== Adjust AE configuration (optional) ======
+ ===== Change the descriptor =====
+ Change the descriptor to be used by this module inside config/uimaConfig/analysisEngine of the solrconfig.xml of your Solr instance.
+
+ One can use the default one bundled inside the component or create a new one.
+
+ For example to use one of the default Dictionary Annotator Analysis Engine descriptors use the following (which runs Whitespace Tokenizer and then Dictionary Annotator):
+ {{{
+ <config>
+   ...
+   <uimaConfig>
+     ...
+     <analysisEngine>/AggregateAE.xml</analysisEngine>
+     ...
+   </uimaConfig>
+   ...
+ </config>
+ }}}
+
+
+ ===== Adjust AE configuration (optional) =====
+
- ====== Change the types and features' mapping ======
+ ===== Change the types and features' mapping =====
+
+
+ == Solrcas ==