[
https://issues.apache.org/jira/browse/STANBOL-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rupert Westenthaler resolved STANBOL-1422.
------------------------------------------
Resolution: Fixed
Fix Version/s: 1.0.0
Added support for IXA NERC with http://svn.apache.org/r1685056. The extensions
are provided by the new {{o.a.s.commons.ixa-pipe-nerc}} module. All extensions
are registered as OSGi services so that they can be found by OpenNLP.
> Add support for ixa-nerc NER models
> -----------------------------------
>
> Key: STANBOL-1422
> URL: https://issues.apache.org/jira/browse/STANBOL-1422
> Project: Stanbol
> Issue Type: Bug
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 1.0.0
>
>
> The ixa-pipe-nerc [1] project provides good quality Named Entity Recognition
> models for English, Spanish, Dutch, German and Italian. However, to use those
> models one needs:
> * OpenNLP 1.6.0
> * the OpenNLP extensions provided by the ixa-pipe-nerc module.
> OpenNLP 1.6.0 is not yet released, so we will need to go with a SNAPSHOT
> version for now. The ixa-pipe-nerc module does not support OSGi, so we will
> need to embed the required classes into a bundle and provide a bundle
> activator that registers the extensions as OSGi services (with the metadata
> expected by OpenNLP).
> NOTE: This issue only covers the extensions to Apache Stanbol that are needed
> to use the provided models. To use the models, users will need to download
> the ~700 MB archive linked on [1], extract the OpenNLP models (*.bin files)
> and put them into the datafiles folder of Apache Stanbol.
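The manual step described in the note (pick the *.bin files out of the downloaded archive and place them in the datafiles folder) could be scripted roughly as follows. This is only a sketch: it assumes the download is a tar archive (optionally compressed), and the function name `extract_model_files` as well as the paths in the usage comment are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def extract_model_files(archive_path, datafiles_dir):
    """Copy every *.bin member of a tar archive into datafiles_dir.

    Assumption: the downloaded model archive is a (possibly compressed) tar file.
    Returns the sorted list of copied file names.
    """
    target = Path(datafiles_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    # Mode "r" lets tarfile detect the compression transparently
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".bin")):
                continue
            source = archive.extractfile(member)
            if source is None:
                continue
            # Keep only the base name; the directory layout inside the archive is dropped
            destination = target / Path(member.name).name
            with source, open(destination, "wb") as out:
                shutil.copyfileobj(source, out)
            copied.append(destination.name)
    return sorted(copied)


# Hypothetical usage; both paths are placeholders:
# extract_model_files("nerc-models.tgz", "stanbol/datafiles")
```

Flattening to base names is a deliberate choice here, on the assumption that the models are expected directly inside the datafiles folder rather than in sub-directories.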
> The models use PER, ORG, LOC and MISC as types, so a configuration like the
> following for the CustomNERModelEnhancementEngine should do the trick:
> {code}
> # Configuration of org.apache.stanbol.enhancer.engines.opennlp.impl.CustomNERModelEnhancementEngine-ixa_nec.config
> stanbol.engines.opennlp-ner.typeMappings=["PER\ >\ http://dbpedia.org/ontology/Person","ORG\ >\ http://dbpedia.org/ontology/Organisation","LOC\ >\ http://dbpedia.org/ontology/Place","MISC\ >\ skos:Concept"]
> stanbol.enhancer.engine.name="ixa-nerc"
> stanbol.engines.opennlp-ner.nameFinderModels=["de-clusters-dictlbj-conll03.bin","en-91-18-4-class-muc7-conll03-ontonotes-4.0.bin","es-clusters-dictlbj-conll02.bin","it-clusters-evalita09.bin","nl-clusters-dictlbj-conll02.bin","eu-clusters-egunkaria.bin"]
> {code}
> The names of the OpenNLP model files are the values of the
> {{stanbol.engines.opennlp-ner.nameFinderModels}} property. You will find
> those files in the NERC-Models 1.5.0 archive. See the documentation on [1]
> for more details and other options.
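To make the shape of the typeMappings values in the configuration above explicit: each entry pairs a model tag with a target type, separated by ">". The sketch below parses such entries into a dictionary. It is an illustration only, not Stanbol's own parser; the name `parse_type_mappings` is hypothetical, and it assumes the backslashes in the .config file merely escape the spaces.

```python
def parse_type_mappings(entries):
    """Turn 'TAG > TYPE' strings into a {tag: type} dictionary."""
    mappings = {}
    for entry in entries:
        # Assumption: "\ " in the .config syntax is just an escaped space
        cleaned = entry.replace("\\ ", " ")
        tag, separator, target = cleaned.partition(">")
        if not separator or not tag.strip() or not target.strip():
            raise ValueError(f"not a 'TAG > TYPE' mapping: {entry!r}")
        mappings[tag.strip()] = target.strip()
    return mappings


# The four mappings from the configuration, written as Python string literals
TYPE_MAPPINGS = [
    "PER\\ >\\ http://dbpedia.org/ontology/Person",
    "ORG\\ >\\ http://dbpedia.org/ontology/Organisation",
    "LOC\\ >\\ http://dbpedia.org/ontology/Place",
    "MISC\\ >\\ skos:Concept",
]

if __name__ == "__main__":
    for tag, target in parse_type_mappings(TYPE_MAPPINGS).items():
        print(f"{tag} -> {target}")
```

The prefixed value `skos:Concept` is kept as written; resolving the prefix to a full URI is left out of this sketch.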
> [1] https://github.com/ixa-ehu/ixa-pipe-nerc/
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)