Re: IXA pipe nerc integration with Apache Stanbol

Rodrigo Agerri Fri, 19 Jun 2015 02:45:04 -0700

Hello Rupert,

On Thu, Jun 18, 2015 at 10:00 AM, Rupert Westenthaler
<[email protected]> wrote:
> Hi Rodrigo
>
> I took your mail from the contact section on GitHub. Feel free to
> forward this to the whole team if you like.


I am the team :)

>
> First I want to thank the whole IXA pipes team for providing all
> those extensions and especially the high-quality NER models for OpenNLP.
> As a German-speaking person I especially enjoy using the German NER
> model a lot.

Glad to hear, thanks.

>
> With this message I want to let you know about the integration of the IXA
> pipe nerc module with Apache Stanbol (see STANBOL-1422 [1]).

I am very happy to hear that a nice project such as Stanbol is
interested in using it.

> The main contribution of this integration is code that allows your
> models for, and extensions to, OpenNLP to be used in applications
> running within an OSGi environment (such as Apache Stanbol).

I am also an Apache OpenNLP committer. I develop and use the ixa pipes
for my academic research and our own university projects. I see
ixa-pipes as ready-to-use tools, whereas I see OpenNLP as a library for
building your own NLP projects via its API, which is what I do for some
of the pipes, namely the NER. As you know, ixa-pipe-nerc uses the
machine learning components of OpenNLP (which are very nice IMHO) to
provide customized models built on top of the OpenNLP infrastructure.
Besides contributing in other ways, whenever it is feasible I try to
commit any development specific to the ixa pipes back to OpenNLP. For
example, the clustering features implemented in ixa-pipe-nerc, which
are responsible for the good performance of the models, have been
contributed to OpenNLP [1], [2], [3]. Other aspects, such as the
ixa-pipe-nerc dictionary features, have not been contributed because
OpenNLP already implements its own (although different)
DictionaryFeatures.

To cut a long story short, I have no problem whatsoever in helping
with the OSGi work to make it easier for Stanbol to integrate the
ixa-pipe-nerc models. However, following from what I said above, I can
also train NERC models in OpenNLP native mode with a configuration
similar to the ones I distribute in ixa-pipe-nerc; "similar" because
some of the features, namely gazetteers, are not implemented in
OpenNLP.

Please let me know which option would be more interesting for the
Apache Stanbol project.

[1] https://issues.apache.org/jira/browse/OPENNLP-714
[2] https://issues.apache.org/jira/browse/OPENNLP-715
[3] https://issues.apache.org/jira/browse/OPENNLP-716

Best,

Rodrigo
