Hello Rupert,

On Thu, Jun 18, 2015 at 10:00 AM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:
> Hi Rodrigo
>
> I took you mail from the contact section of the Github. Feel free to
> forward this to the whole team if you like.

I am the team :)

>
> First I want to thank the whole IXA pipes team team for providing all
> those extensions and especially high quality NER models for OpenNLP.
> As a German speaking person I especially enjoy using the German NER
> model a lot.

Glad to hear, thanks.

>
> With this message I want let you know about the integration of the IXA
> pipe nerc module with Apache Stanbol (see STANBOL-1422 [1]).

I am very happy to hear that a nice project such as Stanbol is interested in using it.

> The mein contribution of this Integration is code that allows to use
> your models for and extensions to OpenNLP to be used in applications
> running within an OSGI environment (such as Apache Stanbol).

I am also an Apache OpenNLP committer. I develop and use the ixa pipes for my academic research and our own university projects. I see ixa-pipes as ready-to-use tools, whereas I see OpenNLP as a library for building your own NLP projects via its API, which is what I do for some of the pipes, namely the NER. As you know, ixa-pipe-nerc uses the machine learning components of OpenNLP (which are very nice IMHO) to provide customized models built on top of the OpenNLP infrastructure.

Beyond that, whenever it is feasible I try to contribute developments specific to the ixa pipes back to OpenNLP, in addition to contributing in other ways, of course. For example, the clustering features implemented in ixa-pipe-nerc, which are responsible for the good performance of the models, have been contributed to OpenNLP [1], [2], [3]. Other aspects, such as the ixa-pipe-nerc dictionary features, have not been contributed because OpenNLP already implements DictionaryFeatures (although they are different).

To cut a long story short, I have no problem whatsoever helping with the OSGi work to make it easier for Stanbol to integrate the ixa-pipe-nerc models.

However, following from what I said above, I can also train NERC models in OpenNLP native mode with a configuration similar to the ones I distribute in ixa-pipe-nerc; "similar" because some of the features, namely gazetteers, are not implemented in OpenNLP. Please let me know which option would be more interesting for the Apache Stanbol project.

[1] https://issues.apache.org/jira/browse/OPENNLP-714
[2] https://issues.apache.org/jira/browse/OPENNLP-715
[3] https://issues.apache.org/jira/browse/OPENNLP-716

Best,
Rodrigo