I had a quick look at the paper. For English NER you might also want to
publish OntoNotes models. OpenNLP has format support for that.
It might also be interesting for you to contribute the work you did on
the tokenizer or the coreference component to OpenNLP.
Jörn
On 03/19/2014 01:59 PM, Rodrigo Agerri wrote:
Hi,
We have new 1.5.3 models for POS tagging, NER (CoNLL 2002) and parsing
(AnCora), with evaluations and so on, as part of the IXA pipeline tools.
We also have a tokenizer based on a JFlex specification (we tried the
OpenNLP models, but they were not adaptable enough). Coreference
resolution (loosely based on the Stanford NLP approach) is coming very
soon (in May).
More info here:
http://www.rodrigoagerri.net/recent-papers/ixa-pipes.pdf?attredirects=0&d=1
Thanks,
Rodrigo
On 2014/03/19 at 12:39, Charles Jalin wrote:
For the tokenizer, sentence detector, POS tagger and chunker.
I am not sure that I can obtain Spanish corpora.
Thanks.
2014-03-19 12:08 GMT+01:00 Jörn Kottmann <[email protected]>:
On 03/19/2014 12:01 PM, Charles Jalin wrote:
How do I do this?
That depends on the model. For which component?
Anyway, the best way to improve the situation would be to add support
to OpenNLP for training on the available Spanish corpora.
Jörn