Re: Models in spanish

Jörn Kottmann Wed, 19 Mar 2014 06:33:01 -0700

I had a short look at the paper. For English NER you might also want to
publish OntoNotes models. OpenNLP has format support for that.

Maybe it could be interesting for you to contribute the work you did on the
tokenization or coref component to OpenNLP.

Jörn

On 03/19/2014 01:59 PM, Rodrigo Agerri wrote:

Hi,

We have new 1.5.3 models for POS tagging, NER (CoNLL 2002) and parsing (AnCora),
with evaluations, as part of the IXA pipeline tools.

We also have a tokenizer based on a JFlex specification (we tried the OpenNLP
models and they were not adaptable enough). Coreference resolution (loosely
based on the Stanford NLP approach) is coming very soon (in May).

More info here:

http://www.rodrigoagerri.net/recent-papers/ixa-pipes.pdf?attredirects=0&d=1

Thanks,

Rodrigo

On 2014/03/19 at 12:39, Charles Jalin wrote:

For the tokenizer, sentence detector, POS tagger and tokchunk.

I am not sure that I can obtain Spanish corpora.

Thanks.


2014-03-19 12:08 GMT+01:00 Jörn Kottmann:

On 03/19/2014 12:01 PM, Charles Jalin wrote:

How do I do this?

Depends on the model. For which component?

Anyway, the best way to improve the situation would be to
add support to OpenNLP for training on the available Spanish corpora.

Jörn
