Hi all,

I am proposing that we move the coref component into the sandbox until we manage to train and test it on a publicly available dataset. In its current state the code is hard to maintain: without training it cannot be tested properly, which makes larger changes to OpenNLP, such as the maxent refactoring, difficult.

I tried to implement parsers for the MUC corpus and added training code, but the result does not yet work as well as the current models on SourceForge. More work is needed to get everything fixed.

Additionally, the code should be refactored to match the other components in OpenNLP, e.g. a single model instantiation, built-in evaluation, simple training, etc. There is a Jira issue with all the details.

Any opinions?

Jörn
