On 06/13/2012 07:07 PM, Carlos Scheidecker wrote:
Thanks. So for now we can only use the models from 1.4. I saw that a
training class was added recently. How do you use that?
That's still a work in progress. Which data do you want to train on?
You need to produce data in a certain format; there should be a sample
in the test folder.
It's basically Penn Treebank style plus some nodes to label the mentions
in the tree.
The parse trees of a document are grouped and sent document-wise
to the trainer via a stream. After this is done, a new model will be trained.
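
To illustrate the grouping step, here is a minimal Java sketch. It does not
use any OpenNLP API. The class name DocumentParseGrouper is hypothetical, and
it assumes one bracketed parse per line with a blank line between documents;
the actual layout of the sample in the test folder may differ. The example
trees are plain Penn Treebank style and leave out the extra mention nodes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP.
// Assumption: one bracketed parse per line, blank line between documents.
class DocumentParseGrouper {

    // Returns one list of parse strings per document, in input order.
    static List<List<String>> group(BufferedReader reader) throws IOException {
        List<List<String>> documents = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                // A blank line closes the current document, if it has any parses.
                if (!current.isEmpty()) {
                    documents.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        // Keep the last document when the input does not end with a blank line.
        if (!current.isEmpty()) {
            documents.add(current);
        }
        return documents;
    }
}
```

Each inner list would then be handed over as one document, for example
"(TOP (S (NP (NNP John)) (VP (VBD slept))))" followed by the next sentence
of the same text.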
The OpenNLP coreferencer currently works only on noun phrases; other mentions
like verbs will not be resolved (in case you want to train on OntoNotes).
Jörn