[ 
https://issues.apache.org/jira/browse/OPENNLP-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487685#comment-13487685
 ] 

Joern Kottmann commented on OPENNLP-543:
----------------------------------------

The Corpus Server hosts the data as UIMA CASes serialized in XMI. To train 
OpenNLP on a corpus you have to use the UIMA integration, which works 
nicely. The UIMA integration can also generate the training data in the OpenNLP 
format, which is handy if you just want to look at the data or fine-tune 
the training parameters.
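
For a rough idea of what such generated training data looks like, here is a 
minimal sketch that reads one line in the name finder training format. It 
assumes the name finder is the component being trained (the comment above does 
not say which one), one sentence per line with whitespace-separated tokens, and 
entities wrapped in <START:type> ... <END>. The function name parse_line and 
the sample sentence are illustrative only and are not part of OpenNLP.

```python
import re

# Assumption: name finder training format, one sentence per line,
# whitespace-separated tokens, entities marked <START:type> ... <END>.
START = re.compile(r"^<START(?::([^>]+))?>$")
END = "<END>"


def parse_line(line):
    """Return (tokens, spans); spans are (begin, end, type) token offsets,
    with end exclusive. parse_line is a hypothetical helper, not an OpenNLP API."""
    tokens, spans = [], []
    begin, kind = None, None
    for item in line.split():
        match = START.match(item)
        if match:
            if begin is not None:
                raise ValueError("nested <START> is not allowed")
            # A bare <START> without a type is recorded as "default".
            begin, kind = len(tokens), match.group(1) or "default"
        elif item == END:
            if begin is None:
                raise ValueError("<END> without matching <START>")
            spans.append((begin, len(tokens), kind))
            begin, kind = None, None
        else:
            tokens.append(item)
    if begin is not None:
        raise ValueError("unclosed <START>")
    return tokens, spans


sample = "<START:person> Pierre Vinken <END> , 61 years old , will join the board ."
tokens, spans = parse_line(sample)
print(tokens)
print(spans)
```

Other components (tokenizer, sentence detector, POS tagger, and so on) use 
different line formats, so this sketch only covers the name finder case.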
                
> Documentation of OpenNLP Training Format
> ----------------------------------------
>
>                 Key: OPENNLP-543
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-543
>             Project: OpenNLP
>          Issue Type: Bug
>            Reporter: Marc Schreiber
>
> Is there any documentation about the training formats which OpenNLP supports?
> I'm working on a project where we need our own models because the project 
> concentrates on specific domains. It would be really great if there were any 
> help for building your own models. 
> If there is no documentation, I would offer my help in creating it, 
> but I need someone to help me with the training formats.
