On 2/3/11 10:05 PM, David Juckett wrote:
I am relatively new to NLP and am having difficulty finding specifications
for the OpenNLP data structure for the model files. The bin.gz format is
opaque to interpretation. I'd rather not have to go digging into the
source code to figure it out. I was hoping for a simple specification.
Could you please direct me to any documentation that you know of which
describes the formation of these model files, their format, and how they are
used in the annotators after training?
It is documented in the source code. Just have a look inside the maxent
project, e.g. at GISModelWriter or PerceptronModelWriter.
We are still a little sparse on documentation. If you think it is
important, please send us a patch to the docbook project.
Most of the tools components use a zip package which contains all the
files they need; typically these are Maxent/Perceptron models, dictionaries
and other resources.
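
As a rough illustration of the zip packaging described above, here is a
minimal Java sketch that lists the entries of such a package using only
java.util.zip. The class name ModelPackageLister and the listEntries helper
are hypothetical names for this example, not part of OpenNLP. It assumes the
model file is a plain zip archive and makes no claim about what the entries
inside are called.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration only; not an OpenNLP class.
public class ModelPackageLister {

    // Reads the stream as a zip archive and returns the entry names in
    // the order they are stored.
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ModelPackageLister <model-file>");
            return;
        }
        // Assumption: args[0] points at a zip-packaged model file.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String name : listEntries(in)) {
                System.out.println(name);
            }
        }
    }
}
```

Run it with the path to a model file as the only argument; each entry name
is printed on its own line, which shows which models, dictionaries and other
resources a given package carries.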
Jörn