Training Parameter passing

Jörn Kottmann Tue, 17 May 2011 07:23:54 -0700

Hi all,

as proposed earlier, I think we should go ahead and define/implement the training parameters format and classes. We need to define the format and decide how we change our current training implementation.


I believe it should be part of OpenNLP Tools and not the maxent package, for two reasons: first, it should be possible to define parameters for different models, whereas maxent only deals with one model at a time; second, the new API does not depend on maxent (which will be replaced with opennlp-ml).

The parser contains multiple models; maybe someone wants to train one of them with perceptron and the other with maxent, or experiment with cutoff and iterations for a certain model.

I propose that we simply use a Java properties file.

For the name finder it could look like this:
Algorithm=MAXENT
Iterations=150
Cutoff=4

Or for the parser:
build.Algorithm=MAXENT
build.Iterations=180
build.Threads=4
check.Algorithm=MAXENT
check.Iterations=120
check.Threads=2
tagger.Algorithm=PERCEPTRON
tagger.Iterations=130
tagger.Cutoff=0
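
As a rough illustration of how such a file could be read, here is a minimal sketch using the standard java.util.Properties class. The class name ParamsFileSketch and the parse helper are made up for this example and are not part of OpenNLP; the text is parsed from a string only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not an OpenNLP class.
class ParamsFileSketch {

    // Parses text in Java properties format into a Properties object.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "build.Algorithm=MAXENT",
            "build.Iterations=180",
            "tagger.Algorithm=PERCEPTRON",
            "tagger.Cutoff=0");
        Properties props = parse(text);
        System.out.println(props.getProperty("build.Iterations"));
        System.out.println(props.getProperty("tagger.Algorithm"));
    }
}
```

For a real file, the same load call would take a Reader or InputStream opened on the file instead of a StringReader.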

The maxent package will provide a small util which can validate the parameters for a certain algorithm and then do the training according to the parameters.

That could look like this:
isValid(Map<String, String> params);
train(Map<String, String> params, EventStream events)
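
To make the validation half of this more concrete, here is one possible sketch of isValid. The class name TrainUtilSketch and the concrete rules (the set of known algorithms, the allowed numeric ranges, treating missing keys as "use the default") are assumptions for illustration only and are not decided by this proposal. The train method is left out because it depends on EventStream. The sketch uses Set.of and Map.of, so it needs Java 9 or later.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical util, not an existing maxent class.
class TrainUtilSketch {

    // Assumed set of algorithm names, taken from the examples in this mail.
    private static final Set<String> ALGORITHMS = Set.of("MAXENT", "PERCEPTRON");

    // Returns true if every parameter that is present has an acceptable value.
    // A missing parameter is treated as valid (a default would apply).
    public static boolean isValid(Map<String, String> params) {
        String algorithm = params.get("Algorithm");
        if (algorithm != null && !ALGORITHMS.contains(algorithm)) {
            return false;
        }
        return isIntAtLeast(params.get("Iterations"), 1)
            && isIntAtLeast(params.get("Cutoff"), 0)
            && isIntAtLeast(params.get("Threads"), 1);
    }

    // True if value is absent, or parses as an integer that is >= min.
    private static boolean isIntAtLeast(String value, int min) {
        if (value == null) {
            return true;
        }
        try {
            return Integer.parseInt(value) >= min;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> params =
            Map.of("Algorithm", "MAXENT", "Iterations", "150", "Cutoff", "4");
        System.out.println(isValid(params));
    }
}
```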

Depending on the model which should be trained, the training parameters can be reduced by providing a namespace.

To train the build model in the sample above, the following would be done:
TrainingParameters.getParams("build");
which returns a Map<String, String> with this content:
Algorithm=MAXENT
Iterations=180
Threads=4

and the map is passed to the train method to train the model based on the provided event stream.
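
The namespace reduction described above could be sketched roughly as follows. NamespaceSketch and the static getParams(Map, String) signature are hypothetical and only illustrate the prefix filtering; the real class could just as well hold the parameters internally and take only the namespace.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the namespace lookup, not the final API.
class NamespaceSketch {

    // Returns all entries whose key starts with "<namespace>.",
    // with that prefix removed from the key.
    public static Map<String, String> getParams(Map<String, String> all, String namespace) {
        String prefix = namespace + ".";
        Map<String, String> result = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<String, String> entry : all.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                result.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> all = new HashMap<>();
        all.put("build.Algorithm", "MAXENT");
        all.put("build.Iterations", "180");
        all.put("build.Threads", "4");
        all.put("check.Algorithm", "MAXENT");
        all.put("tagger.Algorithm", "PERCEPTRON");

        // prints {Algorithm=MAXENT, Iterations=180, Threads=4}
        System.out.println(getParams(all, "build"));
    }
}
```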


Any opinions?

I am +1 to do this change for 1.5.2, but we need to maintain strict backward compatibility.


Jörn
