Hi,

I'm fairly new to OpenNLP, and I'm interested in the name finder functionality.
The bundled organization model works reasonably well for me, but not well 
enough, so I decided to train my own model. However, I can't get stable 
results. I would appreciate it if anybody could answer a few 
questions:

1) What are the characteristics of a good training data set? I have a training 
data generator that injects many different organization names into a set of 
predefined sentences.

2) I guess I need to implement adaptive feature generators? Is there good 
documentation on how to do so, or maybe even a book? A description of how the 
name finder works would definitely be useful.

3) What characteristics should I base the number of iterations and the 
cutoff on?

4) Can I train a single model for several languages at once?
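
To make question 1 concrete, here is a minimal sketch of that kind of 
generator. It is not my actual code: the class name, the {ORG} placeholder, 
and the sample sentences and organization names are made up for illustration. 
The only part taken from OpenNLP is the output markup, one sentence per line 
with whitespace-separated tokens and names wrapped in 
<START:organization> ... <END>.

```java
import java.util.ArrayList;
import java.util.List;

public class TrainingDataSketch {

    // Hypothetical placeholder marking where an organization name is injected.
    static final String SLOT = "{ORG}";

    // Replaces the placeholder with the name wrapped in name finder markup.
    static String annotate(String template, String organization) {
        return template.replace(SLOT, "<START:organization> " + organization + " <END>");
    }

    // Produces one annotated training line per (template, organization) pair.
    static List<String> generate(List<String> templates, List<String> organizations) {
        List<String> lines = new ArrayList<>();
        for (String template : templates) {
            for (String organization : organizations) {
                lines.add(annotate(template, organization));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Illustrative templates; tokens are already separated by spaces.
        List<String> templates = List.of(
            "Yesterday {ORG} announced a new product .",
            "She has worked at {ORG} since 2010 .");
        // Illustrative organization names.
        List<String> organizations = List.of("Acme Corp", "Globex");

        generate(templates, organizations).forEach(System.out::println);
    }
}
```

With only a handful of templates, every generated sentence has the same 
context around the name, which may be part of why my results are unstable.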

Any other suggestions/pointers are highly appreciated.

Thanks a lot in advance,
Vyacheslav
 
