Thanks, now I understand it better. I think this information should go into the docs, because the current documentation does not explain these values very clearly.

2012/7/7 James Kosin <[email protected]>

> Daniel,
>
> The cutoff value is for filtering out one-time or special cases that may
> hinder training. With the cutoff set to 5, training data that doesn't
> occur at least 5 times in the same context is ignored. This happens
> when the data is filtered and the number of outcomes the model will be
> trained to find is determined.
>
> The iterations value is how many passes through the training set the
> trainer will attempt before stopping. With the Maxent models, keeping
> the number of iterations small helps train faster. The right value
> really depends on the model generated and its type. Most of the testing
> is done with 100 iterations just to get the training done quickly,
> because the training data sets used are sometimes large. The key
> figures are the two numbers printed for each run (iteration); they
> indicate how close the model is to being trained on the data set. Be
> careful: the trick is to train the models without getting them to
> memorize the training set, because the training set is only a snapshot
> in time (most of the current training data is limited to news
> articles). But too few iterations is also bad.
>
> There is also another stopping point: when the model is completely
> trained and knows the training set. This happens when the numbers reach
> their optimal values... my guess is that this is when the model
> predicts the training set perfectly.
>
> James
>
>
> On 7/5/2012 6:04 AM, Daniel wrote:
> > When I am training nameFinders, I usually use 500 iterations and a
> > cutoff of 5, but I really don't know why I should use these values
> > rather than others. Can anybody tell me something about these
> > parameters?
> >
> > Thanks!
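
For anyone finding this thread later, here is a minimal Python sketch of the two ideas James describes: dropping features that occur fewer than `cutoff` times, and capping the number of training passes while stopping early once the score stops improving. It only illustrates the concepts and is not OpenNLP's actual implementation. The names `apply_cutoff` and `run_iterations`, the event layout, and the `tolerance` threshold are all hypothetical.

```python
from collections import Counter


def apply_cutoff(events, cutoff):
    """Drop context features seen fewer than `cutoff` times across all events.

    `events` is a list of (outcome, [feature, ...]) pairs (an assumed layout).
    Events left with no features are discarded, since nothing remains to
    train on.
    """
    counts = Counter(f for _, feats in events for f in feats)
    kept = []
    for outcome, feats in events:
        frequent = [f for f in feats if counts[f] >= cutoff]
        if frequent:
            kept.append((outcome, frequent))
    return kept


def run_iterations(step, max_iterations, tolerance=1e-4):
    """Call `step()` at most `max_iterations` times.

    `step` performs one pass over the training data and returns a score
    (for example a log-likelihood). Training stops early once the score
    changes by less than `tolerance` (a hypothetical threshold) between
    two consecutive passes. Returns the number of passes actually run.
    """
    previous = None
    for i in range(1, max_iterations + 1):
        score = step()
        if previous is not None and abs(score - previous) < tolerance:
            return i
        previous = score
    return max_iterations
```

With a cutoff of 2, a feature like `w=john` that appears once is dropped, while a feature like `cap` that appears twice is kept. A larger iteration limit only matters if the score is still changing when the limit is reached.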
