The cutoff is the minimum number of times a feature has to occur in
the training data before it is included in the model. This reduces
noise: you don't want features that occur only once to appear in the
model. You might reduce the cutoff if you have a small training set.
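
To make the cutoff concrete, here is a minimal sketch in Python. It is
not OpenNLP's actual code; the function name apply_cutoff and the
feature strings are made up for illustration. It counts how often each
feature occurs across the training events and keeps only those seen at
least `cutoff` times.

```python
from collections import Counter

def apply_cutoff(events, cutoff=5):
    """Return the set of features that occur at least `cutoff` times.

    `events` is a list of feature lists, one list per training event.
    Hypothetical helper for illustration only, not part of OpenNLP.
    """
    # Count every feature occurrence across all training events.
    counts = Counter(feature for features in events for feature in features)
    # Keep only the features that are frequent enough.
    kept = {feature for feature, n in counts.items() if n >= cutoff}
    return kept, counts

# Three toy training events with made-up feature names.
events = [
    ["w=London", "cap"],
    ["w=Paris", "cap"],
    ["w=London", "cap"],
]
kept, counts = apply_cutoff(events, cutoff=2)
print(sorted(kept))
```

With cutoff=2 the feature "w=Paris" is dropped because it occurs only
once. With cutoff=1 it would be kept, which is why a lower cutoff can
help when the training set is small.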

As for iterations=100: "The iterations parameter can largely be ignored,
but as the model trains, it'll output for each step of these 100
iterations." (Taming Text, Ingersoll et al., p. 134).
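
To show what "output for each step" means, here is a toy trainer in
Python. It is only a sketch: it uses plain gradient ascent on a
logistic model as a stand-in, which is an assumption for illustration
and not the algorithm OpenNLP runs. The function name train, the
learning rate, and the feature names are all hypothetical. Each
iteration makes one pass over the data, adjusts the weights, and
prints one line.

```python
import math

def train(examples, iterations=100, learning_rate=0.5):
    """Toy trainer: one weight update and one printed line per iteration.

    `examples` is a list of (features, label) pairs with label 0 or 1.
    Illustrative stand-in only, not OpenNLP's training algorithm.
    """
    weights = {}
    history = []
    for step in range(1, iterations + 1):
        log_likelihood = 0.0
        gradient = {}
        for features, label in examples:
            score = sum(weights.get(f, 0.0) for f in features)
            prob = 1.0 / (1.0 + math.exp(-score))
            log_likelihood += math.log(prob if label == 1 else 1.0 - prob)
            # Accumulate the gradient of the log-likelihood per feature.
            for f in features:
                gradient[f] = gradient.get(f, 0.0) + (label - prob)
        # Move each weight a small step in the direction of the gradient.
        for f, g in gradient.items():
            weights[f] = weights.get(f, 0.0) + learning_rate * g / len(examples)
        history.append(log_likelihood)
        print(f"{step}: loglikelihood={log_likelihood:.4f}")
    return weights, history

examples = [
    (["cap"], 1),
    (["lower"], 0),
    (["cap", "title"], 1),
    (["lower"], 0),
]
weights, history = train(examples, iterations=10)
```

The printed value improves a little on every line. More iterations
give the model more passes to fit the training data, so 80 versus 120
produces slightly different weights, which matches the differences you
are seeing in your model.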

On Sat, Oct 4, 2014 at 5:35 AM, nikhil jain
<[email protected]> wrote:
> Hi,
> I am using OpenNLP Token Name Finder module for parsing some documents. For 
> generating the model, I am using below command line which is mentioned in the 
> documentation page and working fine.
> opennlp TokenNamefinderTrainer -model <model_name> -lang en -data <training 
> file> -encoding UTF-8
> My question is, can someone explains me the signification of -iterations and 
> -cutoff parameters in layman terms because when i am modifying these 
> parameters by putting these parameters in my command line and give some 
> different values like for iterations 80 or 120, similarly 20 or 40 to cutoff, 
> I can see the difference in my model but I do not understand what is 
> happening exactly. I know default is 100 for iterations and 5 for cutoff.
>
>
> BTW, I am new in machine learning and natural language processing.
> Please explain me with example.
> Thanks in advance.
> Nikhil Jain



-- 
_________________________________________
johnmiedema.com
