Hi William,
Thanks for the link.
I have tried both the Maxent and Perceptron models on my problem, and the
Perceptron is working much better than Maxent.
I have one question: when I create a Perceptron model with a cutoff of 5 and
100 iterations, the model stops after the 5th iteration and does not continue
with further iterations. Is this correct behaviour, or am I doing something
wrong?
Adding my code and the training log below for reference.
ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
TokenNameFinderModel model = null;

TrainingParameters tp = new TrainingParameters();
//tp.put(TrainingParameters.ALGORITHM_PARAM, "MAXENT");
tp.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
System.out.println("244:Hybrid parser:PERCEPTRON");
tp.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(100));
tp.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(5));
tp.put("Threads", "3");

// No custom feature generator or resources; use the defaults.
opennlp.tools.util.featuregen.AdaptiveFeatureGenerator generator = null;

try {
    Map<String, Object> resources = null;
    model = NameFinderME.train("en", "security", sampleStream, tp, generator, resources);
} catch (IOException e) {
    e.printStackTrace();
}

Training log:
Indexing events using cutoff of 5
Computing event counts... done. 8209384 events
Indexing... done.
Collecting events... Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 8209384
Number of Outcomes: 34
Number of Predicates: 325780
Computing model parameters...
Performing 100 iterations.
1: . (8209184/8209384) 0.999975637636149
2: . (8209291/8209384) 0.9999886715008093
3: . (8209340/8209384) 0.9999946402799528
4: . (8209356/8209384) 0.9999965892690609
5: . (8209357/8209384) 0.9999967110808802
Stopping: change in training set accuracy less than 1.0E-5
Stats: (8104703/8209384) 0.9872486169486042
...done.
Compressed 325780 parameters to 3957
532 outcome patterns
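Also, is the 1.0E-5 threshold in the "Stopping" line above something I can
control from my code? I have not verified this, but if the perceptron trainer
honours a "Tolerance" training parameter in the version I am using (that key
name is only my assumption), I would try something like:

// Assumption (not verified): the perceptron trainer reads a "Tolerance"
// training parameter that controls the "change in training set accuracy"
// threshold shown in the log above (which looks like a default of 1.0E-5).
TrainingParameters tp = new TrainingParameters();
tp.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
tp.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(100));
tp.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(5));
tp.put("Tolerance", Double.toString(1.0E-7)); // smaller threshold -> more iterations before stopping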
Thanks
Nikhil
From:"William Colen" <[email protected]>
Date:Fri, May 29, 2015 at 5:47 PM
Subject:Re: OpenNLP: Named Entity Recognition ( Token Name Finder )
The answer about the differences would be quite long. You can learn about
the theory by researching online. Try some papers from here:
https://cwiki.apache.org/confluence/display/OPENNLP/NLP+Papers
Which algorithm is better for you depends on your task and your data. You
can start developing with the standard Maxent and, once your environment is
ready, try the other ML implementations.
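
For example, switching trainers is just a matter of the ALGORITHM_PARAM
value in TrainingParameters, so the rest of your training code can stay the
same (a minimal sketch):

TrainingParameters params = new TrainingParameters();
// Start with the default Maxent trainer ...
params.put(TrainingParameters.ALGORITHM_PARAM, "MAXENT");
// ... and later switch to the perceptron by changing only this value:
// params.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
params.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(100));
params.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(5));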
Regards,
William
2015-05-29 7:07 GMT-03:00 nikhil jain <[email protected]>:
> Hello,
>
>
> Did anyone get a chance to look at this email? I know I am asking a very
> basic question, but being new to this subject, it is very difficult to
> understand the terms.
>
>
> I tried to understand by reading the wiki pages, but I did not fully
> understand them, which is why I raised the question.
>
>
> Thanks
>
> Nikhil
>
>
> From:"nikhil jain" <[email protected]>
> Date:Tue, May 19, 2015 at 11:51 PM
> Subject:OpenNLP: Named Entity Recognition ( Token Name Finder )
>
> Hello Everyone,
>
>
> I was reading the OpenNLP documentation and found that OpenNLP supports
> Maxent, Perceptron, and Perceptron Sequence model types.
>
>
> Could someone please explain the difference between them?
>
>
> I am trying to understand which one would be good for tagging sequences
> of data.
>
>
> BTW, I am new to NLP and machine learning, so please help me understand
> this.
>
>
> Thanks
>
> Nikhil Jain
>