[
https://issues.apache.org/jira/browse/OPENNLP-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025022#comment-13025022
]
Jason Baldridge commented on OPENNLP-155:
-----------------------------------------
2011/4/22 Jörn Kottmann (JIRA) <[email protected]>
> Correct.
Great! The basic problem was that the model was probably stopping too soon
because it failed to compute the actual accuracy on the training set.
It thought it was doing very well, when in fact the reported accuracy was
inflated by the sorted order in which examples were presented to the
training algorithm.
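To make the failure mode concrete, here is a toy demonstration (my own sketch, not OpenNLP code): a bias-only binary perceptron trained on label-sorted data. The "running" accuracy counted while updating looks near-perfect, because the model keeps adapting to the current block of identical labels, but re-scoring with the final weights shows it learned almost nothing.

```java
// Hypothetical illustration: running accuracy vs. final accuracy
// when examples are sorted by label (all -1 first, then all +1).
public class SortedAccuracyDemo {
    // Predict +1 when the single bias weight is positive, else -1.
    static int predict(double w) { return w > 0 ? 1 : -1; }

    public static void main(String[] args) {
        int n = 100;
        int[] labels = new int[n];
        for (int i = 0; i < n; i++)
            labels[i] = i < n / 2 ? -1 : 1;   // sorted: negatives first

        double w = 0;                          // single bias weight
        int runningCorrect = 0;
        for (int i = 0; i < n; i++) {
            if (predict(w) == labels[i]) runningCorrect++;
            else w += labels[i];               // standard perceptron update
        }

        int finalCorrect = 0;
        for (int i = 0; i < n; i++)
            if (predict(w) == labels[i]) finalCorrect++;

        System.out.println("running accuracy: " + runningCorrect / (double) n);
        System.out.println("final accuracy:   " + finalCorrect / (double) n);
    }
}
```

On this data the running count is 99/100 while the end-of-pass weights score only 50/100, which is exactly the kind of gap reported in this issue.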
> BTW, the diff is difficult to read because you changed many white spaces and
Sorry -- the code was hard to wade through, and reorganizing it helped me
see what was going on. I also got rid of unnecessary code duplication by
defining a variable updateValue that is +1 for the correct label and -1 for
the incorrect labels. That turned this:
for (int oi = 0; oi < numOutcomes; oi++) {
  if (oi == outcomeList[oei]) {
    if (modelDistribution[oi] <= 0) {
      for (int ci = 0; ci < contexts[ei].length; ci++) {
        int pi = contexts[ei][ci];
        if (values == null) {
          params[pi].updateParameter(oi, 1);
        }
        else {
          params[pi].updateParameter(oi, values[ei][ci]);
        }
        if (useAverage) {
          if (updates[pi][oi][VALUE] != 0) {
            averageParams[pi].updateParameter(oi, updates[pi][oi][VALUE]
                * (numEvents * (iteration - updates[pi][oi][ITER]) + (ei - updates[pi][oi][EVENT])));
            //System.err.println("p avp["+pi+"]."+oi+"="+averageParams[pi].getParameters()[oi]);
          }
          //System.err.println("p updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+ei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]);
          updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
          updates[pi][oi][ITER] = iteration;
          updates[pi][oi][EVENT] = ei;
        }
      }
    }
  }
  else {
    if (modelDistribution[oi] > 0) {
      for (int ci = 0; ci < contexts[ei].length; ci++) {
        int pi = contexts[ei][ci];
        if (values == null) {
          params[pi].updateParameter(oi, -1);
        }
        else {
          params[pi].updateParameter(oi, -1 * values[ei][ci]);
        }
        if (useAverage) {
          if (updates[pi][oi][VALUE] != 0) {
            averageParams[pi].updateParameter(oi, updates[pi][oi][VALUE]
                * (numEvents * (iteration - updates[pi][oi][ITER]) + (ei - updates[pi][oi][EVENT])));
            //System.err.println("d avp["+pi+"]."+oi+"="+averageParams[pi].getParameters()[oi]);
          }
          //System.err.println(ei+" d updates["+pi+"]["+oi+"]=("+updates[pi][oi][ITER]+","+updates[pi][oi][EVENT]+","+updates[pi][oi][VALUE]+") + ("+iteration+","+ei+","+params[pi].getParameters()[oi]+") -> "+averageParams[pi].getParameters()[oi]);
          updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
          updates[pi][oi][ITER] = iteration;
          updates[pi][oi][EVENT] = ei;
        }
      }
    }
  }
}
into this:
for (int oi = 0; oi < numOutcomes; oi++) {
  int updateValue = -1;
  if (oi == outcomeList[oei])
    updateValue = 1;
  if (modelDistribution[oi] * updateValue <= 0) {
    for (int ci = 0; ci < contexts[ei].length; ci++) {
      int pi = contexts[ei][ci];
      if (values == null)
        params[pi].updateParameter(oi, updateValue);
      else
        params[pi].updateParameter(oi, updateValue * values[ei][ci]);
      if (useAverage) {
        if (updates[pi][oi][VALUE] != 0)
          averageParams[pi].updateParameter(oi, updates[pi][oi][VALUE]
              * (numEvents * (iteration - updates[pi][oi][ITER]) + (ei - updates[pi][oi][EVENT])));
        updates[pi][oi][VALUE] = (int) params[pi].getParameters()[oi];
        updates[pi][oi][ITER] = iteration;
        updates[pi][oi][EVENT] = ei;
      }
    }
  }
}
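The `updates[pi][oi]` bookkeeping above implements the standard lazy-averaging trick for the averaged perceptron: rather than adding every parameter into the average at every event, the code records the (iteration, event, value) at a parameter's last change and, when it next changes, credits value × number-of-events elapsed. A minimal sketch of why that is equivalent (my own toy, not OpenNLP code; it flattens iterations and events into a single event counter, whereas the code above uses numEvents * iterations + events):

```java
// Hypothetical demonstration that lazy averaging (credit a parameter's
// value times the span of events it stayed unchanged) gives the same
// sum as naively adding the current value at every event.
public class LazyAverageDemo {
    public static void main(String[] args) {
        double[] valueAtEvent = {0, 1, 1, 1, 3, 3};  // parameter value at events 0..5
        int numEvents = valueAtEvent.length;

        // Naive averaging: add the current value at every event.
        double naiveSum = 0;
        for (double v : valueAtEvent) naiveSum += v;

        // Lazy averaging: credit value * elapsed events only on change.
        double lazySum = 0;
        double lastValue = 0;
        int lastEvent = 0;
        for (int ei = 0; ei < numEvents; ei++) {
            if (valueAtEvent[ei] != lastValue) {
                lazySum += lastValue * (ei - lastEvent);  // span at old value
                lastValue = valueAtEvent[ei];
                lastEvent = ei;
            }
        }
        lazySum += lastValue * (numEvents - lastEvent);   // flush final span

        System.out.println(naiveSum == lazySum);
    }
}
```

Both sums come out to 9.0 here; the `if (updates[pi][oi][VALUE] != 0)` guard in the real code just skips crediting spans where the value was zero, which contribute nothing anyway.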
-Jason
--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
> unreliable training set accuracy in perceptron
> ----------------------------------------------
>
> Key: OPENNLP-155
> URL: https://issues.apache.org/jira/browse/OPENNLP-155
> Project: OpenNLP
> Issue Type: Improvement
> Components: Maxent
> Affects Versions: maxent-3.0.1-incubating
> Reporter: Jason Baldridge
> Assignee: Jason Baldridge
> Priority: Minor
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> The training accuracies reported during perceptron training were much higher
> than final training accuracy, which turned out to be an artifact of the way
> training examples were ordered.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira