[ 
https://issues.apache.org/jira/browse/OPENNLP-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044776#comment-13044776
 ] 

Jörn Kottmann commented on OPENNLP-196:
---------------------------------------

First test without the fix:

Got 64691 sequences
Indexing events using cutoff of 5

        Computing event counts...  done. 1422335 events
        Indexing...  done.
Sorting and merging events... Done indexing.
Incorporating indexed data for training...  
done.
        Number of Event Tokens: 1422335
            Number of Outcomes: 45
          Number of Predicates: 103444
Computing model parameters...
Performing 5 iterations.
  1:  . (1338465/1422335) 0.9410335821026692
  2:  . (1362083/1422335) 0.9576386716209613
  3:  . (1368995/1422335) 0.962498286268706
  4:  . (1373167/1422335) 0.9654314911747233
  5:  . (1376065/1422335) 0.9674689858577621
. (1381396/1422335) 0.971217048023145
...done.
Writing pos tagger model ... Compressed 103444 parameters to 74134
22966 outcome patterns
done (4.927s)

Wrote pos tagger model to
path: /Users/joern/dev/opennlp-apache/opennlp/opennlp-tools/en-pos.bin


real    24m35.059s
user    24m31.178s
sys     1m10.894s


Second test with the fix:

Got 64691 sequences
Indexing events using cutoff of 5

        Computing event counts...  done. 1422335 events
        Indexing...  done.
Sorting and merging events... Done indexing.
Incorporating indexed data for training...  
done.
        Number of Event Tokens: 1422335
            Number of Outcomes: 45
          Number of Predicates: 103444
Computing model parameters...
Performing 5 iterations.
  1:  . (1338465/1422335) 0.9410335821026692
  2:  . (1362083/1422335) 0.9576386716209613
  3:  . (1368995/1422335) 0.962498286268706
  4:  . (1373167/1422335) 0.9654314911747233
  5:  . (1376065/1422335) 0.9674689858577621
. (1381396/1422335) 0.971217048023145
...done.
Writing pos tagger model ... Compressed 103444 parameters to 74134
22966 outcome patterns
done (5.564s)

Wrote pos tagger model to
path: /Users/joern/dev/opennlp-apache/opennlp/opennlp-tools/en-pos.bin


real    14m34.409s
user    13m28.532s
sys     0m36.698s
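
Both runs report identical per-iteration accuracy, so the only difference is run time. As a rough check on the two `real` figures above, the sketch below converts them to seconds and computes the ratio, which comes out to roughly 1.69x (about a 41% reduction in wall-clock time). The class and method names are illustrative only and are not part of OpenNLP.

```java
class TimingComparison {

    // Converts a value in the "XmY.YYYs" format printed by `time` to seconds.
    static double toSeconds(String t) {
        int m = t.indexOf('m');
        double minutes = Double.parseDouble(t.substring(0, m));
        // Drop the trailing 's' before parsing the seconds part.
        double seconds = Double.parseDouble(t.substring(m + 1, t.length() - 1));
        return minutes * 60.0 + seconds;
    }

    // Ratio of the run time before the fix to the run time after it.
    static double speedup(String before, String after) {
        return toSeconds(before) / toSeconds(after);
    }

    public static void main(String[] args) {
        // "real" values copied from the two test runs in this comment.
        String without = "24m35.059s";
        String with = "14m34.409s";
        System.out.printf("speedup: %.2fx%n", speedup(without, with));
        System.out.printf("time saved: %.1f s%n",
                toSeconds(without) - toSeconds(with));
    }
}
```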


> POS Tagger Sequence streams calls generateEvents in a loop 
> -----------------------------------------------------------
>
>                 Key: OPENNLP-196
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-196
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: POS Tagger
>    Affects Versions: tools-1.5.1-incubating
>            Reporter: Jörn Kottmann
>            Assignee: Jörn Kottmann
>            Priority: Trivial
>             Fix For: tools-1.5.2-incubating
>
>
> The POS Tagger Sequence Stream class calls generateEvents in a loop, but one 
> call is enough.
> To fix this issue, remove the loop around generateEvents.
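
The pattern described in the issue can be illustrated with a minimal, self-contained sketch. This is not the OpenNLP source: the class, the `generateEvents` stand-in, and the event format are all hypothetical, and the sketch assumes the redundant loop ran once per token. It only shows why a loop around a call that already covers the whole sentence yields the same result at a higher cost.

```java
import java.util.ArrayList;
import java.util.List;

class RedundantLoopSketch {

    // Counts how often the (hypothetical) event generator is invoked.
    static int calls = 0;

    // Hypothetical stand-in: builds one event per token for the whole sentence.
    static List<String> generateEvents(String[] tokens, String[] tags) {
        calls++;
        List<String> events = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            events.add(tags[i] + ":" + tokens[i]);
        }
        return events;
    }

    // Shape of the bug: a per-token loop around a call that already
    // processes every token, so the same work is repeated each iteration.
    static List<String> eventsWithLoop(String[] tokens, String[] tags) {
        List<String> events = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            events = generateEvents(tokens, tags);
        }
        return events;
    }

    // Shape of the fix: a single call produces the same events.
    static List<String> eventsWithoutLoop(String[] tokens, String[] tags) {
        return generateEvents(tokens, tags);
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "dog", "barks"};
        String[] tags = {"DT", "NN", "VBZ"};

        calls = 0;
        List<String> looped = eventsWithLoop(tokens, tags);
        int loopedCalls = calls;

        calls = 0;
        List<String> single = eventsWithoutLoop(tokens, tags);
        int singleCalls = calls;

        System.out.println("same events: " + looped.equals(single));
        System.out.println("calls with loop: " + loopedCalls
                + ", without loop: " + singleCalls);
    }
}
```

In this toy version the redundant work grows with sentence length, which is consistent with the fix changing run time but not the reported accuracy.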

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
