[
https://issues.apache.org/jira/browse/OPENNLP-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154221#comment-13154221
]
Joern Kottmann commented on OPENNLP-397:
----------------------------------------
Does ebooks.txt contain one sentence per line?
In my tests I usually got a throughput of 1000 sentences per second, but I only
have a Core 2 Duo CPU in my MacBook.
It might also depend on the data file; I usually used the English 300K sentence
file from the Leipzig corpus.
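
To make such numbers comparable, here is a minimal sketch of how sentences per
second could be measured on a file with one sentence per line. It is not the
code used for the tests mentioned above. The class name ThroughputSketch and
the Consumer standing in for the tagger are hypothetical; a real run would
pass a callback that tokenizes the line and calls the POS tagger. Note that
the timing includes reading the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

// Hypothetical helper, not part of OpenNLP: times one pass over a
// sentence-per-line file and reports sentences per second.
public class ThroughputSketch {

    /** Result of one timed pass over the input file. */
    public static final class Result {
        public final long sentences;
        public final double seconds;

        Result(long sentences, double seconds) {
            this.sentences = sentences;
            this.seconds = seconds;
        }

        public double sentencesPerSecond() {
            return seconds > 0 ? sentences / seconds : 0.0;
        }
    }

    /**
     * Reads the file line by line, skips blank lines, and hands every
     * remaining line to the callback (assumed to do the actual tagging).
     */
    public static Result measure(Path file, Consumer<String> tagger) throws IOException {
        long count = 0;
        long start = System.nanoTime();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // blank lines are not counted as sentences
                }
                tagger.accept(line);
                count++;
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return new Result(count, seconds);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "ebooks.txt");
        // Placeholder work: whitespace splitting stands in for the tagger.
        Result r = measure(file, line -> line.split("\\s+"));
        System.out.printf("%d sentences in %.2f s (%.0f sentences/s)%n",
                r.sentences, r.seconds, r.sentencesPerSecond());
    }
}
```

Running it once before the timed pass lets the JVM warm up, which tends to
give steadier numbers.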
> IndexHashTable can be improved
> ------------------------------
>
> Key: OPENNLP-397
> URL: https://issues.apache.org/jira/browse/OPENNLP-397
> Project: OpenNLP
> Issue Type: Improvement
> Components: Maxent
> Affects Versions: maxent-3.0.3-incubating
> Reporter: Catalin Mititelu
> Priority: Minor
> Labels: patch
> Attachments: patch-IndexHashTable.txt
>
>
> Running a profiler on POSTagger with a maxent model showed a lot of CPU
> usage in the IndexHashTable class. This class can be optimized to be faster.