Koji Sekiguchi created OPENNLP-1214:
---------------------------------------
Summary: use hash to avoid linear search in
DefaultEndOfSentenceScanner
Key: OPENNLP-1214
URL: https://issues.apache.org/jira/browse/OPENNLP-1214
Project: OpenNLP
Issue Type: Improvement
Affects Versions: 1.9.0
Reporter: Koji Sekiguchi
Fix For: 1.9.1
When DefaultEndOfSentenceScanner scans a sentence, it uses a linear search to
check whether each character in the sentence is one of the eos characters. I
think we'd better use a HashSet to hold eosCharacters instead of a char[].
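
To illustrate the idea, here is a minimal sketch of a hash-based scan. It is
not the actual OpenNLP code: the class name HashEosScannerSketch and its
getPositions method are hypothetical stand-ins, assumed only for this example.
The point is that each character costs one set lookup instead of a loop over
the whole eos array.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not the real DefaultEndOfSentenceScanner.
class HashEosScannerSketch {

    // eos characters held in a hash set so membership checks avoid a linear search
    private final Set<Character> eosCharacters = new HashSet<>();

    HashEosScannerSketch(char[] eosChars) {
        for (char c : eosChars) {
            eosCharacters.add(c);
        }
    }

    // Returns the offsets of every eos character found in the text.
    List<Integer> getPositions(CharSequence text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            // one hash lookup per character of the input
            if (eosCharacters.contains(text.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }
}
```

With a small eos set the gain may be modest, since each lookup boxes the char
into a Character, so this is worth measuring rather than assuming.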
In accordance with this replacement, I'd like to deprecate
getEndOfSentenceCharacters() because it returns char[] and nobody in OpenNLP
calls it at present, and I'd like to add an equivalent method that returns a
Set<Character> of eos chars. A Set cannot keep the order of the eos chars, but
I don't think that will be a problem.
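
A rough sketch of how the two accessors could sit side by side follows. The
class name EosCharsApiSketch and the new method name getEosCharacters are
assumptions made for this example; the issue does not name the new method.
Only getEndOfSentenceCharacters() comes from the existing API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for illustration; not the real OpenNLP class.
class EosCharsApiSketch {

    private final Set<Character> eosCharacters;

    EosCharsApiSketch(char[] eosChars) {
        Set<Character> set = new HashSet<>();
        for (char c : eosChars) {
            set.add(c);
        }
        // expose a read-only view so callers cannot change the scanner's state
        this.eosCharacters = Collections.unmodifiableSet(set);
    }

    // Kept for compatibility; the order of the returned chars is unspecified.
    @Deprecated
    char[] getEndOfSentenceCharacters() {
        char[] result = new char[eosCharacters.size()];
        int i = 0;
        for (char c : eosCharacters) {
            result[i++] = c;
        }
        return result;
    }

    // Assumed name for the new Set-returning accessor.
    Set<Character> getEosCharacters() {
        return eosCharacters;
    }
}
```

If callers ever turn out to depend on the order, a LinkedHashSet would keep
insertion order while still giving hash lookups.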
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)