Koji Sekiguchi created OPENNLP-1214:
---------------------------------------

             Summary: use hash to avoid linear search in DefaultEndOfSentenceScanner
                 Key: OPENNLP-1214
                 URL: https://issues.apache.org/jira/browse/OPENNLP-1214
             Project: OpenNLP
          Issue Type: Improvement
    Affects Versions: 1.9.0
            Reporter: Koji Sekiguchi
             Fix For: 1.9.1


When DefaultEndOfSentenceScanner scans a sentence, it uses a linear search to 
check whether each character in the sentence is one of the eos characters. I 
think we should keep eosCharacters in a HashSet instead of a char[].
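
To illustrate the difference, here is a minimal sketch that is not taken from the OpenNLP source. The class EosScanSketch and its methods are hypothetical names made up for this example. It contrasts a per-character linear search over a char[] with a lookup in a Set<Character>.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the actual DefaultEndOfSentenceScanner code.
final class EosScanSketch {

    // Linear search: each character of the text is compared against
    // every eos character until a match is found.
    static List<Integer> positionsLinear(CharSequence text, char[] eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            for (char eos : eosChars) {
                if (c == eos) {
                    positions.add(i);
                    break;
                }
            }
        }
        return positions;
    }

    // Hash lookup: each character of the text needs a single contains() call,
    // regardless of how many eos characters are configured.
    static List<Integer> positionsHashed(CharSequence text, Set<Character> eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (eosChars.contains(text.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    // Helper that converts the existing char[] representation into a set.
    static Set<Character> toSet(char[] chars) {
        Set<Character> set = new HashSet<>();
        for (char c : chars) {
            set.add(c);
        }
        return set;
    }
}
```

Both methods return the same positions; only the membership test differs. With a handful of eos characters the gain may be small, so the benefit is most visible when the eos set is large.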

In accordance with this replacement, I'd like to deprecate 
getEndOfSentenceCharacters(), because it returns char[] and nothing in 
OpenNLP calls it at present, and add an equivalent method that returns a 
Set<Character> of the eos chars. The set cannot preserve the order of the eos 
chars, but I don't think that will be a problem.
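
A rough sketch of what that API change could look like is below. This is an assumption for discussion, not the committed patch: the class name EosCharsHolderSketch and the new accessor name getEosCharacterSet() are hypothetical, and only getEndOfSentenceCharacters() is named in this issue.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder class; stands in for the scanner in this sketch.
final class EosCharsHolderSketch {

    private final Set<Character> eosCharacters;

    EosCharsHolderSketch(char[] eosChars) {
        Set<Character> set = new HashSet<>();
        for (char c : eosChars) {
            set.add(c);
        }
        // Expose a read-only view so callers cannot modify the scanner's state.
        this.eosCharacters = Collections.unmodifiableSet(set);
    }

    // Old accessor kept for compatibility. The array is rebuilt from the set,
    // so the order of the returned characters is unspecified.
    @Deprecated
    char[] getEndOfSentenceCharacters() {
        char[] result = new char[eosCharacters.size()];
        int i = 0;
        for (char c : eosCharacters) {
            result[i++] = c;
        }
        return result;
    }

    // New accessor (hypothetical name) returning the eos characters as a set.
    Set<Character> getEosCharacterSet() {
        return eosCharacters;
    }
}
```

Callers that relied on the array order would need to sort the result themselves, which is the trade-off mentioned above.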



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
