The issue can be found here: https://issues.apache.org/jira/browse/OPENNLP-702

On Tuesday, June 10, 2014 3:38 AM, Jörn Kottmann <[email protected]> wrote:

Hello,

that looks like a bug. Please open a Jira issue.

Thanks,
Jörn

On 06/09/2014 01:08 AM, Richard Head Jr. wrote:
> Here's my dictionary:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <dictionary case_sensitive="false">
>   <entry>
>     <token>vitamin</token>
>     <token>b12</token>
>   </entry>
>   <entry>
>     <token>vitamin</token>
>     <token>b</token>
>   </entry>
>   <entry>
>     <token>john</token>
>     <token>doe</token>
>   </entry>
>   <entry>
>     <token>john</token>
>     <token>d</token>
>   </entry>
> </dictionary>
>
> When run on this sentence using a DictionaryNameFinder: My name is john doe,
> aka john d. I like vitamin b12.
>
> The following tokens are found: john doe, john d, vitamin b
>
> As you can see, when the 2nd token ends in a number, the longest match is
> discarded.
> Bug, or am I missing something?
>
> Thanks
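
For reference, the longest-match behaviour the report expects can be sketched without OpenNLP. This is a minimal, self-contained Java illustration and not the DictionaryNameFinder implementation. The class name LongestMatchSketch, the method findLongest, and the tokenization of the sentence (with b12 kept as a single token) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of case-insensitive longest-match dictionary lookup.
// It does not use or reproduce any OpenNLP code.
public class LongestMatchSketch {

    // Scans tokens left to right; at each position it tries the longest
    // candidate first and keeps the first one found in the dictionary.
    public static List<String> findLongest(Set<List<String>> entries, String[] tokens) {
        int maxLen = 0;
        for (List<String> entry : entries) {
            maxLen = Math.max(maxLen, entry.size());
        }

        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matched = 0;
            for (int len = Math.min(maxLen, tokens.length - i); len >= 1; len--) {
                List<String> candidate = new ArrayList<>();
                for (int k = i; k < i + len; k++) {
                    // Lower-casing mirrors case_sensitive="false" in the dictionary.
                    candidate.add(tokens[k].toLowerCase(Locale.ROOT));
                }
                if (entries.contains(candidate)) {
                    found.add(String.join(" ", candidate));
                    matched = len;
                    break;
                }
            }
            // Skip past the match, or advance one token if nothing matched.
            i += Math.max(matched, 1);
        }
        return found;
    }

    public static void main(String[] args) {
        // The four entries from the dictionary in the report.
        Set<List<String>> entries = new HashSet<>();
        entries.add(Arrays.asList("vitamin", "b12"));
        entries.add(Arrays.asList("vitamin", "b"));
        entries.add(Arrays.asList("john", "doe"));
        entries.add(Arrays.asList("john", "d"));

        // Assumed tokenization of the sentence from the report.
        String[] tokens = {"My", "name", "is", "john", "doe", ",", "aka",
                           "john", "d", ".", "I", "like", "vitamin", "b12", "."};

        System.out.println(findLongest(entries, tokens));
    }
}
```

With the dictionary and sentence from the report, this prints [john doe, john d, vitamin b12], which is the result the report expects in place of "vitamin b".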
