On 08/02/12 22:48, Jörn Kottmann wrote:
In OpenNLP, the tokenization at training time and at run time
must be identical; otherwise performance degrades.
In your case the data is whitespace-tokenized during training
but tokenized with the English maxent tokenizer at run time.
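To make the mismatch concrete, here is a minimal, self-contained sketch (plain Java, no OpenNLP dependency; the maxent output shown in the comment is illustrative of what OpenNLP's English model typically produces). The point is that the two schemes disagree on split points, not on whitespace inside tokens:

```java
import java.util.Arrays;

public class TokenizationMismatch {
    // Whitespace tokenization: split on runs of whitespace. Tokens never
    // carry leading or trailing spaces; punctuation stays glued to words.
    static String[] whitespaceTokenize(String text) {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String sentence = "Folic acid, 5mg.";
        // Training-time view (whitespace tokenizer):
        System.out.println(Arrays.toString(whitespaceTokenize(sentence)));
        // → [Folic, acid,, 5mg.]
        // A statistical tokenizer such as OpenNLP's English maxent model
        // would instead split punctuation off, roughly:
        //   [Folic, acid, ,, 5mg, .]
        // Neither scheme produces tokens containing spaces; they differ
        // only in where the token boundaries fall.
    }
}
```

If the model was trained on the first tokenization but fed the second at run time (or vice versa), the features seen at prediction time never match those seen at training time, which is why accuracy drops.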
OK, so you mean I should train my own tokenizer that returns
tokens like " Folic " rather than "Folic"? How on earth can I do that? I
did try a week ago to train my own tokenizer, but I got exactly the same
results as the pretrained one! I don't understand how I can make a
tokenizer that includes spaces... tokens must NOT include leading
and trailing spaces, am I right?
Jim