Re: Tokenizer in NameFinderME

Jeff Zemerick Wed, 27 Sep 2017 04:04:08 -0700

Markus,

I believe the WhitespaceTokenizer is used [1].
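
If so, the matching call at prediction time would be WhitespaceTokenizer.INSTANCE.tokenize(sentence). The sketch below is plain Java with no OpenNLP dependency. It only illustrates one way a bare split("\\s+") can differ from a whitespace tokenizer: leading whitespace produces an empty first token. The class name and the tokenize helper are hypothetical, and I am assuming the input contains only ordinary ASCII whitespace.

```java
import java.util.Arrays;

public class WhitespaceSplitDemo {

    // Hypothetical helper (not part of OpenNLP): trims first, then splits on
    // runs of whitespace, so leading whitespace cannot yield an empty token.
    static String[] tokenize(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String sentence = "  Pierre Vinken ,  61 years old";

        // Plain split keeps an empty string at index 0 because the input
        // starts with whitespace.
        System.out.println(Arrays.toString(sentence.split("\\s+")));

        // The trimmed variant returns only the non-empty tokens.
        System.out.println(Arrays.toString(tokenize(sentence)));
    }
}
```

If the covered text of a sentence annotation never starts with whitespace, the two approaches should give the same tokens for such input, but using the same tokenizer as in training avoids having to rely on that.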


Jeff

[1]
https://github.com/apache/opennlp/blob/4362e02ed0404d12ca75ee3476d4a32f9f671811/opennlp-tools/src/main/java/opennlp/tools/namefind/NameSample.java#L220

On Wed, Sep 27, 2017 at 4:13 AM, Markus Kreuzthaler <
markus.kreuztha...@gmail.com> wrote:

> Hello!
>
> Does anyone know which tokenizer is used when training a custom named
> entity recognition model with NameFinderME? I searched but could not
> find this information.
>
> I have to apply the same tokenizer when using the trained model, but I
> don't know which one was used.
>
> So at the moment I simply tokenize via:
> String[] tokens = sentence.getCoveredText().split("\\s+");
>
> Thank you for any feedback!
>
> Kind regards, Markus
>
