Markus, I believe the WhitespaceTokenizer is used [1].
Jeff

[1] https://github.com/apache/opennlp/blob/4362e02ed0404d12ca75ee3476d4a32f9f671811/opennlp-tools/src/main/java/opennlp/tools/namefind/NameSample.java#L220

On Wed, Sep 27, 2017 at 4:13 AM, Markus Kreuzthaler <markus.kreuztha...@gmail.com> wrote:

> Hello!
>
> Does anyone know what tokenizer is used when applying NameFinderME for
> training a custom named entity recognition model? I was searching but I
> could not find this information.
>
> I have to attach the same tokenizer when using the trained model, but I
> don't know which one was used.
>
> Therefore at the moment I just tokenize via:
> String[] tokens = sentence.getCoveredText().split("\\s+");
>
> Thank you for your feedback!
>
> Best regards,
> Markus
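
A note on the split("\\s+") call in the quoted question: it is close to whitespace tokenization, but it yields an empty first token when the text starts with whitespace. The sketch below uses only the JDK to show that difference. The class WhitespaceSplitSketch and its tokenize method are hypothetical names for illustration, not OpenNLP code, and I have not checked that they treat every whitespace character the same way OpenNLP does. In real code it is probably simpler to call WhitespaceTokenizer.INSTANCE.tokenize(text) from opennlp.tools.tokenize, so that training and prediction share one tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of OpenNLP.
final class WhitespaceSplitSketch {

    // Splits on runs of whitespace and never returns empty tokens.
    // Assumption: Character.isWhitespace is an adequate whitespace test here.
    static String[] tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // index where the current token began, or -1 if none
        for (int i = 0; i < text.length(); i++) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            if (!ws && start < 0) {
                start = i; // a new token begins
            } else if (ws && start >= 0) {
                tokens.add(text.substring(start, i)); // the token ends
                start = -1;
            }
        }
        if (start >= 0) {
            tokens.add(text.substring(start)); // trailing token
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String sentence = " Pierre Vinken  joined the board .";
        // split keeps an empty string before the leading whitespace
        System.out.println(sentence.split("\\s+").length);
        // the sketch drops it
        System.out.println(tokenize(sentence).length);
    }
}
```

If the covered text can never begin with whitespace, the two approaches should give the same tokens for ordinary spaces, tabs and newlines.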