Hi OpenNLP-users,

I have one question about the pretrained model for the German sentence
detector.

The documentation says:

"Usually Sentence Detection is done **before** the text is tokenized and
that's the way the pre-trained models on the web site are trained"

So how exactly was the provided German model trained? The TIGER
corpus IS tokenized - so was the TIGER corpus detokenized for training?
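For illustration, I imagine the detokenization step would be something along these lines (just a rough sketch on my part - OpenNLP itself ships a rule-based DictionaryDetokenizer for this, so the real preprocessing may well differ):

```python
import re

def detokenize(tokens):
    """Naively join tokens back into running text, reattaching
    common punctuation to the preceding word. A crude sketch only;
    a proper detokenizer uses per-token attachment rules."""
    text = " ".join(tokens)
    # attach sentence-final and clause punctuation to the previous token
    text = re.sub(r" ([.,;:!?])", r"\1", text)
    return text

# e.g. detokenize(["Das", "ist", "ein", "Satz", "."])
```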

Is there any documentation available so that I can reproduce the
training steps for the pretrained model?

Thanks + regards,

Stefan

