Hi William! I found this issue, which has apparently been fixed: https://issues.apache.org/jira/browse/OPENNLP-602

So when I have a sentence like:

The quick brown fox
jumps over the lazy dog

I will encode my training sentence in one line as:

The quick brown fox <LF> jumps over the lazy dog <LF>

However, I am not sure whether I can omit the space after "dog", i.e. switch to:

The quick brown fox <LF> jumps over the lazy dog<LF>

I will give it a try, or maybe someone can give me a hint which version is correct...

Thank you!

Best regards,
Markus

2017-09-27 17:44 GMT+02:00 William Colen <william.co...@gmail.com>:

> Sentence detector will have a bad time learning from samples without EOS
> (end of sentence) mark. This is common in headlines of articles, for
> example.
> I usually remove from the training/evaluating corpus sentences with no
> clear EOS.
> During runtime, I apply some code to split sentences in new lines if I can
> clearly identify it as a complete headline.
>
> Regards
> William
>
> 2017-09-27 11:10 GMT-03:00 Gary Underwood <gunderw...@clinacuity.com>:
>
> > The sentences for training are in the format of 1 per line so it should be
> > fine as it is (unless you have sentences that also span lines).
> >
> > Gary Underwood
> > gunderw...@clinacuity.com
> >
> > > On Sep 27, 2017, at 6:49 AM, Markus Kreuzthaler <markus.kreuztha...@gmail.com> wrote:
> > >
> > > Hello!
> > >
> > > How do I have to prepare the training data for sentence detection when I
> > > have cases where sentences end just via a new line char, without e.g. a
> > > period character / full stop at the end of the training sentence?
> > >
> > > Is there some special encoding for this case?
> > >
> > > Thank you for your help!
> > >
> > > Best regards,
> > > Markus
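
For reference, the preprocessing suggested in the thread (one sentence per line, and dropping samples with no clear EOS) could be sketched roughly as below. This is only an illustration in plain Python, not OpenNLP code: the helper names and the set of EOS characters are assumptions, and it does not cover the <LF> encoding discussed above.

```python
# Hypothetical helpers; EOS_CHARS is an assumed set, not taken from OpenNLP.
EOS_CHARS = (".", "!", "?")


def has_clear_eos(sentence, eos_chars=EOS_CHARS):
    """Return True if the sentence ends with one of the assumed EOS characters."""
    return sentence.rstrip().endswith(tuple(eos_chars))


def to_training_lines(sentences, eos_chars=EOS_CHARS):
    """Keep sentences with a clear EOS and put each one on a single line."""
    lines = []
    for sentence in sentences:
        # Collapse internal line breaks and runs of whitespace into single spaces.
        one_line = " ".join(sentence.split())
        if one_line and has_clear_eos(one_line, eos_chars):
            lines.append(one_line)
    return lines


if __name__ == "__main__":
    samples = [
        "The quick brown fox\njumps over the lazy dog.",
        "Breaking headline without full stop",
    ]
    # Writes one training sentence per line; the headline is dropped.
    print("\n".join(to_training_lines(samples)))
```

Sentences that end only with a line break are filtered out here rather than encoded, so this follows William's approach and not the <LF> variant.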