Thanks, Jörn. I did update to 1.5.2, but it still makes the same mistake. I worked around the problem by adding an extra space between the lines.
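
For anyone hitting the same issue, here is a minimal sketch of that workaround, assuming the input arrives as one string with newline-separated lines. The class name `LineJoiner` and the method `joinLines` are hypothetical names chosen for illustration and are not part of OpenNLP. The sketch only normalizes the text; the result would then be handed to the sentence detector as a single string.

```java
public class LineJoiner {

    /**
     * Joins the lines of a text with a single space so that a full stop
     * at the end of one line is never directly followed by the first
     * word of the next line. Blank lines are skipped.
     * (Hypothetical helper, not an OpenNLP API.)
     */
    static String joinLines(String text) {
        StringBuilder joined = new StringBuilder();
        // "\\R" matches any line terminator (\n, \r\n, \r, ...), Java 8+.
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty lines
            }
            if (joined.length() > 0) {
                joined.append(' '); // the extra space between lines
            }
            joined.append(trimmed);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // The example from the original report ("\u00e4" is the letter a-umlaut).
        String raw = "Hunden blir hundstj\u00e4rnan, Sirius.\n"
                   + "Artemis skyddade de gravida kvinnorna";
        // The joined string is what would be passed to the sentence detector.
        System.out.println(joinLines(raw));
    }
}
```

Whether this is the intended way to prepare input for the detector is still an open question; it only avoids the "Sirius.Artemis" concatenation in the input.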

Best,
Svetoslav

On 1/4/12 10:49 PM, "Jörn Kottmann" <kottm...@gmail.com> wrote:

>Which OpenNLP version do you use?
>
>We improved the "space" handling in the sentence detector for 1.5.2. If
>you are still on 1.5.1, I suggest that you update.
>
>Jörn
>
>On 1/4/12 12:20 PM, Svetoslav Marinov wrote:
>> Hi all,
>>
>> I have started using OpenNLP with the available Swedish models. One
>>thing I noticed is that the sentence detection model does not perform
>>properly when the full stop is immediately followed by a newline
>>character and the next sentence starts immediately after that. So the
>>following example:
>> ---
>> Hunden blir hundstjärnan, Sirius.
>> Artemis skyddade de gravida kvinnorna
>> ---
>> Will be segmented as:
>>
>> <S>
>> Hunden blir hundstjärnan, Sirius.Artemis
>> </S>
>> <S>
>> skyddade de gravida kvinnorna
>> </S>
>>
>> I am curious whether anyone has experienced similar problems with Swedish
>>or other languages. Any ideas why this happens?
>>
>> I wonder how one can alleviate this behaviour. One way is to train a
>>new model, but I doubt this will help. Or? Another way is to substitute
>>all newline characters with spaces. I do concatenate all lines into a
>>single string, to which I subsequently apply the sentence detection
>>model. Is this the way it should be done (if I read the documentation
>>correctly)?
>>
>> Best regards,
>>
>> Svetoslav
>>
>