OK, I think I have a clue where the problem is. I read my files in Java using a BufferedReader (wrapping an InputStreamReader around a FileInputStream). I read each file line by line and concatenate the lines into one long string. When the BufferedReader encounters a newline or carriage return, that character is not kept in the returned line. So when I concatenate two lines there is no \n or \r between them, and the next sentence starts immediately after the full stop.
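A minimal sketch of that behaviour, not the actual reading code: the class name LineJoinDemo and the method joinLines are made up for illustration, and a StringReader stands in for the InputStreamReader/FileInputStream pair so the example runs without a file. It relies only on the documented fact that BufferedReader.readLine() returns the line without its terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class; names are illustrative, not from the original code.
class LineJoinDemo {

    // Reads all lines and joins them, putting `separator` between consecutive lines.
    // readLine() strips the line terminator (\n, \r, or \r\n), so with an empty
    // separator the end of one line touches the start of the next.
    static String joinLines(Reader in, String separator) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    sb.append(separator);
                }
                sb.append(line);
                first = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader is an assumption standing in for the file stream.
        String text = "Jag tycker om dig.\nMen du tycker inte om mig.";

        // Plain concatenation: the newline is lost and nothing replaces it.
        System.out.println(joinLines(new StringReader(text), ""));

        // Workaround: put a single space between the lines.
        System.out.println(joinLines(new StringReader(text), " "));
    }
}
```

With an empty separator the result has "dig.Men" with nothing after the full stop; with " " as the separator the space is restored.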
In the end, if I have a string like "Jag tycker om dig.Men du tycker inte om mig.", it will be split into two sentences as:

<S> Jag tycker om dig.Men </S> <S> du tycker inte om mig. </S>

So is this a problem with the model? Or with the space handling code?

Best,
Svetoslav

On 1/5/12 12:16 PM, "Jörn Kottmann" <kottm...@gmail.com> wrote:

>On 1/5/12 11:00 AM, Svetoslav Marinov wrote:
>> Thanks Jörn, I did update to 1.5.2 but it still makes the same mistake. I
>> solve the problem by adding an extra white space between the lines.
>>
>
>So you now have a white space and a new line ?
>
>If that helps we might still have a bug in the space handling
>code, because that should not make a difference.
>
>Jörn
>