OK, I think I have a clue where the problem is.
I read my files in Java using a BufferedReader (wrapping an InputStreamReader
over a FileInputStream). I read each file line by line and concatenate the
lines into one long string. When the BufferedReader encounters a newline
or carriage return, that character is not kept in the returned line. So
when I concatenate two lines there is no \n or \r between them, and a new
sentence starts immediately after the full stop, with no whitespace.
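
A minimal sketch of that reading pattern, for illustration only. The class
name LineJoiner and the method join are hypothetical, and a StringReader
stands in for the FileInputStream so the example runs without a file. It
relies only on the documented behaviour of BufferedReader.readLine(), which
returns each line without its line terminator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of OpenNLP.
public class LineJoiner {

    // Reads all lines and joins them with the given separator.
    // readLine() strips \n, \r and \r\n, so an empty separator
    // glues the end of one line directly onto the start of the next.
    static String join(Reader in, String separator) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    sb.append(separator);
                }
                sb.append(line);
                first = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Jag tycker om dig.\nMen du tycker inte om mig.";
        // Empty separator: the full stop is followed directly by "Men".
        System.out.println(join(new StringReader(text), ""));
        // Single space as separator: the whitespace after the full stop is restored.
        System.out.println(join(new StringReader(text), " "));
    }
}
```

Passing " " (or "\n") as the separator puts back the whitespace that
readLine() removed, which is the workaround mentioned in the quoted message
below.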

In the end, if I have a string like "Jag tycker om dig.Men du tycker
inte om mig.", it gets split into two sentences like this:

<S>
Jag tycker om dig.Men
</S>
<S>
du tycker inte om mig.
</S>


So is this a problem with the model? Or the space handling code?

Best,
Svetoslav


On 1/5/12 12:16 PM, "Jörn Kottmann" <kottm...@gmail.com> wrote:

>On 1/5/12 11:00 AM, Svetoslav Marinov wrote:
>> Thanks Jörn, I did update to 1.5.2 but it still makes the same mistake.
>> I solve the problem by adding an extra white space between the lines.
>>
>
>So you now have a white space and a new line?
>
>If that helps we might still have a bug in the space handling
>code, because that should not make a difference.
>
>Jörn
>
