On Wed, Apr 20, 2011 at 10:58 AM, Jens Grivolla <[email protected]> wrote:
> As it turns out, the other system considers CR+LF (Windows style line
> endings) to be two characters, while UIMA sees it as one.

As Jörn suggested, this is probably a bug somewhere in the code that
reads in the text. Perhaps you're using
org.apache.uima.pear.util.FileUtil.loadTextFile? That one is definitely
broken with respect to line endings, and I know it gave us trouble
before. We found that org.apache.uima.util.FileUtils.file2String
actually does the right thing, so you could use that instead. Having
been bitten by this, though, I tend to avoid the UIMA classes for
handling files and use com.google.common.io.Files.toString from the
Guava library instead, which I trust more.
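
To illustrate the kind of mismatch being described, here is a minimal
sketch using only the JDK. The class and method names are made up for
the example, and the "normalizing" reader is only one common way such a
bug can arise; I have not checked that it matches what
FileUtil.loadTextFile actually does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of UIMA or Guava.
class LineEndingCheck {

    // Reads all bytes and decodes them in one go, so every character in
    // the file, including both halves of CR+LF, ends up in the string.
    static String readPreservingLineEndings(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    // Reads line by line and re-joins with "\n". readLine() strips the
    // terminator, so CR+LF silently becomes a single character.
    static String readNormalizingLineEndings(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("crlf", ".txt");
        Files.write(tmp, "one\r\ntwo\r\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readPreservingLineEndings(tmp).length());  // 10
        System.out.println(readNormalizingLineEndings(tmp).length()); // 8
        Files.delete(tmp);
    }
}
```

If the text is loaded the second way, every annotation offset after the
first line break drifts by one character per line relative to a system
that counts CR+LF as two.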

Steve

P.S. Yes, I know I should have filed a bug report. Sorry for not
getting around to it...
-- 
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus
