[
https://issues.apache.org/jira/browse/UIMA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409428#comment-13409428
]
Peter Klügl commented on UIMA-2359:
-----------------------------------
Is there a generic solution for this problem? I would not restrict the
functionality to either of the two cases. Should the lexer create only one
break? In my applications, I solved this at the rule level, but I am open to
any suggestions and improvements.
> Different results of TextMarker on Windows and Unix
> ---------------------------------------------------
>
> Key: UIMA-2359
> URL: https://issues.apache.org/jira/browse/UIMA-2359
> Project: UIMA
> Issue Type: Bug
> Components: Sandbox, TextMarker
> Affects Versions: build-resources-2
> Environment: Windows
> Reporter: Luca Dini (CELI)
> Assignee: Peter Klügl
> Priority: Minor
> Labels: patch
>
> The class AbstractApplyScriptHandlerJob, when called from the workbench, reads
> the text to be analyzed with the method:
> org.apache.uima.pear.util.FileUtil.loadTextFile(new File(each), "UTF-8");
> On Windows, this method returns each newline as two new lines. As a result,
> the basic TextMarker annotations appear as:
> line BREAK BREAK
> line BREAK BREAK
> Grammars written on Windows must therefore take the double break into
> account, which makes them inapplicable when running on Unix or when using
> other read methods, such as:
> Scanner sc = new Scanner(inFile, "UTF-8");
> String out = "";
> while (sc.hasNextLine()) {
> out += sc.nextLine() + "\n";
> }
> Relates to:
> https://issues.apache.org/jira/browse/UIMA-2133t
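
One possible generic approach, sketched here under the assumption that the
double BREAK comes from a Windows "\r\n" sequence being tokenized as two
separate breaks, is to normalize line endings before the text reaches the
lexer. The class and method names below are hypothetical and are not part of
UIMA or TextMarker; this only illustrates the idea and is not the fix applied
to this issue.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of UIMA or TextMarker.
public class LineEndingNormalizer {

    // Matches a Windows line ending (\r\n) or a lone carriage return (\r).
    private static final Pattern LINE_BREAK = Pattern.compile("\\r\\n|\\r");

    // Rewrites every line ending as a single \n so that the same grammar
    // sees the same text regardless of platform or read method.
    public static String normalize(String text) {
        return LINE_BREAK.matcher(text).replaceAll("\n");
    }

    public static void main(String[] args) {
        String windowsText = "line\r\nline\r\n";
        String unixText = "line\nline\n";
        // After normalization both inputs should be identical.
        System.out.println(normalize(windowsText).equals(normalize(unixText)));
    }
}
```

Applying such a step to the result of loadTextFile would leave rules that
expect a single BREAK per line unchanged, at the cost of shifting character
offsets relative to the original file on Windows.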