[ 
https://issues.apache.org/jira/browse/UIMA-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Klügl updated UIMA-3512:
------------------------------

    Affects Version/s:     (was: 2.1.1ruta)
                       2.1.0ruta

> Add additional engine parameter for Ruta HtmlConverter to configure linebreak 
> replacement.
> ------------------------------------------------------------------------------------------
>
>                 Key: UIMA-3512
>                 URL: https://issues.apache.org/jira/browse/UIMA-3512
>             Project: UIMA
>          Issue Type: Improvement
>          Components: ruta
>    Affects Versions: 2.1.0ruta
>            Reporter: Philip-Daniel Beck
>            Assignee: Peter Klügl
>             Fix For: 2.1.1ruta
>
>         Attachments: linebreakReplacementEngineParameter.core_patch, 
> linebreakReplacementEngineParameter.docbook_patch
>
>
> When converting an HTML file to plain text with the HtmlConverter engine in
> Ruta, the boolean engine parameter "replaceLinebreaks" decides whether
> linebreaks in the text are replaced. If set to true, all linebreaks are
> kept in the document. If set to false, all linebreaks are deleted. As a
> result, the last word of a line and the first word of the next line are
> joined without whitespace in between. It would often be better to replace
> each linebreak with a whitespace character. To configure this, a further
> engine parameter that defines the String each linebreak is replaced with
> would be useful.
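
The difference between deleting linebreaks and replacing them with a configurable String can be illustrated with a minimal, self-contained sketch. This is not the HtmlConverter code and not the attached patch; the class name, the method name and the "replacement" argument are hypothetical and only stand in for the requested parameter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT the HtmlConverter implementation.
// All names here are hypothetical.
final class LinebreakReplacementSketch {

    // Matches Windows (\r\n), Unix (\n) and old Mac (\r) line endings.
    private static final Pattern LINEBREAK = Pattern.compile("\\r\\n|\\n|\\r");

    // Replaces every linebreak in text with the given replacement String.
    // An empty replacement deletes the linebreaks.
    static String replaceLinebreaks(String text, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return LINEBREAK.matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";
        // Deleting linebreaks joins the neighbouring words.
        System.out.println(replaceLinebreaks(text, ""));  // first linesecond line
        // Replacing with a single space keeps the words apart.
        System.out.println(replaceLinebreaks(text, " ")); // first line second line
    }
}
```

With an empty replacement the sketch reproduces the word-joining problem described above; a single space avoids it.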



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
