[ 
https://issues.apache.org/jira/browse/TIKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-2019:
------------------------------
    Description: The xml generated by these parsers was good, but when using 
the ToTextHandler, spaces/tabs were not added correctly.  This leads to 
incorrectly concatenated strings.  Further, because we are extending the 
XMLParser, while the metadata is extracted, it isn't well represented in the xml.  
(was: The xml generated by these parsers was good, but when using the 
ToTextHandler, spaces/tabs were not added correctly.  This leads to incorrectly 
concatenated strings.)

> WordMLParser and SpreadsheetMLParser incorrectly concatenate tokens with 
> ToTextHandler
> --------------------------------------------------------------------------------------
>
>                 Key: TIKA-2019
>                 URL: https://issues.apache.org/jira/browse/TIKA-2019
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Tim Allison
>
> The xml generated by these parsers was good, but when using the 
> ToTextHandler, spaces/tabs were not added correctly.  This leads to 
> incorrectly concatenated strings.  Further, because we are extending the 
> XMLParser, while the metadata is extracted, it isn't well represented in the xml.
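
For illustration only, here is a minimal sketch of the general failure mode described above. It is not Tika code. It uses Python's standard xml.sax module, and the element names "row", "cell" and "p", both handler classes, and the extract helper are hypothetical. A text handler that only collects character events runs adjacent values together. One that emits a separator when an element closes keeps them apart.

```python
import xml.sax


class NaiveTextHandler(xml.sax.ContentHandler):
    """Collects character data only; element boundaries contribute nothing."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        self.parts.append(content)

    def text(self):
        return "".join(self.parts)


class SpacingTextHandler(NaiveTextHandler):
    """Same as above, but appends a separator when certain elements close."""

    # Hypothetical element names, chosen only for this sketch.
    SEPARATORS = {"cell": "\t", "p": "\n"}

    def endElement(self, name):
        separator = self.SEPARATORS.get(name)
        if separator is not None:
            self.parts.append(separator)


def extract(xml_text, handler):
    """Parse xml_text with the given handler and return the collected text."""
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.text()


SAMPLE = "<row><cell>alpha</cell><cell>beta</cell></row>"

print(repr(extract(SAMPLE, NaiveTextHandler())))
print(repr(extract(SAMPLE, SpacingTextHandler())))
```

The first call returns the two cell values fused into a single token, which is the symptom reported here. The second keeps them tab-separated. Whether the actual fix belongs in the parsers or in the handler is not stated in this message.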



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
