[
https://issues.apache.org/jira/browse/TIKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2019:
------------------------------
Description: The xml generated by these parsers was good, but when using
the ToTextHandler, spaces/tabs were not added correctly. This leads to
incorrectly concatenated strings. Further, because we are extending the
XMLParser, while the metadata is extracted, it isn't well represented in the xml.
(was: The xml generated by these parsers was good, but when using the
ToTextHandler, spaces/tabs were not added correctly. This leads to incorrectly
concatenated strings.)
> WordMLParser and SpreadsheetMLParser incorrectly concatenate tokens with
> ToTextHandler
> --------------------------------------------------------------------------------------
>
> Key: TIKA-2019
> URL: https://issues.apache.org/jira/browse/TIKA-2019
> Project: Tika
> Issue Type: Bug
> Reporter: Tim Allison
>
> The xml generated by these parsers was good, but when using the
> ToTextHandler, spaces/tabs were not added correctly. This leads to
> incorrectly concatenated strings. Further, because we are extending the
> XMLParser, while the metadata is extracted, it isn't well represented in the xml.
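
The concatenation symptom described above can be illustrated with a minimal sketch that uses only the JDK's SAX API. It does not reproduce Tika's ToTextHandler, WordMLParser, or SpreadsheetMLParser: the `TextOnlyHandler` class, the `extract` helper, and the `<row>`/`<cell>` element names are hypothetical stand-ins. The assumption is that a text-only handler keeps character data and drops markup, so adjacent values run together unless whitespace is emitted between them.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TextOnlyDemo {

    // Hypothetical stand-in for a text-only handler:
    // it keeps character data and ignores every element boundary.
    static class TextOnlyHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Parses the given XML string and returns only its character content.
    static String extract(String xml) throws Exception {
        TextOnlyHandler handler = new TextOnlyHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // No whitespace between the cells: the two tokens are fused.
        String tight = "<row><cell>alpha</cell><cell>beta</cell></row>";
        // A tab between the cells: the tokens stay separable.
        String spaced = "<row><cell>alpha</cell>\t<cell>beta</cell></row>";

        System.out.println(extract(tight));
        System.out.println(extract(spaced).replace("\t", "\\t"));
    }
}
```

In this sketch the XML is well formed in both cases, which matches the report that the generated xml "was good"; only the text-only view differs, depending on whether a separator reaches the handler as character data.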