[
https://issues.apache.org/jira/browse/TIKA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935840#comment-14935840
]
Andreas Beeker commented on TIKA-1748:
--------------------------------------
In HSLFExtractor:
- As the extraction is specific to HSLF, I would use the HSLF classes rather
than the common sl classes.
- In my patch (TIKA-1707), the method textRunsToText wraps paragraphs in divs.
I don't know how the post-processing of the output works, but I wanted to
point out that you are not dealing with just a flat list of text runs, but
with separate paragraphs, each containing its own text runs.
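To illustrate the paragraph/run nesting described in the point above, here is a
minimal Java sketch. It deliberately does not use the POI HSLF API: the class
ParagraphDivSketch and the methods escape and paragraphsToXhtml are
hypothetical names, and paragraphs are modelled as plain lists of strings. It
only assumes that each paragraph becomes one div and that the runs of a
paragraph are concatenated inside it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in model: each inner list is one paragraph and each
// string is one text run. These are NOT the POI HSLF classes.
class ParagraphDivSketch {

    // Minimal escaping so that run text cannot break the markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Emits one <div> per paragraph; runs are concatenated inside their paragraph.
    static String paragraphsToXhtml(List<List<String>> paragraphs) {
        StringBuilder out = new StringBuilder();
        for (List<String> runs : paragraphs) {
            out.append("<div>");
            for (String run : runs) {
                out.append(escape(run));
            }
            out.append("</div>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<List<String>> slideText = Arrays.asList(
                Arrays.asList("Hello, ", "world"),
                Arrays.asList("Second paragraph"));
        // prints <div>Hello, world</div><div>Second paragraph</div>
        System.out.println(paragraphsToXhtml(slideText));
    }
}
```

Flattening the same input into a single list of runs would lose the boundary
between "Hello, world" and "Second paragraph", which is why the paragraph level
matters for the output.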
In PowerPointParserTest:
- The paragraph handling above leads to a change in the tests.
In XSLFPowerPointExtractorDecorator:
- Use the XSLF classes instead of the common sl classes.
- The new extractTable method is better.
Andi.
> Upgrade to POI 3.13-final when available
> ----------------------------------------
>
> Key: TIKA-1748
> URL: https://issues.apache.org/jira/browse/TIKA-1748
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Minor
> Attachments: TIKA-1748.patch
>
>
> Upgrade.