Nick Burch created TIKA-946:
-------------------------------
Summary: Improve how the PPTX parser uses XSLF from POI
Key: TIKA-946
URL: https://issues.apache.org/jira/browse/TIKA-946
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.2
Reporter: Nick Burch
One last item left over from TIKA-757 and TIKA-805: the way PPTX files are
currently parsed using XSLF from Apache POI still relies on a couple of
low-level parts.
We should avoid the need to go from the usermodel XMLSlideShow to the low level
XSLFSlideShow to do the text extraction (occurs in
XSLFPowerPointExtractorDecorator).
We should also update the usermodel slide support to extract the slide
names from docProps/app.xml, so that these can easily be included in the text
output (in XSLFPowerPointExtractor).
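
To illustrate where the slide names live, here is a minimal sketch that reads
the TitlesOfParts entries from docProps/app.xml using only the JDK, not POI.
The class and method names (AppXmlTitles, titlesOfParts, fromPptx) are
hypothetical and are not part of Tika or POI. Note that TitlesOfParts normally
also lists other part names (such as fonts and themes), with the counts given
in HeadingPairs; this sketch does not filter those out.

```java
import java.io.File;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only (not a Tika or POI class).
class AppXmlTitles {
    // Namespaces used by the OOXML extended properties part (docProps/app.xml).
    static final String EXT_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties";
    static final String VT_NS =
        "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes";

    // Returns every vt:lpstr value found under TitlesOfParts, in document order.
    static List<String> titlesOfParts(InputStream appXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(appXml);

            List<String> titles = new ArrayList<>();
            NodeList parts = doc.getElementsByTagNameNS(EXT_NS, "TitlesOfParts");
            if (parts.getLength() == 0) {
                return titles; // no TitlesOfParts element present
            }
            NodeList strings =
                ((Element) parts.item(0)).getElementsByTagNameNS(VT_NS, "lpstr");
            for (int i = 0; i < strings.getLength(); i++) {
                titles.add(strings.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse app.xml", e);
        }
    }

    // Opens a .pptx as a zip archive and reads docProps/app.xml from it.
    static List<String> fromPptx(File pptx) {
        try (ZipFile zip = new ZipFile(pptx)) {
            ZipEntry entry = zip.getEntry("docProps/app.xml");
            if (entry == null) {
                return new ArrayList<>(); // part is optional
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return titlesOfParts(in);
            }
        } catch (java.io.IOException e) {
            throw new IllegalStateException("Could not read " + pptx, e);
        }
    }
}
```

In the parser itself this lookup would presumably go through the usermodel
rather than raw zip access; the sketch only shows the shape of the data that
the slide support would need to expose.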