Seva Alekseyev created TIKA-2157:
------------------------------------
Summary: HSLFException on a valid PowerPoint file
Key: TIKA-2157
URL: https://issues.apache.org/jira/browse/TIKA-2157
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.13
Environment: Windows 7 x64, JVM 1.8.0_101
Reporter: Seva Alekseyev
Attachments: CRADA 2-09 K Subbarao.ppt
The attached PowerPoint file opens fine in PowerPoint, but the Tika parser
throws the following error when parsing it:
org.apache.poi.hslf.exceptions.HSLFException: java.util.zip.ZipException: incorrect data check
    at org.apache.poi.hslf.blip.PICT.getData(PICT.java:120)
    at org.apache.tika.parser.microsoft.HSLFExtractor.handleSlideEmbeddedPictures(HSLFExtractor.java:324)
    at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:193)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
Caused by: java.util.zip.ZipException: incorrect data check
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.poi.hslf.blip.PICT.read(PICT.java:133)
    at org.apache.poi.hslf.blip.PICT.getData(PICT.java:116)
    ... 6 more
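
For context, the innermost frames show the exception coming from java.util.zip.InflaterInputStream while POI reads the embedded PICT data. The sketch below is not Tika or POI code and makes no claim about what is actually wrong with the attached file. It only shows, using the JDK alone, one way the same "incorrect data check" message can arise: a zlib stream whose trailing Adler-32 checksum does not match the inflated data. The class name IncorrectDataCheckDemo and its helper methods are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;
import java.util.zip.ZipException;

// Hypothetical demo class; not part of Tika or POI.
public class IncorrectDataCheckDemo {

    // Compress bytes into a zlib stream: header, deflate data, Adler-32 trailer.
    static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(raw);
        }
        return out.toByteArray();
    }

    // Inflate the whole stream. Returns null on success,
    // or the ZipException message if inflation fails.
    static String inflateError(byte[] compressed) throws IOException {
        try (InputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // discard the inflated bytes; only the outcome matters here
            }
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] good = deflate(
            "example picture payload".getBytes(StandardCharsets.UTF_8));

        // Flip one bit in the last byte, which belongs to the Adler-32 trailer.
        byte[] bad = good.clone();
        bad[bad.length - 1] ^= 0x01;

        System.out.println("intact stream:    " + inflateError(good));
        System.out.println("corrupted stream: " + inflateError(bad));
    }
}
```

A caller that wants to keep extracting slide text when a single embedded picture fails to inflate could catch the exception around the picture read, in the same way inflateError does here. Whether that is the right behaviour for HSLFExtractor is a design decision for the Tika and POI maintainers.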