Seva Alekseyev created TIKA-2162:
------------------------------------

             Summary: "Unknown compression method" on a Powerpoint file
                 Key: TIKA-2162
                 URL: https://issues.apache.org/jira/browse/TIKA-2162
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.13
         Environment: Windows 7 x64, JVM 1.8.0_101
            Reporter: Seva Alekseyev
         Attachments: DECAY.ppt

On the attached Powerpoint file, which opens fine with Powerpoint, the Tika parser throws the following error:

org.apache.poi.hslf.exceptions.HSLFException: java.util.zip.ZipException: unknown compression method
	at org.apache.poi.hslf.blip.EMF.getData(EMF.java:91)
	at org.apache.tika.parser.microsoft.HSLFExtractor.handleSlideEmbeddedPictures(HSLFExtractor.java:324)
	at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:193)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
Caused by: java.util.zip.ZipException: unknown compression method
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at org.apache.poi.hslf.blip.EMF.getData(EMF.java:85)
	... 6 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
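
For anyone triaging the trace above: the innermost frame is java.util.zip.InflaterInputStream, so the message comes from the JDK's zlib layer rather than from Tika or POI directly. The sketch below is not a reproduction of the DECAY.ppt failure and makes no claim about what the picture data in that file actually contains. It only shows, with the JDK alone, one way to get the same message: a two-byte zlib header that passes the checksum test but declares a compression method other than deflate. The class name InflateProbe, the helper inflateError, and the byte values are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.InflaterInputStream;
import java.util.zip.ZipException;

// Hypothetical helper class, not part of Tika or POI.
class InflateProbe {

    /**
     * Tries to inflate the given bytes as a zlib stream.
     * Returns the ZipException message if inflation fails with a format
     * error, or null if the whole stream inflates cleanly.
     */
    static String inflateError(byte[] data) throws IOException {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Discard the output; only the failure mode matters here.
            }
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative header only (an assumption, not taken from DECAY.ppt):
        // 0x77 0x09 satisfies the zlib header check (0x7709 is a multiple of 31),
        // but the low nibble of the first byte is 7, while deflate is method 8.
        byte[] notDeflate = {0x77, 0x09};
        System.out.println(inflateError(notDeflate));
    }
}
```

If the bytes POI hands to the inflater in EMF.getData do not start with a valid zlib header, this is the kind of failure one would expect, but that remains a guess until the attached file is inspected.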