Jiri Banszel created TIKA-1827:
----------------------------------
Summary: Error is printed on stderr when parsing some ppt files
Key: TIKA-1827
URL: https://issues.apache.org/jira/browse/TIKA-1827
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.11
Reporter: Jiri Banszel
Priority: Minor
We encountered this problem while using Tika 1.11. Because the root cause
appears to be in POI, I have also reported it to the POI tracker, where more
details are available: https://bz.apache.org/bugzilla/show_bug.cgi?id=58822
java.lang.ArrayIndexOutOfBoundsException: 110
at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:224)
at org.apache.poi.hslf.model.textproperties.TabStopPropCollection.parseProperty(TabStopPropCollection.java:100)
at org.apache.poi.hslf.model.textproperties.TextPropCollection.buildTextPropList(TextPropCollection.java:224)
at org.apache.poi.hslf.record.TxMasterStyleAtom.init(TxMasterStyleAtom.java:157)
at org.apache.poi.hslf.record.TxMasterStyleAtom.<init>(TxMasterStyleAtom.java:67)
at sun.reflect.GeneratedConstructorAccessor498.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
at org.apache.poi.hslf.record.Environment.<init>(Environment.java:54)
at sun.reflect.GeneratedConstructorAccessor690.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
at org.apache.poi.hslf.record.Document.<init>(Document.java:122)
at sun.reflect.GeneratedConstructorAccessor688.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
at org.apache.poi.hslf.record.Record.buildRecordAtOffset(Record.java:103)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.read(HSLFSlideShowImpl.java:286)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.buildRecords(HSLFSlideShowImpl.java:267)
at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.<init>(HSLFSlideShowImpl.java:178)
at org.apache.poi.hslf.usermodel.HSLFSlideShow.<init>(HSLFSlideShow.java:171)
at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:61)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219)
at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:136)
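
For illustration only, here is a minimal sketch of the kind of failure the top
frame of the trace suggests: a two-byte little-endian read at an offset that
is not fully inside the buffer. This is not POI's code. The class
LittleEndianSketch, both method names, and the 110-byte buffer length are
hypothetical and chosen only to match the index in the exception message; the
actual buffer size in the failing file is not known from this report.

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of POI or Tika.
final class LittleEndianSketch {

    // Unchecked read: throws ArrayIndexOutOfBoundsException when either of
    // the two bytes lies outside the array.
    static short readShortUnchecked(byte[] data, int offset) {
        int low = data[offset] & 0xFF;
        int high = data[offset + 1] & 0xFF;
        return (short) ((high << 8) | low);
    }

    // Checked read: returns an empty result instead of throwing when the
    // two bytes starting at offset are not fully inside the array.
    static OptionalInt readShortChecked(byte[] data, int offset) {
        if (offset < 0 || offset > data.length - 2) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(readShortUnchecked(data, offset));
    }

    public static void main(String[] args) {
        byte[] record = new byte[110]; // assumed length, see note above
        record[108] = 0x34;
        record[109] = 0x12;

        // Last valid position for a two-byte read.
        System.out.println(readShortChecked(record, 108).isPresent());
        // One past the end: the checked variant reports "no value".
        System.out.println(readShortChecked(record, 110).isPresent());

        try {
            readShortUnchecked(record, 110);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked read failed: " + e.getMessage());
        }
    }
}
```

Whether a bounds check like this, or a different fix in the record parsing, is
appropriate is for the POI issue linked above to decide.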
--
This message was sent by Atlassian JIRA (v6.3.4#6332)