Matthieu Neamar created TIKA-1407:
-------------------------------------

             Summary: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
                 Key: TIKA-1407
                 URL: https://issues.apache.org/jira/browse/TIKA-1407
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.5
         Environment: Kubuntu 14.04
            Reporter: Matthieu Neamar


I'm trying to parse a document created with PowerPoint for Mac.
This crashes Tika. Interestingly, I can open the file with LibreOffice. If I 
re-save it there in the same format, it loses a few kilobytes and Tika parses it.
The failing file is at 
http://amoki.fr/anyFetch_pitch_deck_Allianz_EN_withoutslide9.ppt

I get the following error using Tika 1.5:

```
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected 
RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
        at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5000 on class class 
org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5002 on class class 
org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at 
org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:347)
        at 
org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:319)
        at 
org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:596)
        at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:443)
        at 
org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
        at 
org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
        at 
org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:62)
        at 
org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:202)
        at 
org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
        at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        ... 5 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        ... 16 more
Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5002 on class class 
org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at 
org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
        ... 21 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        ... 23 more
Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at 
org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
        ... 28 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        ... 30 more
Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type 
with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : 
java.lang.reflect.InvocationTargetException
Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at 
org.apache.poi.hslf.record.BinaryTagDataBlob.<init>(BinaryTagDataBlob.java:52)
        ... 35 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
        at 
org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        ... 37 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 20
        at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:161)
        at 
org.apache.poi.hslf.record.StyleTextProp9Atom.<init>(StyleTextProp9Atom.java:70)
        ... 42 more
```
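
The innermost frames of the trace are `LittleEndian.getInt` called from the `StyleTextProp9Atom` constructor, failing with `ArrayIndexOutOfBoundsException: 20`. One plausible reading is that a 4-byte little-endian read ran past the end of the record data. The sketch below illustrates only that failure mode with the standard library. `LittleEndianSketch`, `getInt` and `getIntChecked` are hypothetical names for this example, not POI's actual code, and the 20-byte record length is an assumption inferred from the exception message.

```java
public class LittleEndianSketch {

    // Hypothetical stand-in for an unchecked little-endian 4-byte read.
    // It throws ArrayIndexOutOfBoundsException if offset + 3 is past the end.
    static int getInt(byte[] data, int offset) {
        int b0 = data[offset] & 0xFF;
        int b1 = data[offset + 1] & 0xFF;
        int b2 = data[offset + 2] & 0xFF;
        int b3 = data[offset + 3] & 0xFF;
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    // Bounds-checked variant: rejects a truncated record with a clearer error
    // instead of letting the raw index exception escape.
    static int getIntChecked(byte[] data, int offset) {
        if (offset < 0 || offset > data.length - 4) {
            throw new IllegalArgumentException(
                "Need 4 bytes at offset " + offset + " but record has " + data.length);
        }
        return getInt(data, offset);
    }

    public static void main(String[] args) {
        byte[] record = new byte[20]; // assumed record length, for illustration only
        record[16] = 1;

        // The last valid 4-byte read starts at offset 16.
        System.out.println("value at 16: " + getInt(record, 16));

        // A read starting at offset 20 touches index 20, which does not exist.
        try {
            getInt(record, 20);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked read failed: " + e.getMessage());
        }

        try {
            getIntChecked(record, 20);
        } catch (IllegalArgumentException e) {
            System.out.println("checked read failed: " + e.getMessage());
        }
    }
}
```

If this reading is right, the record in the failing file is shorter than the parser expects, which would also fit the observation that re-saving from LibreOffice changes the file size and avoids the crash.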



