Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@1a8402c
--------------------------------------------------------------------------------------

                 Key: TIKA-685
                 URL: https://issues.apache.org/jira/browse/TIKA-685
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.9
         Environment: MS Windows XP Professional Version 2002 Service Pack 3
            Reporter: Jaroslaw Krzeminski


A runtime error occurs when parsing an MS Word document, either with the Apache
Tika GUI app or from a program snippet like:

InputStream inputStream = new FileInputStream(docFile);
ContentHandler contentHandler = new BodyContentHandler(new BufferedWriter(new FileWriter(textFile)));
Metadata metadata = new Metadata();
AutoDetectParser parser = new AutoDetectParser();
parser.parse(inputStream, contentHandler, metadata);

Error from Tika App Errors panel:

org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@1a8402c
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        at org.apache.tika.gui.TikaGUI.importStream(TikaGUI.java:186)
        at org.apache.tika.gui.ParsingTransferHandler.importData(ParsingTransferHandler.java:99)
        at javax.swing.TransferHandler.importData(Unknown Source)
        at javax.swing.TransferHandler$DropHandler.drop(Unknown Source)
        at java.awt.dnd.DropTarget.drop(Unknown Source)
        at javax.swing.TransferHandler$SwingDropTarget.drop(Unknown Source)
        at sun.awt.dnd.SunDropTargetContextPeer.processDropMessage(Unknown Source)
        at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchDropEvent(Unknown Source)
        at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchEvent(Unknown Source)
        at sun.awt.dnd.SunDropTargetEvent.dispatch(Unknown Source)
        at java.awt.Component.dispatchEventImpl(Unknown Source)
        at java.awt.Container.dispatchEventImpl(Unknown Source)
        at java.awt.Component.dispatchEvent(Unknown Source)
        at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
        at java.awt.LightweightDispatcher.processDropTargetEvent(Unknown Source)
        at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
        at java.awt.Container.dispatchEventImpl(Unknown Source)
        at java.awt.Window.dispatchEventImpl(Unknown Source)
        at java.awt.Component.dispatchEvent(Unknown Source)
        at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
        at java.awt.EventQueue.access$000(Unknown Source)
        at java.awt.EventQueue$1.run(Unknown Source)
        at java.awt.EventQueue$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.AccessControlContext$1.doIntersectionPrivilege(Unknown Source)
        at java.security.AccessControlContext$1.doIntersectionPrivilege(Unknown Source)
        at java.awt.EventQueue$2.run(Unknown Source)
        at java.awt.EventQueue$2.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.AccessControlContext$1.doIntersectionPrivilege(Unknown Source)
        at java.awt.EventQueue.dispatchEvent(Unknown Source)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
        at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
        at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
        at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
        at java.awt.EventDispatchThread.run(Unknown Source)
Caused by: java.lang.NullPointerException
        at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:39)
        at org.apache.poi.hwpf.model.CHPX.getCharacterProperties(CHPX.java:61)
        at org.apache.poi.hwpf.usermodel.CharacterRun.<init>(CharacterRun.java:98)
        at org.apache.poi.hwpf.usermodel.Range.getCharacterRun(Range.java:797)
        at org.apache.poi.hwpf.model.PicturesTable.getAllPictures(PicturesTable.java:191)
        at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:430)
        at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:420)
        at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:75)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:182)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        ... 39 more
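
The trace above shows the TikaException wrapping a NullPointerException raised
inside POI. A caller that wants to log or branch on the underlying failure
could walk the cause chain, roughly as in the sketch below. The RootCause class
and its of() method are hypothetical names used only for illustration, and a
plain Exception stands in for TikaException so that the sketch needs no Tika
dependency; it is not part of Tika or POI.

```java
// Hypothetical helper, not part of Tika or POI.
class RootCause {

    /** Follows getCause() links until the innermost throwable is reached. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in for the wrapped failure reported in this issue;
        // a plain Exception is used in place of TikaException (assumption).
        Exception wrapped = new Exception(
                "Unexpected RuntimeException from parser",
                new NullPointerException());

        Throwable root = of(wrapped);
        System.out.println(root.getClass().getName());
    }
}
```

With a real TikaException the same call would be RootCause.of(e) inside the
catch block around parser.parse(...).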

 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
