Borja Serrano created TIKA-2904:
-----------------------------------

             Summary: Error parsing a Word document with a WMF image
                 Key: TIKA-2904
                 URL: https://issues.apache.org/jira/browse/TIKA-2904
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.21
            Reporter: Borja Serrano


If you parse a Word document that embeds a WMF image while depending on the
newest version of Apache POI (4.1.0, which is marked as compatible), you get a
NoSuchMethodError:
{code:java}
2019-07-11 11:06:59 com.penman.web.configuration.CustomAsyncExceptionHandler [ERROR] Exception in async task message - org.apache.poi.hwmf.record.HwmfRecord.getRecordType()Lorg/apache/poi/hwmf/record/HwmfRecordType;
java.lang.NoSuchMethodError: org.apache.poi.hwmf.record.HwmfRecord.getRecordType()Lorg/apache/poi/hwmf/record/HwmfRecordType;
    at org.apache.tika.parser.microsoft.WMFParser.parse(WMFParser.java:72) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedFile(AbstractOOXMLExtractor.java:391) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedPart(AbstractOOXMLExtractor.java:264) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:206) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:139) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:201) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:110) ~[tika-parsers-1.21.jar:1.21]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.21.jar:1.21]
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) ~[tika-core-1.21.jar:1.21]

{code}
The problem comes from an API change in Apache POI. As of 4.1.0 the method
HwmfRecord.getRecordType() is no longer available, and WMFParser needs to call
getWmfRecordType() instead (the change was discussed in
[http://apache-poi.1045710.n5.nabble.com/VOTE-Apache-POI-4-1-0-release-RC3-td5733174.html]).
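
To confirm which variant of the method the POI jar on your classpath actually exposes, a small reflection probe can help. The following is only a diagnostic sketch, not part of Tika or POI: the class name PoiApiProbe and the helper hasNoArgMethod are hypothetical, and the only POI names it uses are those that appear in the stack trace and the description above. It looks for a public no-argument method with a given return type, which is the same signature the JVM failed to resolve in the NoSuchMethodError.

```java
import java.lang.reflect.Method;

// Hypothetical diagnostic helper; not part of Tika or POI.
public class PoiApiProbe {

    // Returns true if className can be loaded and exposes a public no-arg
    // method called methodName whose return type is named returnTypeName.
    static boolean hasNoArgMethod(String className, String methodName, String returnTypeName) {
        try {
            // Load without initializing the class; we only inspect its API.
            Class<?> type = Class.forName(className, false, PoiApiProbe.class.getClassLoader());
            Method method = type.getMethod(methodName);
            return method.getReturnType().getName().equals(returnTypeName);
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or jar mismatch: report as absent.
            return false;
        }
    }

    public static void main(String[] args) {
        // Names taken from the stack trace and the description in this issue.
        String record = "org.apache.poi.hwmf.record.HwmfRecord";
        String recordType = "org.apache.poi.hwmf.record.HwmfRecordType";
        System.out.println("getRecordType():    " + hasNoArgMethod(record, "getRecordType", recordType));
        System.out.println("getWmfRecordType(): " + hasNoArgMethod(record, "getWmfRecordType", recordType));
    }
}
```

Run it with the same classpath as the failing application. If the first line prints false, the POI jar on the classpath does not offer the method that tika-parsers 1.21 was compiled against.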



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
