[
https://issues.apache.org/jira/browse/TIKA-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890286#comment-15890286
]
Tim Allison commented on TIKA-2284:
-----------------------------------
Would you be able to try a nightly build from
[Jenkins|https://builds.apache.org/view/Tika/job/Tika-trunk/1212/org.apache.tika$tika-app/]
using the new experimental SAX parser for docx?
{noformat}
java -jar tika-app-1.15-SNAPSHOT.jar --config=tika-config.xml testFile.docx
{noformat}
where tika-config.xml is:
{noformat}
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser">
      <params>
        <param name="useSAXDocxExtractor" type="bool">true</param>
      </params>
    </parser>
  </parsers>
</properties>
{noformat}
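If the flag does not seem to take effect, it may be worth confirming that the config file is well-formed and that the parameter sits under the OOXMLParser entry before running tika-app. A minimal sketch of such a check, using only the JDK's XML APIs (no Tika classes); the class name TikaConfigCheck and the helper paramValue are hypothetical names made up for this illustration, and the check only reads the XML, it does not tell you how Tika itself interprets it:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika: reads a tika-config.xml string
// and reports the value of a <param> under a given <parser class="...">.
class TikaConfigCheck {

    // The config from the comment above, reproduced as a string.
    static final String CONFIG =
          "<properties>\n"
        + "  <parsers>\n"
        + "    <parser class=\"org.apache.tika.parser.DefaultParser\"/>\n"
        + "    <parser class=\"org.apache.tika.parser.microsoft.ooxml.OOXMLParser\">\n"
        + "      <params>\n"
        + "        <param name=\"useSAXDocxExtractor\" type=\"bool\">true</param>\n"
        + "      </params>\n"
        + "    </parser>\n"
        + "  </parsers>\n"
        + "</properties>\n";

    // Returns the trimmed text of the named <param> for the given parser class,
    // or null if that parser entry has no such param.
    static String paramValue(String xml, String parserClass, String paramName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Malformed XML ends up here.
            throw new IllegalStateException("config is not well-formed XML", e);
        }
        NodeList parsers = doc.getElementsByTagName("parser");
        for (int i = 0; i < parsers.getLength(); i++) {
            Element parser = (Element) parsers.item(i);
            if (!parserClass.equals(parser.getAttribute("class"))) {
                continue;
            }
            NodeList params = parser.getElementsByTagName("param");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if (paramName.equals(param.getAttribute("name"))) {
                    return param.getTextContent().trim();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String value = paramValue(
                CONFIG,
                "org.apache.tika.parser.microsoft.ooxml.OOXMLParser",
                "useSAXDocxExtractor");
        System.out.println("useSAXDocxExtractor = " + value);
    }
}
```

To check the file on disk instead, the same helper could be fed the contents of tika-config.xml read via java.nio.file.Files.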
> Caused by: org.apache.xmlbeans.XmlException: error: The document is not a
> ftr@http://schemas.openxmlformats.org/wordprocessingml/2006/main: document
> element local name mismatch expected ftr got hdr
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-2284
> URL: https://issues.apache.org/jira/browse/TIKA-2284
> Project: Tika
> Issue Type: Bug
> Components: core, parser
> Affects Versions: 1.13
> Reporter: Sharath Kumar
>
> I get the below parsing error for the attached doc
> Caused by: org.apache.xmlbeans.XmlException: error: The document is not a
> ftr@http://schemas.openxmlformats.org/wordprocessingml/2006/main: document
> element local name mismatch expected ftr got hdr
> at org.apache.xmlbeans.impl.store.Locale.verifyDocumentType(Locale.java:459)
> at org.apache.xmlbeans.impl.store.Locale.autoTypeDocument(Locale.java:364)
> at