[ https://issues.apache.org/jira/browse/TIKA-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890286#comment-15890286 ]

Tim Allison commented on TIKA-2284:
-----------------------------------

Would you be able to try a nightly build from [Jenkins|https://builds.apache.org/view/Tika/job/Tika-trunk/1212/org.apache.tika$tika-app/] using the new experimental SAX parser for docx? For example:

{noformat}
java -jar tika-app-1.15-SNAPSHOT.jar --config=tika-config.xml testFile.docx
{noformat}

where tika-config.xml is:
{noformat}
<properties>
    <parsers>
        <parser class="org.apache.tika.parser.DefaultParser"/>
        <parser class="org.apache.tika.parser.microsoft.ooxml.OOXMLParser">
            <params>
                <param name="useSAXDocxExtractor" type="bool">true</param>
            </params>
        </parser>
    </parsers>
</properties>
{noformat}
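
If the new extractor does not seem to kick in, a malformed or mistyped config file is worth ruling out first. The following is only a minimal sketch that uses the JDK's XML parser, not Tika itself, to check that the config above is well-formed and that useSAXDocxExtractor is set to true under the OOXMLParser entry. The class name TikaConfigCheck and the method isSaxDocxEnabled are hypothetical names chosen for illustration; they are not part of Tika.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Tika): sanity-checks a tika-config.xml string.
class TikaConfigCheck {

    // The same config as quoted in this comment, inlined so the sketch is self-contained.
    static final String SAMPLE_CONFIG =
        "<properties>"
      + "<parsers>"
      + "<parser class=\"org.apache.tika.parser.DefaultParser\"/>"
      + "<parser class=\"org.apache.tika.parser.microsoft.ooxml.OOXMLParser\">"
      + "<params>"
      + "<param name=\"useSAXDocxExtractor\" type=\"bool\">true</param>"
      + "</params>"
      + "</parser>"
      + "</parsers>"
      + "</properties>";

    static final String OOXML_PARSER = "org.apache.tika.parser.microsoft.ooxml.OOXMLParser";

    // Returns true only if the XML is well-formed and the OOXMLParser entry
    // carries <param name="useSAXDocxExtractor">true</param>.
    static boolean isSaxDocxEnabled(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Malformed XML or parser setup failure: report it as a bad config.
            throw new IllegalArgumentException("Config is not well-formed XML", e);
        }
        NodeList params = doc.getElementsByTagName("param");
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            if (!"useSAXDocxExtractor".equals(param.getAttribute("name"))) {
                continue;
            }
            // Expected nesting: <parser> -> <params> -> <param>
            Node paramsNode = param.getParentNode();
            Node parserNode = paramsNode == null ? null : paramsNode.getParentNode();
            if (parserNode instanceof Element
                    && OOXML_PARSER.equals(((Element) parserNode).getAttribute("class"))) {
                return Boolean.parseBoolean(param.getTextContent().trim());
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("SAX docx extractor enabled: " + isSaxDocxEnabled(SAMPLE_CONFIG));
    }
}
```

This only inspects the XML; whether the SAX extractor is actually used still depends on running a build that supports the parameter, such as the 1.15-SNAPSHOT nightly mentioned above.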



> Caused by: org.apache.xmlbeans.XmlException: error: The document is not a 
> ftr@http://schemas.openxmlformats.org/wordprocessingml/2006/main: document 
> element local name mismatch expected ftr got hdr
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2284
>                 URL: https://issues.apache.org/jira/browse/TIKA-2284
>             Project: Tika
>          Issue Type: Bug
>          Components: core, parser
>    Affects Versions: 1.13
>            Reporter: Sharath Kumar
>
> I get the below parsing error for the attached doc
> Caused by: org.apache.xmlbeans.XmlException: error: The document is not a 
> ftr@http://schemas.openxmlformats.org/wordprocessingml/2006/main: document 
> element local name mismatch expected ftr got hdr
>  at org.apache.xmlbeans.impl.store.Locale.verifyDocumentType(Locale.java:459)
>  at org.apache.xmlbeans.impl.store.Locale.autoTypeDocument(Locale.java:364)
>  at 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
