[ https://issues.apache.org/jira/browse/TIKA-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902906#comment-15902906 ]
Nick Burch commented on TIKA-2294:
----------------------------------

To correctly detect the OOXML sub-type, you need either the filename or the full contents plus the detector from the parsers package.

See also https://wiki.apache.org/tika/Troubleshooting%20Tika#Content_Incorrectly_Detected

> Tika inconsistently detects ooxml files as zip file sometimes
> -------------------------------------------------------------
>
>                 Key: TIKA-2294
>                 URL: https://issues.apache.org/jira/browse/TIKA-2294
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>    Affects Versions: 1.11
>         Environment: linux
>            Reporter: chanchal
>
> Tika sometimes incorrectly detects ooxml file as zip and sometimes correctly
> detects as docx/pptx/xlsx.
> Is there a possibility of it happening and how?
> I cannot share the file as it has sensitive content.
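
To illustrate why the full contents matter: an OOXML file is a ZIP container, so its leading bytes are identical to those of any other ZIP archive, and only the entries inside reveal the sub-type. The sketch below uses nothing but the JDK and is not Tika's implementation. The class name OoxmlSniffer and the method sniff are hypothetical, and the folder-name heuristic (word/, xl/, ppt/) is a simplifying assumption; in Tika itself the container-aware detector comes from the parsers package, as noted in the comment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika: shows that the OOXML sub-type
// only becomes visible after reading the entries of the ZIP container.
public class OoxmlSniffer {

    private static final String PREFIX = "application/vnd.openxmlformats-officedocument.";

    // Assumes the stream is already known to be a ZIP archive.
    // Returns an OOXML media type if the entries look like one, else application/zip.
    public static String sniff(InputStream in) throws IOException {
        boolean hasContentTypes = false;
        String subtype = null;
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.equals("[Content_Types].xml")) {
                    hasContentTypes = true;          // marker entry of an OPC package
                } else if (subtype == null && name.startsWith("word/")) {
                    subtype = PREFIX + "wordprocessingml.document";
                } else if (subtype == null && name.startsWith("xl/")) {
                    subtype = PREFIX + "spreadsheetml.sheet";
                } else if (subtype == null && name.startsWith("ppt/")) {
                    subtype = PREFIX + "presentationml.presentation";
                }
            }
        }
        // Without both signals, all we can say is that it is a ZIP file.
        return (hasContentTypes && subtype != null) ? subtype : "application/zip";
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(sniff(in));
        }
    }
}
```

A detector that only sees the first few bytes of the stream, with no filename hint, has no way to perform this kind of inspection, which is consistent with the same file being reported as zip in one code path and as docx/pptx/xlsx in another.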