[
https://issues.apache.org/jira/browse/TIKA-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017476#comment-17017476
]
Andrey Nizienko commented on TIKA-2294:
---------------------------------------
We found that a docx file saved from Google Docs is detected as zip, but after
re-saving the same file from MS Word it is detected correctly as docx. Please
find the docx file attached.
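
One way to compare the Google Docs export with the re-saved MS Word copy is to
look at the ZIP container itself. The sketch below is only a diagnostic aid and
is not Tika's detection logic. The helper name describe_package and the idea
that entry order or the content-types part matters here are assumptions. It
reports the first ZIP entry, whether [Content_Types].xml is present, and whether
that part declares the WordprocessingML main document content type.

```python
import io
import zipfile

# Content type that an OOXML package declares for the main part of a .docx file.
WORD_MAIN = (
    "application/vnd.openxmlformats-officedocument."
    "wordprocessingml.document.main+xml"
)


def describe_package(data: bytes) -> dict:
    """Summarise the ZIP-level layout of a possible OOXML file.

    Hypothetical helper for comparing two files by hand; it does not
    reproduce how Tika decides between zip and docx.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()  # entries in central-directory order
        has_types = "[Content_Types].xml" in names
        types_xml = ""
        if has_types:
            types_xml = zf.read("[Content_Types].xml").decode("utf-8", "replace")
    return {
        "first_entry": names[0] if names else None,
        "has_content_types": has_types,
        "declares_word_main": WORD_MAIN in types_xml,
        "entries": names,
    }
```

Running it on both files, for example
describe_package(open("google_doc.docx", "rb").read()), and comparing the two
results should show whether the packages differ in entry order or in the
declared content types.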
> Tika inconsistently detects ooxml files as zip file sometimes
> -------------------------------------------------------------
>
> Key: TIKA-2294
> URL: https://issues.apache.org/jira/browse/TIKA-2294
> Project: Tika
> Issue Type: Bug
> Components: mime
> Affects Versions: 1.11
> Environment: linux
> Reporter: chanchal
> Priority: Major
> Attachments: google_doc.docx
>
>
> Tika sometimes incorrectly detects an OOXML file as zip and sometimes correctly
> detects it as docx/pptx/xlsx.
> Is it possible for this to happen, and if so, how?
> I cannot share the file as it has sensitive content.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)