[
https://issues.apache.org/jira/browse/TIKA-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905228#comment-15905228
]
chanchal commented on TIKA-2294:
--------------------------------
Thanks Tim and Nick for looking into this.
What I meant was that the same file is detected as both docx and zip. If
detection runs 20 times on the same file, it returns docx 19 times and zip
once. This behaviour occurs only for a small number of OOXML files.
We have Tika deployed on multiple machines, and in one of these setups we
receive zip as the detected MIME type. Each time zip is detected, it is on a
different machine, so this does not look like a machine-specific issue.
I have read online about the thread safety of Tika, but I want to confirm once
again: is the detector thread-safe?
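
One way to narrow this down might be a small harness that runs detection on
the same file many times from several threads and tallies the distinct
results. The following is only a sketch under stated assumptions: the class
`DetectTally`, the method `tally`, and the stand-in detector `magicOnly` are
hypothetical names and are not part of Tika. `magicOnly` only looks at the
"PK" magic bytes so that the example runs without Tika on the classpath; in a
real test it would be replaced by a call into Tika's detection API, wrapped so
that checked exceptions become unchecked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class DetectTally {

    // Calls detect.apply(file) `runs` times on a pool of `threads` threads
    // and counts how often each distinct result was returned.
    static Map<String, Integer> tally(Function<Path, String> detect, Path file,
                                      int runs, int threads) throws Exception {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                Callable<Integer> task =
                        () -> counts.merge(detect.apply(file), 1, Integer::sum);
                pending.add(pool.submit(task));
            }
            for (Future<Integer> f : pending) {
                f.get(); // surfaces detector failures as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return counts;
    }

    // Hypothetical stand-in detector (NOT Tika's logic): reports zip when the
    // file starts with the "PK" magic bytes, otherwise a generic type.
    static String magicOnly(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            boolean zip = bytes.length >= 2 && bytes[0] == 'P' && bytes[1] == 'K';
            return zip ? "application/zip" : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Temporary file that carries only the zip local-file-header magic.
        Path sample = Files.createTempFile("sample", ".bin");
        Files.write(sample, new byte[] {'P', 'K', 3, 4});
        // 20 detections spread over 4 threads; prints the result tally.
        System.out.println(tally(DetectTally::magicOnly, sample, 20, 4));
        Files.delete(sample);
    }
}
```

A deterministic, thread-safe detector should produce a tally with a single
key. More than one key for the same file would point at shared state or at
how the input is handed to the detector, rather than at the machine.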
As for the file, I will check whether I can share it and get back to you.
Thanks,
> Tika inconsistently detects ooxml files as zip file sometimes
> -------------------------------------------------------------
>
> Key: TIKA-2294
> URL: https://issues.apache.org/jira/browse/TIKA-2294
> Project: Tika
> Issue Type: Bug
> Components: mime
> Affects Versions: 1.11
> Environment: linux
> Reporter: chanchal
>
> Tika sometimes incorrectly detects an OOXML file as zip and sometimes
> correctly detects it as docx/pptx/xlsx.
> Is it possible for this to happen, and if so, how?
> I cannot share the file as it has sensitive content.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)