[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304813#comment-14304813 ]
Miguel commented on TIKA-1538:
------------------------------
The file was downloaded with curl, and the server returned a gzip-compressed
response that curl saved as-is. The downloaded file was therefore not really a
JPEG, and Tika's result, "application/gzip", was of course correct. Running
curl with the -H 'Accept-encoding: gzip' --compressed options downloaded and
decompressed the file properly. Hope this helps someone, and sorry for the bug
report.
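
For anyone who hits the same thing, here is a minimal sketch of a guard that
could run before detection. It checks for the gzip magic bytes (0x1F 0x8B) and,
if they are present, decompresses the payload with the JDK's GZIPInputStream.
The class and method names (GzipUnwrap, isGzip, unwrap) are hypothetical and
not part of Tika; the sketch assumes the whole file fits in memory as a byte[].

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of Tika: unwraps a gzip-compressed payload
// so that the bytes handed to a detector are the real file content.
final class GzipUnwrap {

    // A gzip stream starts with the two magic bytes 0x1F 0x8B.
    static boolean isGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xFF) == 0x1F
                && (data[1] & 0xFF) == 0x8B;
    }

    // Returns the decompressed bytes if the input is gzip, else the input unchanged.
    static byte[] unwrap(byte[] data) throws IOException {
        if (!isGzip(data)) {
            return data;
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }
}
```

With something like this in place, the bytes returned by
GzipUnwrap.unwrap( image ) could be passed to tikaObject.detect(...) instead of
the raw download. Whether silently unwrapping is appropriate depends on the
application, since a file that really is a gzip archive would be unwrapped too.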
> Wrong mimetype detection
> ------------------------
>
> Key: TIKA-1538
> URL: https://issues.apache.org/jira/browse/TIKA-1538
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.7
> Reporter: Miguel
> Attachments: Product345037-000.jpg
>
>
> [SCENARIO]
> - Working on a file that is supposed to be a valid JPEG (attached to this
> issue report) and that browsers and other tools detect and handle
> correctly. (Detection works well for almost all other images checked.)
> - Using tika-app-1.7.jar
> - Java code snippet:
> Tika tikaObject = new Tika();
> ...
> // image is a byte[] containing the JPEG file
> String contentTypeTika = tikaObject.detect( image );
> [RESULT]
> detected mimetype is "application/gzip" ("application/x-gzip" if using
> tika-app-1.4.jar or tika-app-1.5.jar)
> [EXPECTED]
> "image/jpeg"