Konstantin created TIKA-1485:
--------------------------------

             Summary: Wrong mimetype detection
                 Key: TIKA-1485
                 URL: https://issues.apache.org/jira/browse/TIKA-1485
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.6, 1.5
            Reporter: Konstantin


[SCENARIO]
 # Get a valid JPG (almost all file types that have a header are affected).
 # Insert some whitespace or line breaks at the beginning of the file (it becomes invalid).
 # Check the file's mimetype:
{code}
        Metadata metadata = new Metadata();
        metadata.set(Metadata.RESOURCE_NAME_KEY, file.toString());
        MediaType mimetype = new TikaConfig().getDetector().detect(TikaInputStream.get(file), metadata);
{code}

[RESULT]
 detected mimetype is "image/jpeg"
[EXPECTED]
"application/octet-stream"
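
To illustrate why the modified file should no longer count as a JPEG, here is a minimal, self-contained sketch of a strict magic-byte check at offset 0. It is not Tika's implementation: the class `JpegMagicSketch` and its helpers `looksLikeJpeg` and `prepend` are hypothetical names used only for this example. The only fact it relies on is that a JPEG stream starts with the bytes FF D8 FF. Whether Tika then falls back to the file name supplied via RESOURCE_NAME_KEY is an assumption that this report does not confirm.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not code from Tika.
class JpegMagicSketch {

    // A JPEG stream starts with the SOI marker FF D8, followed by FF of the next marker.
    private static final byte[] JPEG_MAGIC = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // Returns true only if the data begins with the JPEG magic bytes at offset 0.
    static boolean looksLikeJpeg(byte[] data) {
        if (data.length < JPEG_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < JPEG_MAGIC.length; i++) {
            if (data[i] != JPEG_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns a new array consisting of prefix followed by data.
    static byte[] prepend(byte[] prefix, byte[] data) {
        byte[] out = new byte[prefix.length + data.length];
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        System.arraycopy(data, 0, out, prefix.length, data.length);
        return out;
    }

    public static void main(String[] args) {
        // First four bytes of a typical JFIF file (SOI followed by the APP0 marker).
        byte[] header = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        // Same bytes with a space and a line break inserted at the beginning.
        byte[] corrupted = prepend(" \n".getBytes(StandardCharsets.US_ASCII), header);

        System.out.println(looksLikeJpeg(header));    // prints "true"
        System.out.println(looksLikeJpeg(corrupted)); // prints "false"
    }
}
```

With a check like this, the file from the scenario fails the header test, which is why "application/octet-stream" is the expected result.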



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
