Gregory Lepore created TIKA-4041:
------------------------------------

             Summary: More rigorous file type checking for .arc files
                 Key: TIKA-4041
                 URL: https://issues.apache.org/jira/browse/TIKA-4041
             Project: Tika
          Issue Type: Improvement
            Reporter: Gregory Lepore
         Attachments: 315.ARC, ACCTG.ARC

I am seeing files with the .arc file extension being identified as
application/x-internet-archive. However, if I'm reading the tika-mimetypes.xml
file correctly, they shouldn't be matched, since they don't start with
filedesc://.
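
For anyone who wants to confirm the premise independently of Tika, here is a minimal sketch that only checks whether a file begins with the filedesc:// prefix mentioned above. It does not use or reproduce Tika's detection logic; the function name starts_with_arc_magic and the constant ARC_MAGIC are hypothetical names chosen for illustration, and the assumption that the prefix sits at offset 0 comes from my reading of tika-mimetypes.xml described above.

```python
import sys

# Prefix the issue text expects at the start of an Internet Archive ARC file.
# Assumption: the match is anchored at offset 0, per the reporter's reading.
ARC_MAGIC = b"filedesc://"


def starts_with_arc_magic(path):
    """Return True if the file's leading bytes equal ARC_MAGIC."""
    with open(path, "rb") as handle:
        return handle.read(len(ARC_MAGIC)) == ARC_MAGIC


if __name__ == "__main__":
    # Usage: python check_arc.py 315.ARC ACCTG.ARC
    for name in sys.argv[1:]:
        print(name, starts_with_arc_magic(name))
```

If this reports False for the attached samples while Tika still returns application/x-internet-archive, that would suggest the match comes from something other than the magic prefix, for example the file name glob.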

Is it possible there is an additional mimetype match somewhere?

Samples attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
