[ 
https://issues.apache.org/jira/browse/TIKA-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16585552#comment-16585552
 ] 

Amit Pandey commented on TIKA-2689:
-----------------------------------

I just added a magic detection configuration for ".ai" files to 
tika-mimetypes.xml, the same as the one used for PDF magic detection. It works 
fine for me; I tested some basic cases using PDF and ".ai" files.

Here is the configuration that I used:

<mime-type type="application/illustrator">
    <alias type="application/vnd.adobe.illustrator"/>
    <acronym>AI</acronym>
    <_comment>Adobe Illustrator Artwork</_comment>
    <tika:link>http://en.wikipedia.org/wiki/Adobe_Illustrator_Artwork</tika:link>
    <magic priority="50">
      <!-- Normally just %PDF- -->
      <match value="%PDF-" type="string" offset="0"/>
      <!-- Sometimes has a UTF-8 Byte Order Mark first -->
      <match value="\xef\xbb\xbf%PDF-" type="string" offset="0"/>
    </magic>
    <magic priority="20">
      <!-- Low priority match for %PDF-#.# near the start of the file -->
      <!-- Can trigger false positives, so set the priority rather low here -->
      <match value="%PDF-1." type="string" offset="1:512"/>
      <match value="%PDF-2." type="string" offset="1:512"/>
    </magic>
    <glob pattern="*.ai"/>
    <sub-class-of type="application/postscript"/>
</mime-type>
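
Since the magic rules above are the same as the PDF ones, the bytes alone 
cannot tell an ".ai" file from a plain PDF; in this configuration only the 
"*.ai" glob separates the two. Below is a minimal Java sketch of that idea. It 
is not Tika's implementation: the class AiMagicSketch and its methods are 
hypothetical names, it covers only the priority-50 rule and the glob, and the 
tie-break on file name is an assumption of the sketch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; this is not how Tika is implemented. */
final class AiMagicSketch {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** True if data contains magic starting exactly at offset. */
    static boolean matchesAt(byte[] data, byte[] magic, int offset) {
        if (offset < 0 || offset + magic.length > data.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[offset + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Mirrors the priority-50 rule: "%PDF-" at offset 0, optionally after a UTF-8 BOM. */
    static boolean hasPdfMagic(byte[] data) {
        return matchesAt(data, PDF_MAGIC, 0)
                || (matchesAt(data, UTF8_BOM, 0) && matchesAt(data, PDF_MAGIC, UTF8_BOM.length));
    }

    /** Mirrors the "*.ai" glob (case-sensitive in this sketch). */
    static boolean hasAiName(String name) {
        return name != null && name.endsWith(".ai");
    }

    /** Assumed tie-break: when the PDF magic matches, the file name decides. */
    static String detect(byte[] data, String name) {
        boolean magic = data != null && hasPdfMagic(data);
        if (hasAiName(name)) {
            return "application/illustrator";
        }
        return magic ? "application/pdf" : "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] header = "%PDF-1.5\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(header, "example.ai"));
        System.out.println(detect(header, "example.pdf"));
        System.out.println(detect(header, null));
    }
}
```

With the same "%PDF-" header, the sketch reports application/illustrator only 
when the name ends in ".ai"; without a name it can only report application/pdf.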

 

 

Here is the attached "example.ai": [^example.ai]

 

^Thanks^

^Amit Pandey^

> *.ai type (Adobe illustrator ) files are not detected correctly.
> ----------------------------------------------------------------
>
>                 Key: TIKA-2689
>                 URL: https://issues.apache.org/jira/browse/TIKA-2689
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.16, 1.17, 1.18
>            Reporter: Amit Pandey
>            Priority: Major
>         Attachments: example.ai
>
>
> There is an inconsistency in detecting *.ai files when using different 
> overloads of the detect method. When I use _detect(String filename)_, it gives 
> the correct file type, "*application/illustrator*". If I use 
> _detect(InputStream is, String filename)_ or _detect(File fileObj)_, it gives 
> the file type "*application/pdf*".
> Here is the sample code I used:
>   
> [https://stackoverflow.com/questions/51359351/tika-detect-method-not-giving-same-exact-file-type|http://example.com/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
