[ 
https://issues.apache.org/jira/browse/TIKA-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan van der Knijff updated TIKA-2468:
---------------------------------------
    Description: 
While running the Tika detector on some old Quattro Pro for DOS spreadsheets, I 
noticed these files are identified as "application/x-123" (Lotus 1-2-3). This 
happens because the magic patterns for "application/x-123" only cover the 
first 4 bytes, and one of those patterns collides with the Quattro Pro for DOS 
magic pattern. I've created a patch which includes more specific mimetype 
definitions and magic patterns for both Lotus 1-2-3 and Quattro Pro. 
Patch is on its way!

[Pull request|https://github.com/apache/tika/pull/209/files]

  was:While running the Tika detector on some old Quattro Pro for DOS 
spreadsheets, I noticed these files are identified as "application/x-123" 
(Lotus 1-2-3). This happens because the magic patterns for 
"application/x-123" only cover the first 4 bytes, and one of those patterns 
collides with the Quattro Pro for DOS magic pattern. I've created a patch 
which includes more specific mimetype definitions and magic patterns for 
both Lotus 1-2-3 and Quattro Pro. Patch is on its way!


> Improved detection of Lotus 1-2-3 and Quattro Pro spreadsheets
> --------------------------------------------------------------
>
>                 Key: TIKA-2468
>                 URL: https://issues.apache.org/jira/browse/TIKA-2468
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.16
>            Reporter: Johan van der Knijff
>            Priority: Minor
>
> While running the Tika detector on some old Quattro Pro for DOS spreadsheets, 
> I noticed these files are identified as "application/x-123" (Lotus 1-2-3). 
> This happens because the magic patterns for "application/x-123" only cover 
> the first 4 bytes, and one of those patterns collides with the Quattro Pro 
> for DOS magic pattern. I've created a patch which includes more specific 
> mimetype definitions and magic patterns for both Lotus 1-2-3 and 
> Quattro Pro. Patch is on its way!
> [Pull request|https://github.com/apache/tika/pull/209/files]
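
For readers unfamiliar with magic-based detection, the collision described above can be sketched roughly as follows. This is only an illustration and not the code from the pull request: the function names are hypothetical, the "application/x-quattro-pro" type name is used here as a placeholder, and the byte values are assumptions based on the usual layout of the leading BOF record in these formats (a 4-byte record header followed by a 2-byte version code).

```python
# Illustrative sketch only: signatures below are assumptions, not the
# patterns shipped in the actual Tika patch.

# Old-style rule: only the first 4 bytes (BOF record type + length) are checked.
PREFIX_ONLY_MAGIC = {
    "application/x-123": bytes.fromhex("00000200"),
}

# More specific rules: the 2-byte version code after the header is included.
SPECIFIC_MAGIC = {
    "application/x-123": [
        bytes.fromhex("000002000404"),  # assumed Lotus 1-2-3 WKS signature
        bytes.fromhex("000002000604"),  # assumed Lotus 1-2-3 WK1 signature
    ],
    "application/x-quattro-pro": [
        bytes.fromhex("000002002051"),  # assumed Quattro Pro for DOS signature
    ],
}

FALLBACK = "application/octet-stream"


def detect_prefix_only(data: bytes) -> str:
    """Match on the 4-byte prefix alone; this is where the collision occurs."""
    for mime, magic in PREFIX_ONLY_MAGIC.items():
        if data.startswith(magic):
            return mime
    return FALLBACK


def detect_specific(data: bytes) -> str:
    """Match on the longer signatures, so the two formats no longer overlap."""
    for mime, magics in SPECIFIC_MAGIC.items():
        if any(data.startswith(magic) for magic in magics):
            return mime
    return FALLBACK


if __name__ == "__main__":
    quattro_header = bytes.fromhex("000002002051")
    print(detect_prefix_only(quattro_header))
    print(detect_specific(quattro_header))
```

With the 4-byte rule a Quattro Pro header is reported as Lotus 1-2-3, because both formats share the same record header; extending the pattern by the version bytes is enough to tell them apart in this sketch.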



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
