Johan van der Knijff created TIKA-2468:
------------------------------------------

             Summary: Improved detection of Lotus 1-2-3 and Quattro Pro spreadsheets
                 Key: TIKA-2468
                 URL: https://issues.apache.org/jira/browse/TIKA-2468
             Project: Tika
          Issue Type: Improvement
          Components: mime
    Affects Versions: 1.16
            Reporter: Johan van der Knijff
            Priority: Minor


While running the Tika detector on some old Quattro Pro for DOS spreadsheets, I 
noticed that these files are identified as "application/x-123" (Lotus 1-2-3). 
This happens because the magic patterns for "application/x-123" only cover the 
first 4 bytes, and one of those patterns collides with the Quattro Pro for DOS 
magic pattern. I've created a patch that includes more specific mimetype 
definitions and magic patterns for both Lotus 1-2-3 and Quattro Pro. 
Patch is on its way!
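
To illustrate the collision described above, here is a minimal sketch of 
prefix-based magic matching. It is not Tika's implementation and not the 
patch itself: the `detect` helper, the two pattern tables, the byte values, 
and the "application/x-quattro-pro" label are all assumptions chosen only to 
show how a 4-byte pattern can claim a file that a longer pattern would tell 
apart.

```python
# Hypothetical signature tables. The byte values are illustrative
# assumptions, not the patterns from the Tika patch.
SHORT_PATTERNS = [
    # Only the first 4 bytes are checked.
    ("application/x-123", bytes.fromhex("00000200")),
]

LONG_PATTERNS = [
    # Same 4-byte prefix, but the following bytes differ per format.
    ("application/x-123", bytes.fromhex("0000020006040600")),
    ("application/x-quattro-pro", bytes.fromhex("0000020020510600")),
]


def detect(header: bytes, patterns, default="application/octet-stream"):
    """Return the type of the first pattern that is a prefix of header."""
    for mime_type, magic in patterns:
        if header.startswith(magic):
            return mime_type
    return default


# Assumed example headers for the two formats.
lotus_header = bytes.fromhex("0000020006040600")
quattro_header = bytes.fromhex("0000020020510600")

# With the short table, the Quattro Pro header is claimed by Lotus 1-2-3.
print(detect(quattro_header, SHORT_PATTERNS))
# With the longer patterns, the two headers are told apart.
print(detect(quattro_header, LONG_PATTERNS))
print(detect(lotus_header, LONG_PATTERNS))
```

The sketch relies on longer patterns alone. A real detector may also weigh 
pattern priority, which is not modelled here.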



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
