Johan van der Knijff created TIKA-2468:
------------------------------------------
Summary: Improved detection of Lotus 1-2-3 and Quattro Pro spreadsheets
Key: TIKA-2468
URL: https://issues.apache.org/jira/browse/TIKA-2468
Project: Tika
Issue Type: Improvement
Components: mime
Affects Versions: 1.16
Reporter: Johan van der Knijff
Priority: Minor
While running the Tika detector on some old Quattro Pro for DOS spreadsheets, I
noticed that these files are identified as "application/x-123" (Lotus 1-2-3). This
happens because the magic patterns for "application/x-123" only cover
the first 4 bytes, and one of those patterns collides with the Quattro
Pro for DOS magic pattern. I've created a patch that includes more specific
mimetype definitions and magic patterns for both Lotus 1-2-3 and Quattro Pro.
Patch is on its way!
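
To illustrate the collision described above, here is a minimal sketch of prefix-based magic matching. It is not Tika's implementation and does not reproduce the patch: the byte signatures, the "application/x-quattro-pro" type name, and the detect() helper are all assumptions chosen for illustration. It only shows how a 4-byte pattern can claim a file that a longer, more specific pattern would identify correctly.

```python
# Hypothetical signature tables. The byte values are illustrative
# assumptions, not the patterns shipped with Tika or with the patch.
SHORT_MAGIC = {
    # Only the first 4 bytes are checked, as in the reported behaviour.
    "application/x-123": [bytes.fromhex("00000200"), bytes.fromhex("00001a00")],
}

LONG_MAGIC = {
    # Longer patterns that extend past the shared 4-byte prefix.
    "application/x-123": [
        bytes.fromhex("0000020006040600"),
        bytes.fromhex("00001a0000100400"),
    ],
    "application/x-quattro-pro": [bytes.fromhex("000002002051")],
}


def detect(header: bytes, table: dict) -> str:
    """Return the type whose longest pattern is a prefix of header."""
    best_type = "application/octet-stream"
    best_len = 0
    for mime_type, patterns in table.items():
        for pattern in patterns:
            if header.startswith(pattern) and len(pattern) > best_len:
                best_type, best_len = mime_type, len(pattern)
    return best_type


# A made-up header that starts with the assumed Quattro Pro signature.
quattro_header = bytes.fromhex("000002002051") + bytes(10)
# A made-up header that starts with an assumed Lotus 1-2-3 signature.
lotus_header = bytes.fromhex("0000020006040600") + bytes(10)

short_result = detect(quattro_header, SHORT_MAGIC)
long_result = detect(quattro_header, LONG_MAGIC)
lotus_result = detect(lotus_header, LONG_MAGIC)
```

With the short table, the Quattro Pro header is claimed by the Lotus 1-2-3 entry because both share the same first 4 bytes. With the longer patterns, each header matches only its own type.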
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)