Reduce duplication between POIFSDocumentType (in OfficeParser) and
POIFSContainerDetector
-----------------------------------------------------------------------------------------
Key: TIKA-790
URL: https://issues.apache.org/jira/browse/TIKA-790
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 1.0
Reporter: Nick Burch
Assignee: Nick Burch
For historical reasons, we now have two parts of Tika that try to identify
the type of an OLE2-based file.
POIFSDocumentType is able to detect a few kinds of files that
POIFSContainerDetector cannot (e.g. Encrypted and OLE Native), mostly ones
that may not map well onto MIME types. POIFSDocumentType also lacks some of the
logic in the main detector, and only covers the file types that the office
parser supports.
We should probably try to reduce the duplication. One option is to add the
few extra types to the detector somehow; the other is to run the detector
first and do the additional, more specific checks afterwards.
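As a rough illustration of the second option (detector first, extra checks
after), here is a minimal sketch. It is not taken from the Tika source: the
class Ole2TypeSketch, its methods and the ExtraType enum are all hypothetical,
the detector is a simplified stand-in that only looks at top-level entry names,
and the entry names used for the extra checks ("EncryptedPackage" and
"\u0001Ole10Native") are assumptions about what the parser-side code keys on.

```java
import java.util.Collections;
import java.util.Set;

// Sketch only: all names below are hypothetical, not the real Tika classes.
class Ole2TypeSketch {

    // Generic fallback type used here for "some OLE2 file, nothing more specific".
    static final String GENERIC_OLE2 = "application/x-tika-msoffice";

    // Kinds that do not map well onto a MIME type, so they stay outside the detector.
    enum ExtraType { NONE, ENCRYPTED, OLE10_NATIVE }

    // Simplified stand-in for the container detector: top-level entry names in,
    // MIME type string out.
    static String detectMimeType(Set<String> entryNames) {
        if (entryNames.contains("WordDocument")) return "application/msword";
        if (entryNames.contains("Workbook")) return "application/vnd.ms-excel";
        if (entryNames.contains("PowerPoint Document")) return "application/vnd.ms-powerpoint";
        return GENERIC_OLE2;
    }

    // Option two from the description: run the detector first, and only do the
    // parser-specific checks when it could not give a specific answer.
    static ExtraType extraType(Set<String> entryNames) {
        String mime = detectMimeType(entryNames);
        if (!GENERIC_OLE2.equals(mime)) {
            return ExtraType.NONE; // the detector already identified the file
        }
        if (entryNames.contains("EncryptedPackage")) return ExtraType.ENCRYPTED;
        if (entryNames.contains("\u0001Ole10Native")) return ExtraType.OLE10_NATIVE;
        return ExtraType.NONE;
    }

    public static void main(String[] args) {
        Set<String> names = Collections.singleton("EncryptedPackage");
        System.out.println(detectMimeType(names) + " / " + extraType(names));
    }
}
```

With this split, the MIME type mapping lives in one place and the parser keeps
only the handful of checks that have no sensible MIME type.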