[ https://issues.apache.org/jira/browse/TIKA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109247#comment-16109247 ]
Tim Allison commented on TIKA-2436:
-----------------------------------
Thank you for explaining this in more detail, [~mcaruanagalizia].
[~gagravarr], given that EMF and WMF are nearly always embedded in MSOffice
files, should we do the detection there and enable the EMFParser and WMFParser
to handle EMZ and WMZ, respectively? If someone somehow came across a
standalone EMZ file, they'd get GZ->EMF.
So, that's alternative 1. Alternatives 2 and 3 I really don't like:
2) create a separate detector for emz/wmz
3) add emf/wmf detection inside the CompressorParser...the HORROR!
Option 4, I guess, is to stop now and leave it as is, but that doesn't meet
[~mcaruanagalizia]'s use case.
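
To make alternative 1 a bit more concrete, here is a rough sketch of the
sniffing step a parser could run before deciding whether to decompress. It is
not Tika code: the EmzSniffer class and its method names are hypothetical, and
it assumes the usual EMF header layout (record type 1 at offset 0, the
" EMF" signature at offset 40) plus the standard gzip magic bytes. WMZ would
need an analogous check against the WMF header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration only; not part of Tika. */
class EmzSniffer {

    // Number of leading bytes needed to see the EMF signature at offset 40..43.
    private static final int EMF_HEADER_PREFIX = 44;

    /** True if the data starts with the gzip magic bytes 0x1F 0x8B. */
    static boolean isGzip(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0x1F
                && (data[1] & 0xFF) == 0x8B;
    }

    /** True if the bytes look like the start of an EMF header. */
    static boolean isEmf(byte[] h) {
        return h.length >= EMF_HEADER_PREFIX
                // EMR_HEADER record type, 0x00000001 little-endian
                && h[0] == 0x01 && h[1] == 0 && h[2] == 0 && h[3] == 0
                // " EMF" signature
                && h[40] == ' ' && h[41] == 'E' && h[42] == 'M' && h[43] == 'F';
    }

    /** True if the data is gzip and its decompressed prefix looks like EMF. */
    static boolean looksLikeEmz(byte[] data) {
        if (!isGzip(data)) {
            return false;
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] head = new byte[EMF_HEADER_PREFIX];
            int total = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == head.length && isEmf(head);
        } catch (IOException e) {
            // Truncated or corrupt gzip: treat as "not EMZ" rather than failing.
            return false;
        }
    }

    /** Convenience for building sample input: gzip-compress a byte array. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

Only the first 44 decompressed bytes are read, so the check stays cheap even
for large files; a parser that gets a positive result would then wrap the
stream in a GZIPInputStream and parse as usual.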
> Support for GZIP-compressed EMF files
> -------------------------------------
>
> Key: TIKA-2436
> URL: https://issues.apache.org/jira/browse/TIKA-2436
> Project: Tika
> Issue Type: Improvement
> Components: mime, parser
> Affects Versions: 1.15
> Reporter: Matthew Caruana Galizia
> Attachments: image004.emz
>
>
> Tika is currently detecting EMZ (compressed EMF) files as simple gzip files.
> These files should instead be detected as EMF files and the EMFParser should
> perform decompression transparently.