[ https://issues.apache.org/jira/browse/TIKA-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745659#comment-17745659 ]
Tim Allison commented on TIKA-4093:
-----------------------------------
Also found a CorelDRAW stream here:
[https://corpora.tika.apache.org/base/docs/commoncrawl3/AZ/AZG2X4VXB3KIEDT3OVZC4R645KU5VSOF]
I was tempted to extract that stream as the contents, but the raw CorelDRAW
stream contains a complete CorelDRAW file, and there are supplementary streams
with extra data that may be used(?) to render the image.
A deep dive on these streams would be useful.
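
To make the question concrete, here is a minimal sketch of picking one payload stream from an object-pool entry by preference order. The class name, the method name, and the order of the list are all hypothetical and are not Tika's current logic; the stream names are the ones mentioned on this issue, with "\u0001Ole10Native" assumed to be the on-disk name of the OLE10Native stream. It works on plain stream names only, so it does not address the supplementary streams noted above.

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;

class ObjectPoolStreamPicker {

    // Hypothetical preference order, for illustration only.
    // Earlier entries win when several candidate streams are present.
    static final List<String> PREFERRED = List.of(
            "CONTENTS",
            "\u0001Ole10Native",   // assumed on-disk name of OLE10Native
            "Equation Native",
            "CorelDRAW");

    // Returns the actual name of the first preferred stream that exists
    // in the entry, comparing names case-insensitively.
    static Optional<String> pick(Collection<String> streamNames) {
        for (String wanted : PREFERRED) {
            for (String actual : streamNames) {
                if (actual.equalsIgnoreCase(wanted)) {
                    return Optional.of(actual);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stream names as they might appear in one object-pool entry.
        List<String> names = List.of("\u0001CompObj", "\u0003ObjInfo", "CorelDRAW");
        System.out.println(pick(names).orElse("(no known payload stream)"));
    }
}
```

Whether a fixed priority list is enough, or whether the choice should depend on the object's type, is part of what the deep dive would need to answer.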
> Deep dive on OLE2 object pools
> ------------------------------
>
> Key: TIKA-4093
> URL: https://issues.apache.org/jira/browse/TIKA-4093
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Priority: Trivial
> Attachments: 6TJD5TNSDB73QV6XEW46CPR6MSHXRRBN
>
>
> In looking at some OLE2 files, I noticed the attached file. This has an
> object pool that we're not properly processing. We're looking for the
> "contents" stream, but other streams might be more relevant here... maybe the
> OLE10Native "Equation Native"?
> We should look into this at some point.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)