[
https://issues.apache.org/jira/browse/TIKA-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983191#comment-13983191
]
Tim Allison commented on TIKA-1283:
-----------------------------------
Yes, I absolutely agree with the distinction. Is there a clean way of
implementing it that wouldn't break too much?
Perhaps thumbnails could be handled quite differently from the regular
.get(String/Property...) accessors in Metadata:
{noformat}
byte[] tn = metadata.getThumbnailData();
{noformat}
One argument against this is that clients would then need an extra step to
extract thumbnails from the metadata, and EmbeddedResourceHandler would no
longer pull everything as elegantly as it does now (for users who want all
attachments and thumbnails).
Let me look into how hard it will be to associate a thumbnail with an embedded
resource. RTF is easy, but the Microsoft/OOXML formats might be a bit messy.
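
For comparison, here is a rough sketch of the flag-based alternative from the
issue description, where one handler still sees every embedded resource and
the consumer routes each one by its metadata. It is only an illustration: a
plain Map stands in for Tika's Metadata so the example runs without Tika on
the classpath, the "tika:thumbnail" key is the proposed name rather than an
existing constant, and ThumbnailRouting, Embedded, isThumbnail and select are
hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ThumbnailRouting {
    // Assumed key name, borrowed from the "tika:thumbnail" proposal;
    // not an existing Tika constant.
    public static final String THUMBNAIL_KEY = "tika:thumbnail";

    /** Stand-in for an embedded resource: a name plus its string metadata. */
    public static final class Embedded {
        public final String name;
        public final Map<String, String> metadata;

        public Embedded(String name, Map<String, String> metadata) {
            this.name = name;
            this.metadata = metadata;
        }
    }

    /** True only when the metadata carries the flag with the value "true". */
    public static boolean isThumbnail(Map<String, String> metadata) {
        // Boolean.parseBoolean(null) is false, so unflagged resources
        // are treated as regular attachments.
        return Boolean.parseBoolean(metadata.get(THUMBNAIL_KEY));
    }

    /** Names of the resources whose thumbnail status matches wantThumbnails. */
    public static List<String> select(List<Embedded> resources, boolean wantThumbnails) {
        List<String> names = new ArrayList<>();
        for (Embedded r : resources) {
            if (isThumbnail(r.metadata) == wantThumbnails) {
                names.add(r.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> flagged = new HashMap<>();
        flagged.put(THUMBNAIL_KEY, "true");

        List<Embedded> resources = Arrays.asList(
                new Embedded("thumbnail.jpeg", flagged),
                new Embedded("attachment.docx", Collections.<String, String>emptyMap()));

        System.out.println("thumbnails:  " + select(resources, true));
        System.out.println("attachments: " + select(resources, false));
    }
}
```

With this shape, a client that wants everything ignores the flag, and a
client that wants only real attachments filters on it, so no separate
thumbnail accessor is required.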
> Add "thumbnail" as possible metadata item to TikaCoreProperties
> ---------------------------------------------------------------
>
> Key: TIKA-1283
> URL: https://issues.apache.org/jira/browse/TIKA-1283
> Project: Tika
> Issue Type: Improvement
> Components: metadata
> Reporter: Tim Allison
> Priority: Minor
>
> TIKA-90 originally requested to add thumbnails to a document's metadata.
> I'd like to have a unified way of determining whether an embedded
> document/resource is a thumbnail or a regular attachment.
> With the changes in TIKA-1223 (ooxml) and TIKA-1010 (rtf), we are now pulling
> out more thumbnails than before.
> I propose adding "tika:thumbnail" to the metadata of each thumbnail image.
> The consumer can then determine what to do with the embedded resource based
> on the metadata.