[ 
https://issues.apache.org/jira/browse/TIKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066077#comment-18066077
 ] 

Tim Allison commented on TIKA-4563:
-----------------------------------

The Visio issue and the other issue are both caused by an alignment problem in 
tika-eval, triggered by some actual improvements in 3.3.0.

For Visio, the embedded Visio docs are getting aligned to zero-byte image files, 
which makes it look like they're losing content, but they aren't.

A fix in how we digest embedded OLE objects means we now get the digest of the 
actual bytes instead of the digest of zero bytes. As a result, the digests 
disagreed between the two runs, and tika-eval failed to match the embedded files 
correctly.

I'll add special handling for zero-byte digests, and then fall back to embedded 
file paths for matching.
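
Roughly, the matching could look like the sketch below. This is only an 
illustration in Python, not the tika-eval code: the `Embedded` record, the 
`align` function, and the choice of SHA-256 are all assumptions made for the 
example. It pairs embedded files on digest first, skips any file whose digest is 
the digest of zero bytes, and then pairs whatever is left on embedded file path.

```python
import hashlib
from typing import NamedTuple

# Digest of zero bytes. Assumes SHA-256; the real digest algorithm may differ.
EMPTY_DIGEST = hashlib.sha256(b"").hexdigest()


class Embedded(NamedTuple):
    """Hypothetical record for one embedded file in one run."""
    path: str    # embedded file path inside the container
    digest: str  # hex digest of the embedded bytes


def align(old, new):
    """Pair embedded files from two runs; returns a list of (old, new) pairs."""
    pairs = []
    unmatched_new = list(new)

    def take(old_item, predicate):
        # Pair old_item with the first still-unmatched new item that qualifies.
        for i, cand in enumerate(unmatched_new):
            if predicate(cand):
                pairs.append((old_item, unmatched_new.pop(i)))
                return True
        return False

    leftover_old = []
    # Pass 1: match on digest, but never trust the zero-byte digest.
    for o in old:
        if o.digest == EMPTY_DIGEST or not take(o, lambda n: n.digest == o.digest):
            leftover_old.append(o)

    # Pass 2: fall back to the embedded file path for everything still unmatched.
    for o in leftover_old:
        take(o, lambda n: n.path == o.path)

    return pairs
```

With this ordering, an old record that carried the zero-byte digest can no 
longer be paired with an unrelated zero-byte image just because the digests 
happen to agree; it is paired by path with its counterpart that now has a real 
digest.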

> Prep for 3.3.0 release
> ----------------------
>
>                 Key: TIKA-4563
>                 URL: https://issues.apache.org/jira/browse/TIKA-4563
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: kio5_perldoc.mo, tika-3.3.0-20260110.tgz, 
> tika-3.3.0-reports.tgz, tika-3.3.0.tgz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
