[ 
https://issues.apache.org/jira/browse/TIKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830954#comment-17830954
 ] 

Tilman Hausherr commented on TIKA-4218:
---------------------------------------

6FOMNUPGPA6IG66Z4NIUEQIVOR5ON46Q (an MP4 file) has lost metadata
(bierenbach: 2 | earlier: 2 | https://www.facebook.com/speedlinecablecam: 2 | 
https://www.speedline-cablecam.com: 2 | in: 2 | of: 2 | the: 2 | this: 2 | 
woods: 2 | year: 2)

EEXR753OKDGYAIXL36PZ2EGYPN477SZU and a few other files have one word in 
TOP_10_MORE_IN_A that reappears in TOP_10_MORE_IN_B with "oebps" appended. In 
this file, "secretary" becomes "secretaryoebps". I don't know whether this is 
a bug.

> Run regression tests to support 2.9.2 release
> ---------------------------------------------
>
>                 Key: TIKA-4218
>                 URL: https://issues.apache.org/jira/browse/TIKA-4218
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: 2.9.1-876503.pdf.json, 2.9.2-876503.pdf.json
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
