Sam Stephens created TIKA-3769:
----------------------------------
Summary: md5 incorrectly detected as application/marc
Key: TIKA-3769
URL: https://issues.apache.org/jira/browse/TIKA-3769
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 2.4.0
Reporter: Sam Stephens
Attachments: md5.txt
When I parse the attached text document using AutoDetectParser, it's incorrectly
detected as application/marc with no text. Other md5s I generated randomly were
correctly detected as text, so I'm guessing that the MARC parser is using some
kind of magic bytes to detect MARC files, and that this file matches them as a
false positive.
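
To illustrate the kind of false positive guessed at above, here is a minimal
sketch using only the Java standard library. It assumes a deliberately loose,
hypothetical magic check (five ASCII digits for the MARC record length followed
by one record-status letter). The class name, method name and pattern are
invented for illustration and are not Tika's actual magic definition. The first
sample digest is made up and is not the content of the attached md5.txt; the
second is the MD5 of empty input.

```java
import java.util.regex.Pattern;

class LooseMagicSketch {
    // Hypothetical loose check, NOT Tika's real rule: five digits (record
    // length) followed by a record-status letter from the set a, c, d, n, p.
    private static final Pattern LOOSE_LEADER = Pattern.compile("^[0-9]{5}[acdnp]");

    // Returns true when the start of the text satisfies the loose pattern.
    static boolean looksLikeMarcLeader(String head) {
        return LOOSE_LEADER.matcher(head).find();
    }

    public static void main(String[] args) {
        // Made-up hex digest that happens to start with five digits and 'a'.
        String unluckyDigest = "00123a4b5c6d7e8f90123456789abcde";
        // MD5 of empty input; it starts with a letter, so the check rejects it.
        String emptyInputDigest = "d41d8cd98f00b204e9800998ecf8427e";

        System.out.println(looksLikeMarcLeader(unluckyDigest));
        System.out.println(looksLikeMarcLeader(emptyInputDigest));
    }
}
```

Because hex digests are drawn from 0-9 and a-f, a short prefix-only rule like
this one would accept some digests and reject others, which would be consistent
with only some md5 files being misdetected.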
--
This message was sent by Atlassian Jira
(v8.20.7#820007)