Sam Stephens created TIKA-3768:
----------------------------------
Summary: message/rfc822 does not include Headers in extracted text
Key: TIKA-3768
URL: https://issues.apache.org/jira/browse/TIKA-3768
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 2.4.0
Reporter: Sam Stephens
Attachments: email.txt
When running AutoDetectParser on message/rfc822 structured text documents, such
as the attached [^email.txt], the extracted text does not include any of the
headers, such as the Subject, From, and To lines.
However, these lines contain useful text that I'd like to be able to extract.
I'm surprised they are missing, given the include-everything bias I saw in
https://issues.apache.org/jira/browse/TIKA-3710.
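To make the expected text concrete: in a message/rfc822 document, the header
lines are everything before the first blank line. The following is a minimal
plain-Java sketch of that split. It does not use Tika and is not how Tika's
parser works; the class name, method name, and sample message are hypothetical,
and the sample is not the attached email.txt.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain Java, no Tika classes involved.
class Rfc822HeaderSketch {

    // Returns the header lines of an RFC 822 style message: everything
    // before the first empty line. Folded continuation lines (those that
    // start with a space or tab) are joined onto the previous header.
    static List<String> headerLines(String message) {
        List<String> headers = new ArrayList<>();
        for (String line : message.split("\r?\n", -1)) {
            if (line.isEmpty()) {
                break; // the first blank line separates headers from body
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && !headers.isEmpty()) {
                int last = headers.size() - 1;
                headers.set(last, headers.get(last) + " " + line.trim());
            } else {
                headers.add(line);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // Hypothetical sample message, made up for this sketch.
        String sample = String.join("\r\n",
            "From: alice@example.com",
            "To: bob@example.com",
            "Subject: Quarterly",
            " report",
            "",
            "Body text here.");
        headerLines(sample).forEach(System.out::println);
    }
}
```

The lines this returns (From, To, Subject) are the kind of text the report
expects to find in the extracted output alongside the body.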
--
This message was sent by Atlassian Jira
(v8.20.7#820007)