[ 
https://issues.apache.org/jira/browse/TIKA-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285829#comment-17285829
 ] 

Luís Filipe Nassif edited comment on TIKA-3290 at 2/17/21, 1:54 PM:
--------------------------------------------------------------------

I think this currently cannot be set in the XML config. If you are calling 
Tika from Java, the code would be something like:
{code:java}
ParseContext context = new ParseContext();
MimeConfig mimeConfig = new MimeConfig();
mimeConfig.setHeadlessParsing("text/plain");
context.set(MimeConfig.class, mimeConfig);
parser.parse(inputStream, contentHandler, metadata, context);{code}
This will apply to all messages.

Alternatively, you can override the current message/rfc822 definition with the 
old definition, which works better for your dataset. Take a look at 
[http://tika.apache.org/1.25/parser_guide.html#Add_your_MIME-Type]


was (Author: lfcnassif):
Currently I think it cannot be set in xml. If you are calling Tika from java, 
the code would be something like:
{code:java}
ParseContext context = new ParseContext();
MimeConfig mimeConfig = new MimeConfig;
mimeConfig.setHeadlessParsing("text/plain");
context.set(MimeConfig.class, mimeConfig);
parser.parse(inputStream, contentHandler, metadata, context);{code}
This will apply to all messages.

Or you can override current message/rfc822 definition with the old definition 
which is better for your dataset from your perspective. Take a look at 
http://tika.apache.org/1.25/parser_guide.html#Add_your_MIME-Type

> Extension reading it as eml instead of txt
> ------------------------------------------
>
>                 Key: TIKA-3290
>                 URL: https://issues.apache.org/jira/browse/TIKA-3290
>             Project: Tika
>          Issue Type: Bug
>          Components: core, mime
>    Affects Versions: 1.25
>            Reporter: Vamsi Molli
>            Priority: Major
>              Labels: tika-parsers
>             Fix For: 1.24.1
>
>         Attachments: test_sample_message.txt
>
>
> The attached file is being detected as eml instead of txt. With version 
> 1.24.1 it was detected as txt, but after the upgrade to 1.25 it is detected 
> as eml, so parsing fails with a mail corrupted error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
