[ 
https://issues.apache.org/jira/browse/TIKA-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034933#comment-13034933
 ] 

Jukka Zitting commented on TIKA-640:
------------------------------------

This seems like a fairly rare use case, so I'd rather make this configurable 
than change the default behavior. In the default configuration it's far 
better to avoid a possible OOM at the cost of not being able to parse some very 
rare or malformed emails.

In revision 1104444 I added support for passing a custom MimeEntityConfig 
object through the parse context. You can cover your use case by running the 
following code snippet before you pass the ParseContext object to the 
parser.

    MimeEntityConfig config = new MimeEntityConfig();
    config.setMaxLineLen(-1);
    context.set(MimeEntityConfig.class, config);
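
For illustration only, here is a minimal self-contained sketch of the lookup
pattern this relies on: the caller registers an object under its class, and
the parser falls back to a default when nothing was registered. TypedContext
and LineLimitConfig are hypothetical stand-ins written for this example, not
Tika or Mime4j classes, and the 1000-character default is taken from the issue
description below rather than from the Mime4j source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Mime4j-style config object; it models only
// the line-length limit discussed in this issue.
class LineLimitConfig {
    private int maxLineLen = 1000; // default limit as described in the issue

    int getMaxLineLen() { return maxLineLen; }

    void setMaxLineLen(int maxLineLen) { this.maxLineLen = maxLineLen; }
}

// Hypothetical, simplified model of a context that stores objects keyed by class.
class TypedContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the registered object, or defaultValue if none was set.
    <T> T get(Class<T> key, T defaultValue) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : defaultValue;
    }
}

class ContextLookupSketch {
    // What a parser could do: use the caller's config if present,
    // otherwise keep the conservative default.
    static int effectiveMaxLineLen(TypedContext context) {
        LineLimitConfig config =
                context.get(LineLimitConfig.class, new LineLimitConfig());
        return config.getMaxLineLen();
    }

    public static void main(String[] args) {
        TypedContext context = new TypedContext();
        System.out.println(effectiveMaxLineLen(context)); // default limit

        LineLimitConfig custom = new LineLimitConfig();
        custom.setMaxLineLen(-1); // -1 disables the limit in this sketch
        context.set(LineLimitConfig.class, custom);
        System.out.println(effectiveMaxLineLen(context)); // caller's override
    }
}
```

Keeping the default inside the parser and making the override opt-in means
only callers who explicitly ask for unlimited line lengths take on the
memory risk.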

> RFC822Parser should configure Mime4j not to fail reading mails containing 
> more than 1000 chars in one headers text (even if folded)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-640
>                 URL: https://issues.apache.org/jira/browse/TIKA-640
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>         Environment: All
>            Reporter: Jens Wilmer
>              Labels: mail, rfc822parser
>             Fix For: 1.0
>
>         Attachments: TIKA-640.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> Standard configuration of Mime4j accepts only 1000 characters per line and 
> 1000 characters per header. The streaming approach of Tika should not need 
> these limitations. When a limit is exceeded, an exception is thrown and none 
> of the data read is available.
> Solution:
> Replace all occurrences of:
> Parser parser = new RFC822Parser();
> by:
> MimeEntityConfig config = new MimeEntityConfig();
> config.setMaxLineLen(-1);
> config.setMaxContentLen(-1);
> Parser parser = new RFC822Parser(config);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
