[ 
https://issues.apache.org/jira/browse/TIKA-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Douglas updated TIKA-640:
----------------------------------

    Attachment: TIKA-640.patch

I'll concede that, since the Metadata structure already holds entire fields 
in strings, emails should behave no differently. This patch sets the max 
field length to unlimited, which should not be a problem in all but the most 
unusual of circumstances. Setting MaxContentLength to unlimited, as suggested 
by the issue reporter, is not necessary, as that is already the default.
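
For illustration only, here is a minimal, self-contained sketch of why a 
folded header can still trip a per-field limit even when every physical line 
is short. It is not Mime4j's or Tika's actual code: the class 
HeaderLimitSketch and its methods are hypothetical, and the convention that a 
non-positive limit means "unlimited" is assumed from the -1 values in the 
issue description.

```java
import java.util.Arrays;
import java.util.List;

public class HeaderLimitSketch {

    /** Builds a string of n copies of c (helper for creating long test headers). */
    public static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    /**
     * Unfolds one header field: the physical lines (without their CRLF) are
     * joined, keeping the leading whitespace of each continuation line.
     */
    public static String unfold(List<String> physicalLines) {
        StringBuilder field = new StringBuilder();
        for (String line : physicalLines) {
            field.append(line);
        }
        return field.toString();
    }

    /**
     * Checks a field against a length limit. Assumption: a limit of zero or
     * less (for example -1) is treated as "no limit".
     */
    public static boolean withinLimit(String field, int maxLen) {
        return maxLen <= 0 || field.length() <= maxLen;
    }

    public static void main(String[] args) {
        // Two physical lines, each well under 1000 characters.
        List<String> lines = Arrays.asList(
                "X-Long-Header: " + repeat('a', 600),
                " " + repeat('b', 600));
        String field = unfold(lines);

        System.out.println("unfolded length: " + field.length());
        System.out.println("fits a 1000-char limit: " + withinLimit(field, 1000));
        System.out.println("fits with limit -1: " + withinLimit(field, -1));
    }
}
```

Each line on its own passes a 1000-character check, but the unfolded field 
does not, which is the case the issue title describes.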

> RFC822Parser should configure Mime4j not to fail reading mails containing 
> more than 1000 chars in one header's text (even if folded)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-640
>                 URL: https://issues.apache.org/jira/browse/TIKA-640
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>         Environment: All
>            Reporter: Jens Wilmer
>              Labels: mail, rfc822parser
>         Attachments: TIKA-640.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> Standard configuration of Mime4j accepts only 1000 characters per line and 
> 1000 characters per header. The streaming approach of Tika should not need 
> these limitations; as it stands, an exception is thrown and none of the 
> data read is available.
> Solution:
> Replace all occurrences of:
> Parser parser = new RFC822Parser();
> by:
> MimeEntityConfig config = new MimeEntityConfig();
> config.setMaxLineLen(-1);
> config.setMaxContentLen(-1);
> Parser parser = new RFC822Parser(config);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
