[ 
https://issues.apache.org/jira/browse/TIKA-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020946#comment-13020946
 ] 

Jens Wilmer commented on TIKA-640:
----------------------------------

How much heap space is too much, for how long is it taken up, and what is the 
actual problem with that? Is a possible OutOfMemoryError the concern? If I 
have to process an email whose headers are larger than I can handle, I would 
rather fail to read any information and catch an OutOfMemoryError than fail to 
read any information and catch an IOException caused by a 
MaxLineLimitException, merely because a header exceeds some arbitrarily chosen 
size, which has to be smaller than what would actually fit in order to have 
any effect. Once either of these exceptions has been thrown and caught, there 
is no real difference in the program's flow. And whatever the limit is, you 
still have to handle both exceptions if a limit exists, because "too much heap 
space" depends heavily on how much heap space is available, which in turn 
depends on many parameters and changes over time.
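
A minimal sketch of that point, not real Tika or Mime4j code: readFirstLine 
and its maxLineLen parameter are hypothetical stand-ins for a parser with a 
configurable line limit (-1 meaning "no limit"). It only shows that the 
caller ends up in the same "nothing read" branch either way and has to handle 
both failure modes whenever a limit exists.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class HeaderLimitSketch {

    // Hypothetical stand-in for a parser with a line limit; -1 disables it.
    static String readFirstLine(String message, int maxLineLen) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(message));
        String line = reader.readLine();
        if (line == null) {
            return "";
        }
        if (maxLineLen >= 0 && line.length() > maxLineLen) {
            // Plays the role of the IOException caused by a line limit.
            throw new IOException("line exceeds limit of " + maxLineLen + " characters");
        }
        return line;
    }

    // Caller side: with a limit in place, both failure modes must be handled,
    // and both lead to the same outcome (no data available).
    static String describe(String message, int maxLineLen) {
        try {
            return "read " + readFirstLine(message, maxLineLen).length() + " chars";
        } catch (IOException e) {
            return "nothing read: " + e.getMessage();
        } catch (OutOfMemoryError e) {
            return "nothing read: out of memory";
        }
    }

    public static void main(String[] args) {
        // Build a header line of 9 + 2000 characters.
        String longHeader = "Subject: " + new String(new char[2000]).replace('\0', 'x');
        System.out.println(describe(longHeader, 1000)); // limit applies
        System.out.println(describe(longHeader, -1));   // limit disabled
    }
}
```

With the limit disabled, the only remaining failure mode for an oversized 
header is the OutOfMemoryError, which the caller has to expect anyway.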


> RFC822Parser should configure Mime4j not to fail reading mails containing 
> more than 1000 chars in one headers text (even if folded)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-640
>                 URL: https://issues.apache.org/jira/browse/TIKA-640
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>         Environment: All
>            Reporter: Jens Wilmer
>              Labels: mail, rfc822parser
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> The standard configuration of Mime4j accepts only 1000 characters per line 
> and 1000 characters per header. The streaming approach of Tika should not 
> need these limitations. When a limit is exceeded, an exception is thrown and 
> none of the data read is available.
> Solution:
> Replace all occurrences of:
> Parser parser = new RFC822Parser();
> by:
> MimeEntityConfig config = new MimeEntityConfig();
> config.setMaxLineLen(-1);
> config.setMaxContentLen(-1);
> Parser parser = new RFC822Parser(config);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
