[ 
https://issues.apache.org/jira/browse/TIKA-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas DiPiazza updated TIKA-2575:
------------------------------------
    Description: 
Sometimes, for example, you use Tika to parse an XLS file that isn't really
that big, maybe 60 MB, and suddenly the JVM heap usage is >800 MB, which
causes an OOM in my case.

Can we make an "abort threshold" where the Tika parse will halt if the parse
output bytes exceed this value?
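
One way a caller might approximate such a threshold on the output side: Tika
hands extracted text to a SAX ContentHandler, and a handler that throws
SAXException stops the parse that is driving it. The sketch below is only an
illustration, not Tika code. LimitedTextHandler and its limit parameter are
hypothetical names, it uses only the JDK's org.xml.sax classes, and it counts
characters rather than bytes. Tika's own BodyContentHandler also accepts a
write limit in its constructor, which may already cover this case.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects text and aborts once a character budget is exceeded. */
class LimitedTextHandler extends DefaultHandler {
    private final int limit;                       // maximum characters to accept
    private final StringBuilder text = new StringBuilder();

    LimitedTextHandler(int limit) {
        this.limit = limit;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (text.length() + length > limit) {
            // Throwing from a SAX callback stops the parser that invoked it.
            throw new SAXException("Output limit of " + limit + " characters exceeded");
        }
        text.append(ch, start, length);
    }

    /** Text accepted so far; chunks that would exceed the limit are not included. */
    String getText() {
        return text.toString();
    }
}
```

An instance would be passed wherever a ContentHandler is expected. Note that
this bounds only the text collected by the handler, not the memory the parser
itself allocates while reading the file.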

Or is it already possible for users to do this themselves, by somehow watching
the input stream as it grows?
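
Watching the input side could be sketched as a wrapper stream that counts the
bytes read and fails once a limit is passed. ThresholdInputStream and maxBytes
are hypothetical names, and this is an assumption about how a caller might do
it, not an existing Tika feature. It also only bounds how much input is
consumed: as the 60 MB file above shows, heap usage can grow far beyond the
input size, so this alone would not prevent the OOM.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: counts bytes read and aborts once maxBytes is exceeded. */
class ThresholdInputStream extends FilterInputStream {
    private final long maxBytes;
    private long count;

    ThresholdInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            add(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            add(n);
        }
        return n;
    }

    // Bytes passed over with skip() are not counted in this sketch.
    private void add(long n) throws IOException {
        count += n;
        if (count > maxBytes) {
            throw new IOException("Read " + count + " bytes, over the limit of " + maxBytes);
        }
    }

    long getCount() {
        return count;
    }
}
```

The wrapped stream would be handed to the parser in place of the original
one, and the IOException it throws would surface from the parse call.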

  was:
Sometimes, for example, you use Tika to parse an XLS file that isn't really
that big, maybe 60 MB, and suddenly the JVM heap usage is >800 MB, which
causes an OOM in my case.

Can we make an "abort threshold" where the Tika parse will halt if the heap
bytes utilized grow past a certain amount?

Or is it already possible for users to do this themselves, by somehow watching
the input stream as it grows?


> Provide a way to abort Tika parses when the Tika input stream buffer grows
> past a certain threshold
> -------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2575
>                 URL: https://issues.apache.org/jira/browse/TIKA-2575
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>
> Sometimes, for example, you use Tika to parse an XLS file that isn't really
> that big, maybe 60 MB, and suddenly the JVM heap usage is >800 MB, which
> causes an OOM in my case.
> Can we make an "abort threshold" where the Tika parse will halt if the parse
> output bytes exceed this value?
> Or is it already possible for users to do this themselves, by somehow
> watching the input stream as it grows?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
