On Thu, 17 Jul 2014, Shannon Brown wrote:
Problem:
How to avoid Out of Memory errors during Tika parsing.
Typical approaches are either to use the ForkParser, or the Tika Server.
Both ensure that if there's a fatal problem with parsing (e.g. OOM) then
the JVM with your main application in it is not affected.
I'm working on adding a daemon to Tika Server so that it will restart when it
hits an OOM or another serious problem (such as an infinite hang). That won't
be available until Tika 1.7.
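
For anyone who wants to see the isolation idea in isolation, here is a rough sketch using only the JDK (Java 9 or later). It is not Tika code and not how ForkParser or Tika Server are implemented; the class IsolatedRunner, the Outcome enum and the method names are all made up for illustration. The risky work runs in a child process with its own heap limit, and the parent only ever sees an exit code or a timeout, so an OOM or a hang in the child cannot take the parent down.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Tika): runs risky work in a child process. */
public class IsolatedRunner {

    /** Possible results of one child run, as seen by the parent. */
    public enum Outcome { OK, CRASHED, TIMED_OUT }

    /**
     * Starts the command as a separate process and waits up to timeoutSeconds.
     * A non-zero exit (for example a child JVM dying with an OOM) is reported
     * as CRASHED; a child that does not finish in time is killed and reported
     * as TIMED_OUT. The calling JVM keeps running in every case.
     */
    public static Outcome runIsolated(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Discard child output so a full pipe buffer can never block the child.
        builder.redirectErrorStream(true);
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        Process child = builder.start();

        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly(); // treat a hang as a failure
            child.waitFor();         // reap the killed process
            return Outcome.TIMED_OUT;
        }
        return child.exitValue() == 0 ? Outcome.OK : Outcome.CRASHED;
    }

    /** Path to the java launcher of the JVM that is currently running. */
    public static String javaBinary() {
        return System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real parsing job: a child JVM capped at 64 MB of heap.
        // A real setup would launch a class that does the parsing instead of
        // "-version"; that part is deliberately left out of this sketch.
        List<String> command = List.of(javaBinary(), "-Xmx64m", "-version");
        Outcome outcome = runIsolated(command, 30);
        System.out.println("child finished with outcome: " + outcome);
    }
}
```

A supervisor could call runIsolated in a loop and start a fresh child whenever it gets CRASHED or TIMED_OUT, which is roughly the behaviour the planned daemon is meant to provide. In practice ForkParser or the server is still the better choice, since they also take care of getting the document in and the parsed output back.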
To amplify Nick's recommendations:
ForkParser or Server are your best options for now.
Are there specific files/file