Narendran Solai Sridharan created TIKA-3919:
-----------------------------------------------
Summary: Out of Memory during file parsing in AutoDetectParser
Key: TIKA-3919
URL: https://issues.apache.org/jira/browse/TIKA-3919
Project: Tika
Issue Type: Bug
Components: tika-core
Affects Versions: 2.4.1
Environment: Observed during load testing in our existing environment, which
has been upgraded from Tika 1.28.1 to 2.4.1.
The attached file [^Model_comparison.xls], which is almost empty, was parsed
repeatedly by a client program driven by JMeter. We suspect the Out of Memory
error is caused by the limit "markLimit = 134217728" (128 MiB), but we are not sure.
!Large Object.PNG!
!Thread dump.PNG!
Reporter: Narendran Solai Sridharan
Attachments: Large Object.PNG, Model_comparison.xls, Thread dump-1.PNG
Out of Memory during file parsing in AutoDetectParser
java.lang.OutOfMemoryError: Java heap space
    at org.apache.tika.io.LookaheadInputStream.<init>(LookaheadInputStream.java:66)
    at org.apache.tika.io.TikaInputStream.getPath(TikaInputStream.java:683)
    at org.apache.tika.detect.microsoft.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:467)
    at org.apache.tika.detect.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:530)
    at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:85)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:142)
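
A rough sketch of the arithmetic behind the markLimit suspicion, in plain Java with no Tika dependency. It assumes (unconfirmed, but consistent with the trace ending in the LookaheadInputStream constructor) that each detection call on a non-file-backed stream allocates one buffer of markLimit bytes. The class name, the helper methods, and the thread count of 16 are hypothetical and are not taken from the report.

```java
public class MarkLimitEstimate {
    // Value quoted in the report: markLimit = 134217728 bytes.
    public static final long MARK_LIMIT_BYTES = 134_217_728L;

    /** Converts a byte count to whole mebibytes. */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /**
     * Heap needed if every concurrent parse holds one buffer of bufferBytes
     * at the same moment (assumption: one buffer per in-flight detection).
     */
    public static long peakBufferBytes(int concurrentParses, long bufferBytes) {
        return Math.multiplyExact((long) concurrentParses, bufferBytes);
    }

    public static void main(String[] args) {
        int threads = 16; // hypothetical JMeter thread count, not from the report
        long peak = peakBufferBytes(threads, MARK_LIMIT_BYTES);
        System.out.println("one buffer: " + toMiB(MARK_LIMIT_BYTES) + " MiB");
        System.out.println(threads + " concurrent buffers: " + toMiB(peak) + " MiB");
    }
}
```

Under that assumption the size of the input file does not matter: an almost empty .xls would still cost the full buffer per in-flight request, so the heap ceiling would be reached at a predictable number of concurrent parses.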
--
This message was sent by Atlassian Jira
(v8.20.10#820010)