[ https://issues.apache.org/jira/browse/TIKA-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149797#comment-17149797 ]
suchendra commented on TIKA-3097:
---------------------------------

I finally ran the load over the application. Does Tika have a streaming option for PDF files? (The docx and pptx streaming options helped a lot.)

> Out of memory while parsing docx
> --------------------------------
>
>                 Key: TIKA-3097
>                 URL: https://issues.apache.org/jira/browse/TIKA-3097
>             Project: Tika
>          Issue Type: Bug
>          Components: core, parser
>    Affects Versions: 1.24
>            Reporter: suchendra
>            Priority: Major
>         Attachments: Screenshot from 2020-05-07 08-14-25.png, samplefile.txt, test.docx
>
>
> I have written simple Scala code to extract the content from an uploaded docx file.
> The JVM goes OOM when Tika tries to parse the file. I configured the JVM heap to 1GB
> and then tried 2GB; the same issue occurs both with the jar and in my code.
> The file is attached for reference.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)