[ https://issues.apache.org/jira/browse/PDFBOX-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898824#comment-17898824 ]
Tilman Hausherr edited comment on PDFBOX-5902 at 11/16/24 4:18 AM:
-------------------------------------------------------------------

Text extracts in less than a minute with {{-Xmx2g}} on 2.0.32 with the command line tool. It is a bit slower with {{-Xmx1g}} but still takes less than a minute. (This is an impression, not an exact timing.)


was (Author: tilman):
Text extracts in less than a minute with {{-Xmx2g}} on 2.0.32 with the command line tool.

> The CPU usage of a PDF file with a size of 85.6 MB is abnormal
> --------------------------------------------------------------
>
>                 Key: PDFBOX-5902
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5902
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.31, 3.0.2 PDFBox
>            Reporter: ltzzZ
>            Priority: Major
>         Attachments: image-2024-11-15-17-07-17-802.png
>
>
> When I try to extract the text content from a PDF file with a size of
> 85.6 MB, the CPU usage is abnormal: the alarm threshold is reached, and
> extraction is also very slow, with no result after a few minutes. It is
> not a memory problem. I also tried upgrading the library version, but the
> problem persists.
> !image-2024-11-15-17-07-17-802.png!