[ https://issues.apache.org/jira/browse/PDFBOX-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180329#comment-14180329 ]
Michael Goddard commented on PDFBOX-2445:
-----------------------------------------
Here is the environment where I ran the PDFBox app JAR, and the error I saw:
[mymac:~]$ java -version
java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
[mymac:~]$ uname -a
Darwin mymac.localdomain 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64
[mymac:~]$ java -Xmx1g -jar pdfbox-app-1.8.7.jar ExtractText -console -encoding UTF-8 ./Apache_Solr_4.7_Ref_Guide.pdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.AbstractCollection.toArray(AbstractCollection.java:136)
at java.util.ArrayList.<init>(ArrayList.java:168)
at org.apache.pdfbox.cos.COSDocument.getObjects(COSDocument.java:534)
at org.apache.pdfbox.cos.COSDocument.close(COSDocument.java:591)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:258)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1233)
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
I tried it on CentOS, too:
[mylinux ~]$ java -version
java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
[mylinux ~]$ uname -a
Linux mylinux.localdomain 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[mylinux ~]$ cat /etc/redhat-release
CentOS release 6.4 (Final)
[mylinux ~]$ java -Xmx1g -jar pdfbox-app-1.8.7.jar ExtractText -console -encoding UTF-8 ./Apache_Solr_4.7_Ref_Guide.pdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.AbstractCollection.toArray(AbstractCollection.java:136)
at java.util.ArrayList.<init>(ArrayList.java:164)
at org.apache.pdfbox.cos.COSDocument.getObjects(COSDocument.java:534)
at org.apache.pdfbox.cos.COSDocument.close(COSDocument.java:591)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:258)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1233)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1198)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1123)
at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:212)
at org.apache.pdfbox.ExtractText.main(ExtractText.java:85)
at org.apache.pdfbox.PDFBox.main(PDFBox.java:58)
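
For anyone reproducing this on another machine, it may help to first confirm that the -Xmx1g limit is really what the JVM applied. A minimal sketch, using only the standard java.lang.Runtime API; the class name HeapCheck and its toMiB helper are hypothetical and not part of PDFBox:

```java
// HeapCheck is a hypothetical helper for this report, not part of PDFBox.
public class HeapCheck {

    // Convert a byte count to whole mebibytes (integer division).
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most heap the JVM will attempt to use (governed by -Xmx).
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently reserved by the JVM.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused portion of the currently reserved heap.
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Compiled with javac and run as "java -Xmx1g HeapCheck", the reported maximum should be in the neighborhood of 1024 MiB; the exact figure can differ somewhat between JVM versions and garbage collectors.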
> Out of Memory - Extract text for Apache_Solr_4.7_Ref_Guide.pdf
> --------------------------------------------------------------
>
> Key: PDFBOX-2445
> URL: https://issues.apache.org/jira/browse/PDFBOX-2445
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel
> Affects Versions: 1.8.7, 2.0.0
> Reporter: Maruan Sahyoun
>