[
https://issues.apache.org/jira/browse/PDFBOX-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794267#comment-13794267
]
Kurt M commented on PDFBOX-1748:
--------------------------------
Yes, it failed with 1.8.2. I just checked out and built the 2.0 snapshot from
trunk. It appears that the PDPage.convertToImage methods are no longer
implemented in the 2.0 version. I tried using RenderUtil instead, and that
fails with a different error:
    fullImg = RenderUtil.convertToImage( page, BufferedImage.TYPE_INT_RGB, fullRes );
[10/14/13 10:14:07:235 CDT] 0000002b SystemOut O 2013-10-14 10:14:07,235 [TaskWorker5] ERROR org.apache.pdfbox.pdmodel.font.PDSimpleFont - Error: Could not load embedded ToUnicode CMap
[10/14/13 10:14:07:235 CDT] 0000002b SystemOut O 2013-10-14 10:14:07,235 [TaskWorker5] ERROR org.apache.pdfbox.pdmodel.font.PDSimpleFont - Error: Could not load embedded ToUnicode CMap
[10/14/13 10:14:07:235 CDT] 0000002b SystemOut O 2013-10-14 10:14:07,235 [TaskWorker5] ERROR org.apache.pdfbox.pdmodel.font.PDSimpleFont - Error: Could not load embedded ToUnicode CMap
[10/14/13 10:14:07:437 CDT] 0000002b SystemOut O 2013-10-14 10:14:07,437 [TaskWorker5] ERROR org.apache.pdfbox.pdmodel.font.PDSimpleFont - Can't determine the width of the space character using 250 as default
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getSpaceWidth(PDSimpleFont.java:406)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:343)
    at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:529)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:258)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:225)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:205)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:150)
    at org.apache.pdfbox.util.RenderUtil.renderPage(RenderUtil.java:213)
    at org.apache.pdfbox.util.RenderUtil.convertToImage(RenderUtil.java:177)
    at generateSlideImages(PDFUtils.java:216)
I'm not sure if I can post one of the PDFs or not, although I don't know that
it will help. We cannot get the failure to occur on any of our test/dev
systems either. So far, it seems particular to this one system.
We've obviously been trying to figure out what's different in this environment,
but no dice so far. I don't even know what sort of environmental issues could
affect this code.
Each thread has its own instance of PDDocument.
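One way to start narrowing down what differs on the failing system is to dump the same JVM-level settings there and on a working system, then diff the two outputs. The following is a minimal sketch using only the JDK. The class name EnvReport and the choice of properties are illustrative assumptions, and none of these settings is known to be related to the failure.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of PDFBox): collects a few JVM settings
// so that two environments can be compared line by line.
class EnvReport {

    // System properties chosen for illustration only.
    private static final String[] KEYS = {
        "java.version", "java.vendor", "java.vm.name",
        "os.name", "os.arch", "file.encoding",
        "user.language", "java.awt.headless", "java.io.tmpdir"
    };

    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        for (String key : KEYS) {
            // A property that is not set is reported as "null".
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        report.put("default.charset", Charset.defaultCharset().name());
        Runtime rt = Runtime.getRuntime();
        report.put("max.heap.mb", String.valueOf(rt.maxMemory() / (1024 * 1024)));
        report.put("processors", String.valueOf(rt.availableProcessors()));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Running this inside the same application server process that performs the conversion (rather than from a separate command line) would give the most comparable numbers, since the server may set its own JVM options.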
> PDPage.convertToImage fails with IndexOutOfBoundsException: Index: 0, Size: 0
> -----------------------------------------------------------------------------
>
> Key: PDFBOX-1748
> URL: https://issues.apache.org/jira/browse/PDFBOX-1748
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.7.1, 1.8.2
> Reporter: Kurt M
>
> I am trying to create JPG images from the pages of a PDF document. The
> following code works in most environments, but we recently installed on a
> new system and are experiencing a high rate of failure there. The same files
> that fail there work in other environments.
> Here is my code:
>     PDDocument doc;
>     try {
>         RandomAccess scratchFile = new RandomAccessBuffer();
>         doc = PDDocument.loadNonSeq( pdfFile, scratchFile );
>         int pageNum = 0;
>         int fullRes = 96;
>         List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
>         BufferedImage fullImg;
>         for( PDPage page : pages ) {
>             pageNum++;
>             fullImg = page.convertToImage( BufferedImage.TYPE_INT_RGB, fullRes );
> And here is the stack trace (I think this one is from version 1.7.1):
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>     at java.util.ArrayList.get(ArrayList.java:352)
>     at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>     at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:267)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:328)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:229)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:248)
>     at java.io.FilterInputStream.read(FilterInputStream.java:77)
>     at java.io.PushbackInputStream.read(PushbackInputStream.java:133)
>     at org.apache.pdfbox.io.PushBackInputStream.read(PushBackInputStream.java:91)
>     at org.apache.pdfbox.pdfparser.BaseParser.skipSpaces(BaseParser.java:1547)
>     at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:230)
>     at org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:46)
>     at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:182)
>     at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:194)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:257)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
>     at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
>     at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:119)
>     at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:730)
>     at generateSlideImages(PDFUtils.java:213) <-- my code