Sorry for the resubmit, gmail sent while I was still editing...

---------- Forwarded message ----------
From: Todd Chandler <b.todd.chand...@gmail.com>
Date: Mon, Sep 12, 2011 at 11:37 PM
Subject: Submitting patch for bug found in com.itextpdf.text.pdf.DocumentFont
To: itext-questions@lists.sourceforge.net


Hello

I was trying a simple program to extract text from a PDF file.  I kept
getting an ArrayIndexOutOfBoundsException with the following stack trace:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 256
    at com.itextpdf.text.pdf.DocumentFont.doType1TT(DocumentFont.java:372)
    at com.itextpdf.text.pdf.DocumentFont.<init>(DocumentFont.java:121)
    at com.itextpdf.text.pdf.CMapAwareDocumentFont.<init>(CMapAwareDocumentFont.java:79)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.getFont(PdfContentStreamProcessor.java:152)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$SetTextFont.invoke(PdfContentStreamProcessor.java:578)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(PdfContentStreamProcessor.java:246)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:366)
    at com.itextpdf.text.pdf.parser.PdfReaderContentParser.processContent(PdfReaderContentParser.java:79)
    at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:73)
    at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:88)
    at org.todd.Insurance.ParsePdf.main(ParsePdf.java:15)

I took a look at the code below and believe I have found the problem:

        if (first != null && last != null && newWidths != null) {
            int f = first.intValue();
            for (int k = 0; k < newWidths.size(); ++k) {
                widths[f + k] = newWidths.getAsNumber(k).intValue();
            }
        }

The array newWidths is larger than 256, which is the size that widths is
initialized with, so the index f + k runs past the end of widths.  My patch
for it follows:

===================================================================
--- src/main/java/com/itextpdf/text/pdf/DocumentFont.java       (revision 4962)
+++ src/main/java/com/itextpdf/text/pdf/DocumentFont.java       (working copy)
@@ -368,6 +368,13 @@
         }
         if (first != null && last != null && newWidths != null) {
             int f = first.intValue();
+            if (widths.length < (f + newWidths.size())) {
+               int [] swap = new int[f + newWidths.size()];
+               for (int i = 0; i < f; ++i) {
+                       swap[i] = widths[i];
+               }
+               widths = swap;
+            }
             for (int k = 0; k < newWidths.size(); ++k) {
                 widths[f + k] = newWidths.getAsNumber(k).intValue();
             }

It does a sanity check on the array sizes before the copy loop over k
begins.  If widths is too small, it allocates a new array of the required
size (f + newWidths.size()), copies the existing entries below index f into
it, and then lets the loop fill in the rest from newWidths.
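
For anyone who wants to try the idea outside of iText, here is a minimal
standalone sketch of the same grow-then-copy step. The class WidthsSketch and
the method mergeWidths are hypothetical names made up for this illustration;
they are not part of iText, and plain int arrays stand in for the PdfArray
that DocumentFont actually reads. It uses Arrays.copyOf instead of the manual
loop in the patch, which also stays in range if first were ever larger than
the current length of widths.

```java
import java.util.Arrays;

// Hypothetical helper, not part of iText: isolates the grow-then-copy step.
final class WidthsSketch {

    // Writes newWidths into widths starting at index first.
    // Returns widths itself when it is already large enough (modified in place),
    // otherwise a larger copy that keeps the existing entries.
    static int[] mergeWidths(int[] widths, int first, int[] newWidths) {
        if (first < 0) {
            throw new IllegalArgumentException("first must be non-negative");
        }
        int required = first + newWidths.length;
        int[] result = widths;
        if (result.length < required) {
            // Arrays.copyOf preserves existing entries and zero-fills the tail.
            result = Arrays.copyOf(widths, required);
        }
        System.arraycopy(newWidths, 0, result, first, newWidths.length);
        return result;
    }

    public static void main(String[] args) {
        int[] widths = new int[256];      // same initial size as in the report
        int[] incoming = new int[300];    // assumed oversized width list
        Arrays.fill(incoming, 500);
        int[] merged = mergeWidths(widths, 0, incoming);
        System.out.println(merged.length);
    }
}
```

Returning the array rather than assigning a field keeps the sketch free of
any iText state; in the patch above the equivalent step is the assignment
widths = swap.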

Enjoy,

Todd Chandler