Thanks for the reply, John!

Unfortunately we cannot supply the problem PDF as it's customer data.
However, please see the log lines below when calling PDDocument.load:



org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionary
WARNING: Invalid dictionary, found: '[' but expected: '/'
WARN |1214-122639 493|main|extractors.PDFTextExtractor|java.io.IOException: expected='R' actual='0' at offset 9983



As you can see, PDDocument.load now throws a java.io.IOException here,
whereas previously, with the force option set to true, load would not throw.
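
For illustration, here is a minimal sketch of the fallback this implies on the caller's side: wrap the load call, catch the IOException, and skip the file. The `Loader` interface and the `loadOrSkip` helper are hypothetical names, not PDFBox API. In real code the lambda would call PDDocument.load(stream); a simulated failure stands in for it here so the sketch runs without PDFBox on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

public class LenientLoad {
    /** Hypothetical stand-in for a call such as PDDocument.load(InputStream). */
    @FunctionalInterface
    interface Loader<T> {
        T load(InputStream in) throws IOException;
    }

    /** Returns the loaded value, or Optional.empty() if the loader throws IOException. */
    static <T> Optional<T> loadOrSkip(InputStream in, Loader<T> loader) {
        try {
            return Optional.ofNullable(loader.load(in));
        } catch (IOException e) {
            // Log and skip instead of propagating the parser failure.
            System.err.println("Skipping unreadable document: " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(new byte[0]);
        // Simulated parser failure, reusing the message from the log above.
        Optional<String> doc = LenientLoad.<String>loadOrSkip(in, s -> {
            throw new IOException("expected='R' actual='0' at offset 9983");
        });
        System.out.println(doc.isPresent()); // prints false
    }
}
```

This only skips the whole document on failure. It does not recover individual corrupt objects, which is what we are really after.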


Any idea what has caused the changed behaviour?


Kind regards,

Joe

On Fri, Dec 11, 2015 at 5:59 PM, John Hewson <[email protected]> wrote:

> Hi Joe,
>
> The force option in 1.8 only did one thing: it skipped invalid characters
> in strings.
> We have better handling for this in 2.0, so force is no longer necessary.
>
> Perhaps the problem you’re encountering is due to other changes in the
> parser in 2.0. If you could post a PDF publicly, we can take a look at it.
>
> — John
>
> > On 11 Dec 2015, at 07:02, Joe Ye <[email protected]> wrote:
> >
> > Hi,
> >
> >
> > With the latest version 2.0.0-RC2, I found that the force flag of the
> > method signature below (to skip corrupt PDF objects) no longer exists.
> > This broke some of our existing usage. Could you advise if there's an
> > alternative way to do it (i.e. skip corrupt objects)?
> >
> >
> >
> > public static PDDocument load(InputStream input, boolean force)
> >         throws IOException
> >
> > <http://pdfbox.apache.org/docs/1.8.10/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html>
> >
> >
> >
> > Many thanks,
> >
> > Joe
>
>
