If you show us the stack trace, we might be able to point you in the right direction or refer you to the relevant part of the PDF specification. Another option would be to open test.pdf and search for "IA" (case sensitive) to see if you can determine which object it's failing to read. If there are too many instances of "IA", you can try stepping through with a debugger.
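The search for "IA" can also be done programmatically. The following is only a rough sketch using the plain JDK: TokenSearch and findToken are made-up names, not part of PDFBox. It assumes the token shows up in the raw bytes of the file; if the content streams are compressed, a raw search may find nothing and the streams would have to be decoded first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of PDFBox): lists where a token occurs in a file.
final class TokenSearch {

    // Returns every byte offset at which the token occurs in data (case sensitive).
    static List<Integer> findToken(byte[] data, String token) {
        byte[] needle = token.getBytes(StandardCharsets.ISO_8859_1);
        List<Integer> offsets = new ArrayList<>();
        if (needle.length == 0) {
            return offsets;
        }
        for (int i = 0; i + needle.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name from the original report.
        String path = args.length > 0 ? args[0] : "test.pdf";
        byte[] data = Files.readAllBytes(Paths.get(path));
        for (int offset : findToken(data, "IA")) {
            // Print up to 20 bytes of context on each side, masking non-printable bytes.
            int start = Math.max(0, offset - 20);
            int end = Math.min(data.length, offset + 20);
            String context = new String(data, start, end - start, StandardCharsets.ISO_8859_1)
                    .replaceAll("[^\\x20-\\x7E]", ".");
            System.out.println(offset + ": " + context);
        }
    }
}
```

Each printed offset can then be compared against the object boundaries ("obj" / "endobj") around it to work out which object contains the unexpected token.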
Do other operations work with this file? For example, can you read the bookmarks, copy pages, etc.? Try using some of the programs from the utilities package. They might help determine whether the issue is with the file or not. If nothing works, then the problem is probably in a core document structure, such as the document outline.

If you know what program was used to create this PDF, that may help us duplicate the problem. For example, if text cannot be extracted from any PDF created with XYZ, then we can check whether those files are conforming PDFs; if they are, we can update the library. If they are non-conforming (i.e. they don't follow the PDF specification), we'll take a look and see what the best way to handle them would be.

----
Thanks,
Adam

From: "Robson Bortoleto" <[email protected]>
To: [email protected]
Date: 07/27/2010 06:29
Subject: Re: Problem with Text Extraction in pdfbox 1.2.1

Hi

Have you checked whether the file is protected (read only)? I have never used the PDFTextStripper, but many times I have seen different behaviour between files because of write protection.

----- original message --------
Subject: Problem with Text Extraction in pdfbox 1.2.1
Sent: Tue, 27 Jul 2010
From: Jorge Imar Canché Álvarez <[email protected]>

> Hi, I am having problems with pdfbox 1.2.1. I want to extract text from
> a pdf file but my program throws an exception. The exception message is:
>
> java.io.IOException: Error: Expected operator 'ID' actual='IA'
>
> My test class is:
>
> PDDocument doc = PDDocument.load("test.pdf");
> PDFTextStripper strip = new PDFTextStripper();
> String text = strip.getText(doc);
>
> If I replace test.pdf with another file it works, but I must extract
> the text of "test.pdf".
>
> Thanks for your help.
>
--- original message end ----

