**** Correction **** (see #3)
Here is some updated information on this problem I'm having.
... Alan
1. ReplaceString works OK on another PDF file, except when a text
string is broken up across PDF Operators.
2. I ran the PDF Debugger utility on both files. The problem file, a
30-page document, has four Dictionary(Pages) parts under
Root:Dictionary(Catalog), Pages:Dictionary(Pages), Kids:Array. The normal
file, a one-page document, has only one. The problem file does not show a
StructTreeRoot, whereas the normal file does. The problem file does not
have a PieceInfo:Dictionary, whereas the normal file does. The problem file
does not have a MarkInfo:Dictionary part, whereas the normal file does. Not
sure if any of this matters.
3. When I use PDFTextStripper on either file, it outputs the text I
see in the file just fine!
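To illustrate point 1, here is a minimal sketch of why a search can miss a string that is split across operands. It does not use the PDFBox API: the operand array of a TJ operator is modelled as a plain Java list in which String entries are the text pieces and Number entries are the spacing adjustments between them. The class name TjArrayJoin and the method joinPieces are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not PDFBox code: a TJ operand array is represented
// as a List<Object> holding text pieces (String) and spacing
// adjustments (Number).
class TjArrayJoin {

    // Concatenates only the text pieces, skipping the numeric adjustments.
    static String joinPieces(List<Object> tjArray) {
        StringBuilder joined = new StringBuilder();
        for (Object item : tjArray) {
            if (item instanceof String) {
                joined.append((String) item);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Modelled on a content stream fragment such as:
        //   [(Hel) -20 (lo Wor) 15 (ld)] TJ
        List<Object> tjArray =
                Arrays.<Object>asList("Hel", -20, "lo Wor", 15, "ld");

        // No single piece equals the search text, so a piece-by-piece
        // comparison finds nothing.
        System.out.println(tjArray.contains("Hello World")); // false

        // Joining the pieces first recovers the visible text.
        System.out.println(joinPieces(tjArray)); // Hello World
    }
}
```

A replace that works on the joined text would still have to decide how to write the result back into the individual pieces, which this sketch does not attempt.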
From: Alan Thomas [mailto:[email protected]]
Sent: Tuesday, December 28, 2010 9:34 PM
To: '[email protected]'
Cc: 'Alan Thomas'
Subject: ReplaceString example
I used the ReplaceString example that comes with PDFBox on a PDF file I
have. However, it does not find the text I want to replace.
In looking at the code and putting in some debugging statements, I found
out that the code was looking for a "PDFOperator" whose operation (from the
getOperation() method) is "Tj" or "TJ". However, my PDF file has neither.
Question: Where can I find the list of all the operators that display
strings in a PDF file? (Or is there an easier way to search and replace
strings?)
Thanks, Alan
PDFOperator op = (PDFOperator)next;
//Tj and TJ are the two operators that display
//strings in a PDF
if( op.getOperation().equals( "Tj" ) )
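
On the question of which operators display strings: the PDF specification lists four text-showing operators, namely Tj, TJ, ' (single quote) and " (double quote), so the comment in the excerpt above covers only two of them. The following is a small standalone sketch, not part of PDFBox, of a check that accepts all four. The class name TextShowingOperators and the method isTextShowing are hypothetical. In the ReplaceString loop it would presumably be called with the value returned by op.getOperation().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
class TextShowingOperators {

    // The four text-showing operators defined by the PDF specification:
    //   Tj  show a string
    //   TJ  show an array of strings with individual spacing adjustments
    //   '   move to the next line and show a string
    //   "   set word and character spacing, move to the next line,
    //       and show a string
    static final Set<String> TEXT_SHOWING =
            new HashSet<String>(Arrays.asList("Tj", "TJ", "'", "\""));

    // Returns true if the given operator name draws text.
    static boolean isTextShowing(String operatorName) {
        return TEXT_SHOWING.contains(operatorName);
    }

    public static void main(String[] args) {
        for (String name : new String[] { "Tj", "TJ", "'", "\"", "Tf", "Td" }) {
            System.out.println(name + " -> " + isTextShowing(name));
        }
    }
}
```

Note that the operands differ between these operators (the " operator, for instance, takes two numbers before the string), so code that edits the string has to pick the right operand for each one.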