**** Correction **** (see #3)

 

Here is some updated information on this problem I'm having. ... Alan

 

1.       ReplaceString works OK on another PDF file, except when a text
string is broken up across PDF Operators.

 

2.       I ran the PDF Debugger utility on both files.  The problem file, a
30-page document, has four Dictionary(Pages) parts under
Root:Dictionary(Catalog), Pages:Dictionary(Pages), Kids:Array.  The normal
file, a one-page document, has only one.  The problem file has no
StructTreeRoot, no PieceInfo:Dictionary, and no MarkInfo:Dictionary part,
whereas the normal file has all three.  I am not sure whether any of this
matters.

 

3.       When I use PdfTextStripper on either file, it outputs the text I
see in the file just fine!
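
Regarding point 1, here is a minimal sketch of why a per-operator search can miss text that is split across operators. It is plain Java and does not use the PDFBox API: the class FragmentSearch and its method names are made up for illustration, and each list entry simply stands for the string operand of one text-showing operator (or one string inside a TJ array). The idea is to join the fragments, search the joined text, and map the match back to the fragments it overlaps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not the PDFBox API: each list entry stands for the
// string operand of one text-showing operator.
class FragmentSearch {

    // Per-operator search: true only if one fragment alone contains the target.
    static boolean foundInSingleFragment(List<String> fragments, String target) {
        for (String fragment : fragments) {
            if (fragment.contains(target)) {
                return true;
            }
        }
        return false;
    }

    // Joins all fragments, finds the first match of target, and returns the
    // indexes of the fragments that match overlaps (empty list if no match).
    static List<Integer> fragmentsCoveringMatch(List<String> fragments, String target) {
        StringBuilder joined = new StringBuilder();
        for (String fragment : fragments) {
            joined.append(fragment);
        }
        List<Integer> hit = new ArrayList<>();
        int start = joined.indexOf(target);
        if (start < 0 || target.isEmpty()) {
            return hit;
        }
        int end = start + target.length(); // exclusive end of the match
        int pos = 0;                       // offset of the current fragment
        for (int i = 0; i < fragments.size(); i++) {
            int next = pos + fragments.get(i).length();
            if (next > start && pos < end) {
                hit.add(i);
            }
            pos = next;
        }
        return hit;
    }

    public static void main(String[] args) {
        List<String> fragments = Arrays.asList("Dear Mr. Sm", "ith, your acc", "ount");
        System.out.println(foundInSingleFragment(fragments, "Smith"));  // false
        System.out.println(fragmentsCoveringMatch(fragments, "Smith")); // [0, 1]
    }
}
```

Replacing the text would still mean rewriting the operands of every fragment the match touches, which this sketch does not attempt.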

 

From: Alan Thomas [mailto:[email protected]] 
Sent: Tuesday, December 28, 2010 9:34 PM
To: '[email protected]'
Cc: 'Alan Thomas'
Subject: ReplaceString example

 

    I used the ReplaceString example that comes with PDFBox on a PDF file I
have.  However, it does not find the text I want to replace.

 

   While reading the code and adding some debugging statements, I found
that it looks for a PDFOperator whose operation (from the getOperation()
method) is "Tj" or "TJ".  However, my PDF file has neither.

 

     Question:   Where can I find the list of all the operators that display
strings in a PDF file?  (Or is there an easier way to search and replace
strings?)

 

                              Thanks, Alan 

 

    PDFOperator op = (PDFOperator)next;
    //Tj and TJ are the two operators that display
    //strings in a PDF
    if( op.getOperation().equals( "Tj" ) )
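
On the question of which operators display strings: the PDF specification defines four text-showing operators, Tj, TJ, ' (single quote) and " (double quote), so the comment in the excerpt above is incomplete. The sketch below is plain Java, not part of PDFBox; the class name TextShowingOperators and the method showsText are made up for illustration, and the string passed in is assumed to be whatever getOperation() returns. A page whose own content stream contains none of these four may be drawing its text from somewhere else, for example a form XObject invoked with Do, though that is only one possible explanation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
class TextShowingOperators {

    // The four text-showing operators defined by the PDF specification:
    //   Tj  show a string
    //   TJ  show an array of strings with individual glyph positioning
    //   '   move to the next line and show a string
    //   "   set word and character spacing, move to the next line, show a string
    static final Set<String> NAMES =
            new HashSet<>(Arrays.asList("Tj", "TJ", "'", "\""));

    // True if the given operator name is one of the text-showing operators.
    static boolean showsText(String operatorName) {
        return NAMES.contains(operatorName);
    }

    public static void main(String[] args) {
        System.out.println(showsText("Tj")); // true
        System.out.println(showsText("Do")); // false
    }
}
```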

 

 

 
