So replacing text in any PDF is going to be very complicated, if I want
to keep the formatting the same.

 

I put some print statements in ReplaceString.java.  I never see the text
I am looking for (even in pieces)!

However, PDFTextStripper does produce it.  This seems bizarre!

 

    This document has eight PDF streams in it.  Could this be the reason?  

 

                                                    Thanks, Alan 

 


Subject: Re: ReplaceString example <http://markmail.org/message/ygcdrv2o4zg5iqq2>
From: Andreas Lehmkuehler
Date: Dec 30, 2010 2:18:44 am
List: org.apache.pdfbox.users

Hi,

 

On 29.12.2010 03:33, Alan Thomas wrote:

> I used the ReplaceString example that comes with PDFBox on a PDF file I
> have.  However, it does not find the text I want to replace.
>
> In looking at the code and putting in some debugging statements, I found
> out that the code was looking for a "PDFOperator" operation

Correct.

> (from the getOperation() method) of "Tj" and "TJ".  However, my PDF file has
> neither.
>
> Question:   Where can I find the list of all the operators that display
> strings in a PDF file?  (Or is there an easier way to search and replace
> strings?)

Text content may be defined in different ways within PDFs. In most cases the
text is split into several chunks. Each chunk often consists of one or more
characters, but not necessarily whole words or lines of text. Consequently,
one has to combine all these text chunks to identify the given text. The
PDFTextStripper class [1] works like that.
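
To illustrate why a search has to work on the combined chunks, here is a
minimal sketch in plain Java. It does not use the PDFBox API; the class
ChunkSearch and the method chunksContaining are hypothetical names, and the
chunks are assumed to be already decoded into Java strings in reading order
(which a real PDF does not guarantee).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of PDFBox): locates a phrase that spans several text chunks. */
class ChunkSearch {

    /**
     * Returns the indices of the chunks that hold any part of the first
     * occurrence of needle, or an empty list if needle does not occur.
     */
    static List<Integer> chunksContaining(List<String> chunks, String needle) {
        List<Integer> hits = new ArrayList<>();
        if (needle.isEmpty()) {
            return hits;
        }
        // Join all chunks and remember where each one starts in the joined text.
        StringBuilder joined = new StringBuilder();
        int[] starts = new int[chunks.size()];
        for (int i = 0; i < chunks.size(); i++) {
            starts[i] = joined.length();
            joined.append(chunks.get(i));
        }
        int from = joined.indexOf(needle);
        if (from < 0) {
            return hits;
        }
        int to = from + needle.length(); // exclusive end of the match
        // A chunk is affected if its character range overlaps the match range.
        for (int i = 0; i < chunks.size(); i++) {
            int chunkStart = starts[i];
            int chunkEnd = chunkStart + chunks.get(i).length();
            if (chunkStart < to && chunkEnd > from) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("He", "llo W", "or", "ld", "!");
        System.out.println(chunksContaining(chunks, "World"));
    }
}
```

With the chunks "He", "llo W", "or", "ld", "!" a search for "World" reports
chunks 1 to 3, so a replacement would have to rewrite all three of them, not
a single string operand.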

 

Have a look at the PDF reference at [2], section 9.3 "Text State Parameters
and Operators", for further information.
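
For reference, the PDF specification defines four text-showing operators:
Tj, TJ, ' (single quote) and " (double quote). The sketch below, again plain
Java with hypothetical names (TextOperators, showsText, joinTjArray) rather
than PDFBox classes, checks an operator name against that set and shows how
the string pieces of a TJ array would be concatenated. It assumes the
operands have already been parsed into Java Strings and Numbers, and it
ignores font encodings.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not PDFBox API) for the text-showing operators of a content stream. */
class TextOperators {

    // The four text-showing operators defined by the PDF specification.
    static final Set<String> TEXT_SHOWING = Set.of("Tj", "TJ", "'", "\"");

    /** True if the given operator name paints text on the page. */
    static boolean showsText(String operator) {
        return TEXT_SHOWING.contains(operator);
    }

    /**
     * Concatenates the string elements of a TJ operand array.
     * Numeric elements are position adjustments and are skipped.
     */
    static String joinTjArray(List<?> operands) {
        StringBuilder text = new StringBuilder();
        for (Object operand : operands) {
            if (operand instanceof String) {
                text.append((String) operand);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Models the content stream fragment: [(Hel) -20 (lo)] TJ
        System.out.println(joinTjArray(List.of("Hel", -20, "lo")));
    }
}
```

A filter that only looks at Tj and TJ would miss text shown with the two
quote operators, and even a single TJ call can carry a word in several
pieces.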

 

BR

Andreas Lehmkühler

[1] http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/util/PDFTextStripper.java

[2] http://www.adobe.com/devnet/pdf/pdf_reference.html

 
