Hello,

I need to do string replacement in a certain region of a PDF page. The
pieces needed for this are already present in PDFBox:
- org.apache.pdfbox.examples.pdmodel.ReplaceString
- org.apache.pdfbox.util.PDFTextStripperByArea

I am confident this can be done, but I am struggling to do it in an
elegant way, that is, one that reuses the existing classes as much as
possible.

The first step for a replacement is to get access to the COSString.
This can be done by creating a variant of
org.apache.pdfbox.util.operator.ShowText.

The second step is to write the changed tokens back to the page stream,
but I cannot access them: the token iterator lives inside
processSubStream of PDFStreamEngine. Overriding this method is not an
option because it uses private attributes that have no getters. I then
started writing my own version of PDFStreamEngine, but that also
required my own version of OperatorProcessor, and so on.
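
To make the goal concrete, here is a simplified sketch of the behaviour
I am after. It deliberately does not use PDFBox: TextToken and OpToken
are made-up stand-ins for COSString and PDFOperator, and the x/y
position stored on each text token is an assumption standing in for
whatever the stream engine computes while processing the page. The
region is a java.awt.geom.Rectangle2D, the same type
PDFTextStripperByArea accepts.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a COSString operand, plus the position
// at which the text would be shown (assumed to be known already).
final class TextToken {
    String text;
    final double x;
    final double y;

    TextToken(String text, double x, double y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in for a PDFOperator such as "Tj".
final class OpToken {
    final String name;

    OpToken(String name) {
        this.name = name;
    }
}

final class RegionReplaceSketch {

    // Walks the token list and, for every "Tj" operator whose string
    // operand lies inside the region, replaces target with replacement.
    // Tokens are modified in place, so the same list could afterwards
    // be written back to the page stream. Returns the number of
    // operands that were changed.
    static int replaceInRegion(List<Object> tokens, Rectangle2D region,
                               String target, String replacement) {
        int changed = 0;
        for (int i = 1; i < tokens.size(); i++) {
            Object token = tokens.get(i);
            if (!(token instanceof OpToken)
                    || !"Tj".equals(((OpToken) token).name)) {
                continue;
            }
            // The operand of Tj is the token directly before it.
            Object operand = tokens.get(i - 1);
            if (!(operand instanceof TextToken)) {
                continue;
            }
            TextToken text = (TextToken) operand;
            if (region.contains(text.x, text.y)
                    && text.text.contains(target)) {
                text.text = text.text.replace(target, replacement);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Object> tokens = new ArrayList<>();
        tokens.add(new TextToken("Hello World", 50, 700)); // inside region
        tokens.add(new OpToken("Tj"));
        tokens.add(new TextToken("Hello World", 50, 100)); // outside region
        tokens.add(new OpToken("Tj"));

        Rectangle2D region = new Rectangle2D.Double(0, 650, 200, 100);
        int changed = replaceInRegion(tokens, region, "World", "PDFBox");
        System.out.println(changed + " operand(s) changed");
    }
}
```

The part I cannot reproduce with the existing classes is exactly what
this sketch takes for granted: having the token list and the text
position available at the same time.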

I am not familiar with the PDFBox code, so maybe someone can give me a
tip on how to do this in an elegant way (or tell me that it is not
possible with the current PDFStreamEngine).

cheers,
dirk
