Re: Re: Deleting images from a PDF

Andreas Lehmkühler Tue, 05 May 2009 23:12:51 -0700

> I referred to the PDF Reference document you mentioned earlier. According
> to that, the following represents one of the inline images that I wish to
> get rid of. I am using PDFStreamParser to read the tokens and skip the
> following tokens while retaining the others. I thought this would remove
> the inline images, but the entire content on the page is disappearing.
I guess it would be sufficient to skip the lines with "/R15 Do".


> q 79.56 0 0 27 18 687 cm
> BI
> /CS/R14
> /W 106
> /H 36
> /BPC 8
> /F/Fl
> ID <.....Some Stream.....>
> EI Q
> q 90 0 0 27 97.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 187.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 277.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 367.56 687 cm
> /R15 Do
> Q
> 
> Some questions that I have:
> 1. Is PDFStreamParser the one that can deal with this problem?
Whatever you do to get rid of the images, it'll be a dirty hack.
If you hack the parser, you have to take care not to skip too much. Perhaps
it would be a better idea to hack the operator class itself.
Have a look at "org.apache.pdfbox.util.operator.pagedrawer.Invoke". It should
be possible to just skip the image from a specific stream, e.g. R15.
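
A minimal sketch of the token-skipping idea, assuming the content stream has
already been split into plain string tokens. This is not the PDFBox API: the
real PDFStreamParser returns its own token objects, and the class name
DoSkipper, the method skipInvocations and the second name R16 are made up for
illustration. It drops only the "/R15 Do" pair and keeps everything else, so
the surrounding q ... cm ... Q stays balanced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
class DoSkipper {

    // Returns a copy of the tokens without any "/<name> Do" pair
    // for the given XObject name. All other tokens are kept in order.
    static List<String> skipInvocations(List<String> tokens, String xobjectName) {
        String operand = "/" + xobjectName;
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            boolean isTargetPair = i + 1 < tokens.size()
                    && tokens.get(i).equals(operand)
                    && tokens.get(i + 1).equals("Do");
            if (isTargetPair) {
                i++; // also skip the "Do" operator that follows the name
                continue;
            }
            kept.add(tokens.get(i));
        }
        return kept;
    }

    public static void main(String[] args) {
        // R15 comes from the quoted stream; R16 is an invented second image.
        List<String> tokens = Arrays.asList(
                "q", "90", "0", "0", "27", "97.56", "687", "cm", "/R15", "Do", "Q",
                "q", "90", "0", "0", "27", "187.56", "687", "cm", "/R16", "Do", "Q");
        System.out.println(String.join(" ", skipInvocations(tokens, "R15")));
    }
}
```

Skipping only the operand and the operator, rather than the whole q ... Q
group, is the conservative choice: an empty q ... cm ... Q group paints
nothing, and the graphics state save/restore stays paired.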

> 2. Does "q...cm...BI....EI...Q" and "q...cm...R15...Do" represent the two
> sections responsible for drawing inline images?
As I already mentioned above, the Do operator paints the image.
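
To see which parts of a stream paint images at all, something like the
following sketch might help. In the quoted stream, the BI ... ID ... EI block
is an inline image, while each "/R15 Do" invokes an external object by name.
The sketch assumes the stream is already available as plain string tokens,
with the binary data between ID and EI held as one opaque token. The class
ImagePaintScan and its method describe are hypothetical, not PDFBox API. Note
that Do can also invoke a form XObject, so the name still has to be looked up
in the page resources to confirm that it is an image.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
class ImagePaintScan {

    // Lists every spot in the token sequence that may paint an image:
    // "BI" starts an inline image, "Do" invokes the XObject named just before it.
    static List<String> describe(List<String> tokens) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (token.equals("BI")) {
                found.add("inline image at token " + i);
            } else if (token.equals("Do") && i > 0) {
                found.add("XObject " + tokens.get(i - 1) + " at token " + i);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // "<data>" stands in for the image bytes between ID and EI.
        List<String> tokens = Arrays.asList(
                "q", "BI", "/W", "106", "/H", "36", "ID", "<data>", "EI", "Q",
                "q", "/R15", "Do", "Q");
        for (String line : describe(tokens)) {
            System.out.println(line);
        }
    }
}
```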

HTH,
Andreas Lehmkühler

> Regards,
> Shishir.
> 
> 2009/5/5 Andreas Lehmkühler <[email protected]>
> 
> > > I wish to delete specific images from a PDF. How can this be done?
> > > I am aware of the operators like BI, EI, etc. But I don't know which
> > > sections I should remove from the stream to delete specific images.
> > In many cases images are represented by an image XObject and they will be
> > painted by the Do operator (PDF-1.7 reference chapters 4.7 and 4.8).
> >
> > Andreas Lehmkühler
> >
> 

--- End of original message ---
