> I referred to the PDF Reference document you mentioned earlier. According to
> it, the following represents one of the inline images that I wish to get rid
> of. I am using PDFStreamParser to read the tokens and skip the following
> tokens while retaining the other tokens. I thought this would remove the
> inline images, but the entire content of the page is disappearing.

I guess it would be sufficient to skip the lines with "/R15 Do".

> q 79.56 0 0 27 18 687 cm
> BI
> /CS/R14
> /W 106
> /H 36
> /BPC 8
> /F/Fl
> ID <.....Some Stream.....>
> EI Q
> q 90 0 0 27 97.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 187.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 277.56 687 cm
> /R15 Do
> Q
> q 90 0 0 27 367.56 687 cm
> /R15 Do
> Q
>
> Some questions that I have:
> 1. Is PDFStreamParser the one that can deal with this problem?

Whatever you do to get rid of the images, it will be a dirty hack. If you hack
the parser, you have to take care not to skip too much. Perhaps it would be a
better idea to hack the operator class itself. Have a look at
"org.apache.pdfbox.util.operator.pagedrawer.Invoke". It should be possible to
skip just the image from a specific stream, e.g. R15.

> 2. Do "q...cm...BI....EI...Q" and "q...cm...R15...Do" represent the two
> sections responsible for drawing inline images?

As I already mentioned above, the Do operator paints the image.

HTH,
Andreas Lehmkühler

> Regards,
> Shishir.
>
> 2009/5/5 Andreas Lehmkühler <[email protected]>
>
> > > I wish to delete specific images from a PDF. How can this be done?
> > > I am aware of the operators like BI, EI, etc. But I don't know which
> > > sections I should remove from the stream to delete specific images.
> > In many cases images are represented by an image XObject and they will be
> > painted by the Do operator (PDF 1.7 Reference, chapters 4.7 and 4.8).
> >
> > Andreas Lehmkühler
>
> --- End of original message ---
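
For illustration, the token-skipping idea from this thread could look roughly like the sketch below. It is plain Java and deliberately does not use the PDFBox API: the content stream is modelled as a list of whitespace-separated string tokens, which is an assumption that only holds for a toy stream (real inline image data between ID and EI is binary). The class name ContentStreamFilter and the method dropImages are hypothetical. The sketch drops everything from BI to EI and every "/Name Do" pair for the given XObject names, while leaving the surrounding q ... cm ... Q tokens untouched so that q and Q stay balanced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PDFBox: filters a toy token list.
class ContentStreamFilter {

    // Returns a copy of the tokens without inline images (BI ... EI) and
    // without "/Name Do" invocations for the given XObject names.
    static List<String> dropImages(List<String> tokens, Set<String> xobjectNames) {
        List<String> out = new ArrayList<>();
        boolean insideInlineImage = false;
        for (String token : tokens) {
            if (insideInlineImage) {
                // Skip everything up to and including the closing EI.
                if (token.equals("EI")) {
                    insideInlineImage = false;
                }
                continue;
            }
            if (token.equals("BI")) {
                insideInlineImage = true;
                continue;
            }
            // "Do" takes the preceding name as its operand; drop both tokens
            // when that name is one of the XObjects to remove.
            if (token.equals("Do") && !out.isEmpty()
                    && xobjectNames.contains(out.get(out.size() - 1))) {
                out.remove(out.size() - 1);
                continue;
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        // Simplified version of the stream quoted in this thread.
        String stream = "q 79.56 0 0 27 18 687 cm BI /CS /R14 /W 106 /H 36 "
                + "/BPC 8 /F /Fl ID data EI Q "
                + "q 90 0 0 27 97.56 687 cm /R15 Do Q";
        List<String> tokens = Arrays.asList(stream.split("\\s+"));
        List<String> filtered = dropImages(tokens, Collections.singleton("/R15"));
        System.out.println(String.join(" ", filtered));
    }
}
```

With a real PDF the tokens would come from the parser as typed objects rather than strings, so the comparisons above would have to be adapted accordingly.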
