On 6/08/2011 19:11, Lukas Johansson wrote:
Hello,
I'm currently evaluating iTextPDF for a project and now I'm stuck with
a problem.
Looking at the things you've tried, you've read the book well.
Now you need to know some advanced stuff.
My use case requires me to find the left image (PDF format) in the
document and do some stuff with it and then do the same thing with the
right image. The PDF document is created by InDesign and the images
do not have any special ID.
I don't know if you can add extra IDs to stream dictionaries containing
images.
Maybe, maybe not. So let's look at other options.
At first I just thought I would traverse all images and then compare
their filenames to see if I found the right one, but as I understand
the filenames are not preserved when adding them to a PDF document.
No, file names are not preserved.
Moreover, some image types are converted to another image type before
they are added to a PDF.
For instance: a PNG will be converted to another type of image (which
type? That depends on the tool used to create the PDF).
I then tried to use the image's metadata (which I can see as XML if I
look at the file in a text editor) by adding a title attribute and
checking against that, but I couldn't find how to get hold of this
metadata from a PdfObject/PdfImage.
If the image is converted to another image type, chances are
that the XML has disappeared.
We'd have to see a PDF to check whether the XML is still there.
If it is, you need to get the PRStream of the Image and get the bytes of
that stream.
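Here's a rough sketch of what you could do with those bytes once you have them. It assumes the XMP packet sits uncompressed inside the stream bytes (you said you can see the XML in a text editor, so that may well be the case) and that the title is stored in dc:title. The class and method names are made up for this example. In iText the byte array would come from something like PdfReader.getStreamBytesRaw(stream); in the sketch it is faked so that the code runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmpTitleFinder {

    // First rdf:li inside dc:title, which is where XMP usually keeps the title.
    private static final Pattern TITLE = Pattern.compile(
            "<dc:title>.*?<rdf:li[^>]*>(.*?)</rdf:li>", Pattern.DOTALL);

    /** Returns the XMP packet embedded in the bytes, or null if none is found. */
    public static String findXmpPacket(byte[] streamBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so char offsets equal byte offsets.
        String latin1 = new String(streamBytes, StandardCharsets.ISO_8859_1);
        String endTag = "</x:xmpmeta>";
        int start = latin1.indexOf("<x:xmpmeta");
        int end = start < 0 ? -1 : latin1.indexOf(endTag, start);
        if (end < 0) {
            return null;
        }
        // Assumption: the packet is UTF-8 encoded, so only that slice is decoded as UTF-8.
        int length = end + endTag.length() - start;
        return new String(streamBytes, start, length, StandardCharsets.UTF_8);
    }

    /** Returns the dc:title value of the embedded XMP packet, or null if there is none. */
    public static String findTitle(byte[] streamBytes) {
        String xmp = findXmpPacket(streamBytes);
        if (xmp == null) {
            return null;
        }
        Matcher m = TITLE.matcher(xmp);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Faked stream bytes: some binary-looking data around a tiny XMP packet.
        String fake = "\u00ff\u00d8binary"
                + "<x:xmpmeta xmlns:x='adobe:ns:meta/'><dc:title><rdf:Alt>"
                + "<rdf:li xml:lang='x-default'>left</rdf:li>"
                + "</rdf:Alt></dc:title></x:xmpmeta>"
                + "more binary";
        System.out.println(findTitle(fake.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

If the stream is compressed, or the metadata lives in a separate stream, this search will find nothing, so treat a null result as "not found this way" and not as "no metadata".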
I then placed the images in their own layers called left and right and
tried to either traverse the layers to find each layer's image or
traverse all images and check what layer they belong to.
Depending on the tool that creates the PDF, the info about the layers
can be:
[1] part of the content stream of the page
[2] an entry of the stream object of the image
If [1] is the case, then you'll have a lot of work to parse the content
stream.
If [2] is the case, you can use stream.get(PdfName.OC); to find a
reference to the Optional Content dictionary.
Once you have the Optional Content dictionary, you know what layer the
image belongs to.
Note that [2] is preferred over [1] when creating PDFs with images that
belong to a specific layer.
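To give an idea of what you'd do in case [2] after following the OC reference: the entry points either to a layer dictionary (Type OCG, with a Name entry) or to a membership dictionary (Type OCMD, whose OCGs entry holds one layer or an array of layers). The sketch below only models that lookup with plain Java maps standing in for PDF dictionaries, so it runs without iText. The class and method names are made up, and with iText you would walk PdfDictionary objects instead and resolve indirect references along the way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LayerNames {

    /** Returns the names of the layers reachable from the image dictionary's OC entry. */
    public static List<String> layerNames(Map<String, Object> imageDict) {
        List<String> names = new ArrayList<>();
        collect(imageDict.get("OC"), names);
        return names;
    }

    @SuppressWarnings("unchecked")
    private static void collect(Object oc, List<String> names) {
        if (oc instanceof List) {
            // An OCGs entry may be an array of layer dictionaries.
            for (Object item : (List<Object>) oc) {
                collect(item, names);
            }
            return;
        }
        if (!(oc instanceof Map)) {
            return; // no OC entry, or something this sketch does not handle
        }
        Map<String, Object> dict = (Map<String, Object>) oc;
        Object type = dict.get("Type");
        if ("OCG".equals(type)) {
            // A layer (optional content group) carries its name directly.
            names.add(String.valueOf(dict.get("Name")));
        } else if ("OCMD".equals(type)) {
            // A membership dictionary refers to one or more layers.
            collect(dict.get("OCGs"), names);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> layer = new HashMap<>();
        layer.put("Type", "OCG");
        layer.put("Name", "left");

        Map<String, Object> image = new HashMap<>();
        image.put("Subtype", "Image");
        image.put("OC", layer);

        System.out.println(layerNames(image));
    }
}
```

An empty result means the image isn't tied to a layer through its stream dictionary, which would point to case [1].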
* Found the Image element in two ways
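1. int n = reader.getXrefSize();
PdfObject object;
PRStream stream;
for (int i = 0; i < n; i++) {
    object = reader.getPdfObject(i);
    // Skip free xref entries (null) and objects that are not streams;
    // casting those to PRStream would fail.
    if (object == null || !object.isStream()) {
        continue;
    }
    stream = (PRStream) object;
    // An image XObject has /Subtype /Image in its stream dictionary.
    if (PdfName.IMAGE.equals(stream.get(PdfName.SUBTYPE))) {
        PdfImageObject image = new PdfImageObject(stream);
    }
}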
That's the "dirty" way: you loop over ALL the objects in the PDF.
This way, you may even find images that aren't even shown on any page in
the PDF.
2. Using PdfReaderContentParser as in the ExtractImages example.
I think this is the better way.
You're missing only one little piece of information (not mentioned in
the book).
See
http://api.itextpdf.com/itext/com/itextpdf/text/pdf/parser/ImageRenderInfo.html
Currently, you retrieve the PdfImageObject with the getImage() method.
To find out its location, you should also use the getImageCTM() method.
CTM is short for Current Transformation Matrix.
Once you have a Matrix object, you can retrieve the X and Y translation:
float x = matrix.get(Matrix.I31);
float y = matrix.get(Matrix.I32);
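Once you have those numbers for every image, telling left from right is just a comparison. Below is a small sketch of that step in plain Java, so it runs without iText. It assumes you copy the six meaningful CTM entries (a, b, c, d, e, f, where e and f are the I31 and I32 values above) out of the Matrix for each image. The class and method names are made up for this example. It uses the left edge of the transformed unit square and not the raw translation, because the two differ when an image is flipped or rotated.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class LeftRightImages {

    /** Where one image ends up on the page, derived from its CTM entries. */
    public static final class Placement {
        public final String id;
        public final float left;   // smallest x of the transformed unit square
        public final float bottom; // smallest y of the transformed unit square

        public Placement(String id, float a, float b, float c, float d, float e, float f) {
            this.id = id;
            // The image fills the unit square, whose corners map to x = e, e+a, e+c, e+a+c.
            this.left = e + Math.min(0, a) + Math.min(0, c);
            // Likewise the corner y values are f, f+b, f+d, f+b+d.
            this.bottom = f + Math.min(0, b) + Math.min(0, d);
        }
    }

    private static final Comparator<Placement> BY_LEFT_EDGE =
            (p, q) -> Float.compare(p.left, q.left);

    /** Returns the image whose left edge is closest to the left side of the page. */
    public static Placement leftmost(List<Placement> placements) {
        return Collections.min(placements, BY_LEFT_EDGE);
    }

    /** Returns the image whose left edge is furthest to the right. */
    public static Placement rightmost(List<Placement> placements) {
        return Collections.max(placements, BY_LEFT_EDGE);
    }

    public static void main(String[] args) {
        // Invented values: two 200 x 150 images at the same height on the page.
        List<Placement> images = Arrays.asList(
                new Placement("first", 200f, 0f, 0f, 150f, 320f, 400f),
                new Placement("second", 200f, 0f, 0f, 150f, 40f, 400f));
        System.out.println("left:  " + leftmost(images).id);
        System.out.println("right: " + rightmost(images).id);
    }
}
```

If the two images can also sit above each other, compare the bottom values as well before deciding.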
However, I haven't been able to find any reference to the layer in
the PdfImageObjects that I retrieve.
The reference to the layer should be imgObject.get(PdfName.OC);
I would really appreciate any pointers how to proceed with this.
I hope the above answers help you on the way.
Putting the images inside a layer is a good idea,
but please try the getImageCTM() first and let us know if it works as
expected.
Feedback is always appreciated.