BTW, can we assume that you really have Form XObjects in your PDFs? That question is still open.
According to the PDFTron code, it never enters a Form XObject; it goes straight into a Path object.
For example:

I actually managed to get an answer from Adobe about how they export images. First, I should say that Acrobat has an "Export All Images" function that exports only raster images. What I am using is the "Save As HTML" function, which exports all images, including the vector graphics, but converts them to bitmaps. According to Leonard Rosenthol (http://www.acrobatusers.com/blogs/leonardr), when Acrobat comes across path data that is neither tagged nor inside a Form XObject, it groups the data into "areas" according to some algorithm.
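
To make the "areas" idea concrete, here is a minimal Java sketch of one way such a grouping could work. It is not Adobe's algorithm, which was not described; it only assumes that each path has already been reduced to a bounding box and that boxes lying within some gap of each other belong to the same area. The class name AreaGrouping, the method names, and the gap parameter are all hypothetical, and only java.awt.geom from the JDK is used.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: merge path bounding boxes into "areas".
// This is NOT Acrobat's algorithm, just one plausible heuristic.
class AreaGrouping {

    // Repeatedly merges any two boxes that lie within `gap` units of
    // each other until no further merge is possible.
    static List<Rectangle2D> group(List<Rectangle2D> boxes, double gap) {
        List<Rectangle2D> areas = new ArrayList<>();
        for (Rectangle2D b : boxes) {
            areas.add((Rectangle2D) b.clone()); // leave the input untouched
        }
        boolean merged = true;
        while (merged) {
            merged = false;
            outer:
            for (int i = 0; i < areas.size(); i++) {
                for (int j = i + 1; j < areas.size(); j++) {
                    if (near(areas.get(i), areas.get(j), gap)) {
                        // Replace the pair with its union and rescan,
                        // since the larger box may now reach other boxes.
                        Rectangle2D union = areas.get(i).createUnion(areas.get(j));
                        areas.remove(j);
                        areas.set(i, union);
                        merged = true;
                        break outer;
                    }
                }
            }
        }
        return areas;
    }

    // True if the boxes overlap or are at most `gap` apart on both axes.
    // Written by hand so that zero-width or zero-height boxes (straight
    // lines) are handled; Rectangle2D.intersects treats those as empty.
    static boolean near(Rectangle2D a, Rectangle2D b, double gap) {
        return a.getMinX() - gap <= b.getMaxX()
            && b.getMinX() - gap <= a.getMaxX()
            && a.getMinY() - gap <= b.getMaxY()
            && b.getMinY() - gap <= a.getMaxY();
    }
}
```

Calling AreaGrouping.group(boxes, 5.0) on the bounding boxes of all the paths on a page would return one rectangle per "area", and each rectangle could then be rendered to its own bitmap. The gap value is a guess that would need tuning per document.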

Seeing the final images separated like this is what made me think that my vector graphics were somehow grouped inside the content stream, and that my failed attempts to retrieve them that way were not the fault of PDFBox or any other Java PDF library.

Still, your help is greatly appreciated,
Graeme
