Hello. I am using PDFBox, and one feature I need is the extraction of images
together with their positions on the page. For that purpose, I have a class
that extends OperatorProcessor and is registered for the "Do" operator. For
most images this works fine, positions included. In some cases, however, the
created images are empty and I get the warnings below.
Extracting the images on one page:
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
createColorModel
INFO: About to create ColorModel for DeviceCMYK{ }
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap
getRGBImage
INFO: ColorModel: ColorModel: #pixelBits = 32 numComponents = 4 color space =
org.apache.pdfbox.pdmodel.graphics.color.colorspacec...@11a64ed transparency =
1 has alpha = false isAlphaPre = false
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
createColorModel
INFO: About to create ColorModel for DeviceCMYK{ }
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap
getRGBImage
INFO: ColorModel: ColorModel: #pixelBits = 32 numComponents = 4 color space =
org.apache.pdfbox.pdmodel.graphics.color.colorspacec...@11a64ed transparency =
1 has alpha = false isAlphaPre = false
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
Another page:
31.05.2010 09:32:19 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: ri
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap
getRGBImage
INFO: ColorModel: ColorModel: #pixelBits = 32 numComponents = 4 color space =
org.apache.pdfbox.pdmodel.graphics.color.colorspacec...@11a64ed transparency =
1 has alpha = false isAlphaPre = false
31.05.2010 09:32:19 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
As I said, the images created from those pages are empty. There is one image
that does get extracted, but its colors are inverted. Why? Here are the warnings:
31.05.2010 09:44:32 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
31.05.2010 09:44:32 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
One final problem: several extracted images came out brighter than the same
images extracted manually with Adobe Acrobat Pro. Here are the warnings for
two images on a page with that problem:
31.05.2010 09:47:28 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: ri
31.05.2010 09:47:28 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: BDC
31.05.2010 09:47:28 org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap
getRGBImage
INFO: ColorModel: IndexColorModel: #pixelBits = 8 numComponents = 3 color space
= java.awt.color.icc_colorsp...@530cf2 transparency = 1 transIndex = -1 has
alpha = false isAlphaPre = false
31.05.2010 09:47:28 org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap
getRGBImage
INFO: ColorModel: IndexColorModel: #pixelBits = 8 numComponents = 3 color space
= java.awt.color.icc_colorsp...@530cf2 transparency = 1 transIndex = -1 has
alpha = false isAlphaPre = false
31.05.2010 09:47:28 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
31.05.2010 09:47:28 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
31.05.2010 09:47:28 org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EMC
31.05.2010 09:47:28 org.apache.pdfbox.pdmodel.graphics.color.PDSeparation
getColorValues
WARNUNG: Unsupported tint transformation type:
org.apache.pdfbox.pdmodel.common.function.PDFunctionType0 in
org.apache.pdfbox.pdmodel.graphics.color.PDSeparation.getColorValues() using
color black instead.
Strangely, when I convert the pages to images using PDFToImage, some of the
images on those pages are rendered correctly, while others are missing or
appear smaller and repeated four times next to each other.
Unfortunately I cannot upload the PDF, but maybe I can find one that is freely
available and has the same problems.
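
For context on the position part: the "Do" operator paints an image into the
unit square of user space, so its placement on the page follows from the
current transformation matrix (CTM) in effect at that point. The sketch below
is only an illustration in plain Java, not the handler described above. The
class name ImagePlacement and the method boundingBox are hypothetical, it uses
java.awt.geom.AffineTransform rather than any PDFBox type, and it ignores page
rotation and the crop box.

```java
import java.awt.geom.AffineTransform;
import java.util.Arrays;

public class ImagePlacement {

    /**
     * Returns {x, y, width, height} of the axis-aligned bounding box, in user
     * space units, of an image painted under the CTM [a b c d e f].
     * An image always fills the unit square (0,0)-(1,1) before the CTM is
     * applied, so transforming its four corners gives the footprint.
     */
    public static double[] boundingBox(double a, double b, double c,
                                       double d, double e, double f) {
        // AffineTransform takes (m00, m10, m01, m11, m02, m12), which is the
        // same order as the PDF matrix entries a b c d e f.
        AffineTransform ctm = new AffineTransform(a, b, c, d, e, f);

        // Corners of the unit square as x,y pairs.
        double[] corners = {0, 0, 1, 0, 1, 1, 0, 1};
        ctm.transform(corners, 0, corners, 0, 4);

        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < 4; i++) {
            minX = Math.min(minX, corners[2 * i]);
            maxX = Math.max(maxX, corners[2 * i]);
            minY = Math.min(minY, corners[2 * i + 1]);
            maxY = Math.max(maxY, corners[2 * i + 1]);
        }
        return new double[] {minX, minY, maxX - minX, maxY - minY};
    }

    public static void main(String[] args) {
        // Unrotated image, 200 x 100 units, lower-left corner at (72, 500).
        double[] box = boundingBox(200, 0, 0, 100, 72, 500);
        System.out.println(Arrays.toString(box)); // prints [72.0, 500.0, 200.0, 100.0]
    }
}
```

For an unrotated image the result reduces to x = e, y = f, width = a and
height = d. Taking the bounding box of all four corners also covers rotated or
mirrored placements.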
Please enlighten me :)
Sebastian Freuck