Can't get images from a PDF
---------------------------
Key: PDFBOX-1075
URL: https://issues.apache.org/jira/browse/PDFBOX-1075
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 1.6.0, 1.7.0
Reporter: Antoni Mylka
Fix For: 1.4.0
This is a regression. In 1.4.0 I was able to extract images from a PDF file. In
1.6 and the current trunk I get exceptions:
SEVERE: java.lang.IllegalArgumentException: Raster BytePackedRaster: width =
1000 height = 32 #channels 1 xOff = 0 yOff = 0 is incompatible with ColorModel
IndexColorModel: #pixelBits = 1 numComponents = 3 color space =
java.awt.color.ICC_ColorSpace@1050169 transparency = 1 transIndex = -1 has
alpha = false isAlphaPre = false
java.lang.IllegalArgumentException: Raster BytePackedRaster: width = 1000
height = 32 #channels 1 xOff = 0 yOff = 0 is incompatible with ColorModel
IndexColorModel: #pixelBits = 1 numComponents = 3 color space =
java.awt.color.ICC_ColorSpace@1050169 transparency = 1 transIndex = -1 has
alpha = false isAlphaPre = false
at java.awt.image.BufferedImage.<init>(BufferedImage.java:611)
at
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap.getRGBImage(PDPixelMap.java:252)
at
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap.write2OutputStream(PDPixelMap.java:289)
at
org.apache.pdfbox.TestGovdocs148902.test148902(TestGovdocs148902.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at
org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
at
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
http://domex.nps.edu/corp/files/govdocs1/148/148902.pdf
I pasted it into the src/test/resources/pdfparser folder and run a test case
like this:
public class TestGovdocs148902 extends TestCase
{
public void test148902() throws IOException {
PDDocument doc = PDDocument.load(
"src/test/resources/pdfparser/148902.pdf");
int imageCounter = 0;
COSDocument cosDoc = doc.getDocument();
List<COSObject> list = cosDoc.getObjectsByType(COSName.XOBJECT);
for (COSObject cosOb : list) {
COSBase baseObject = cosOb.getObject();
if (baseObject != null && baseObject instanceof COSStream) {
COSStream st = (COSStream)baseObject;
String subtype = st.getNameAsString(COSName.SUBTYPE);
if (subtype != null && subtype.equalsIgnoreCase("image")) {
PDXObjectImage ximage =
(PDXObjectImage)PDXObject.createXObject( st );
if (ximage != null && ximage.getWidth() >= 5 &&
ximage.getHeight() >= 5) {
ByteArrayOutputStream baos = new
ByteArrayOutputStream();
ximage.write2OutputStream(baos);
byte [] bytes = baos.toByteArray();
if (bytes.length > 0) {
imageCounter++;
}
}
}
}
}
assertEquals(32, imageCounter);
}
}
The test cases passes in 1.4.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira