[ https://issues.apache.org/jira/browse/PDFBOX-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494182#comment-13494182 ]
Dave Smith commented on PDFBOX-1067:
------------------------------------

OK, the patch above turns out to have been very close. This fixes the problem. It turns out that http://code.google.com/p/jbig2-imageio/ was returning an 8-bit-per-pixel image, so I added a conversion step and that did the trick. I copied the whole code from above.

JBIG2Filter.java
...
    @Override
    public void decode( InputStream compressedData, OutputStream result, COSDictionary options, int filterIndex ) throws IOException
    {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("JBIG2");
        if (!readers.hasNext())
        {
            log.error( "Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.");
            return;
        }
        ImageReader reader = readers.next();
        COSDictionary decodeP = (COSDictionary) options.getDictionaryObject(COSName.DECODE_PARMS);
        COSInteger bits = (COSInteger) options.getDictionaryObject(COSName.BITS_PER_COMPONENT);
        COSStream st = (COSStream) decodeP.getDictionaryObject(COSName.getPDFName("JBIG2Globals"));
        // The decoder needs the global segments followed by the image data.
        reader.setInput(ImageIO.createImageInputStream(JBIG2StreamMerge(st.getFilteredStream(), compressedData)));
        BufferedImage bi = reader.read(0);
        reader.dispose();
        if ( bi != null )
        {
            // JBIG2 is always black and white, so depending on your renderer
            // this conversion might or might not be needed.
            if (bi.getColorModel().getPixelSize() != bits.intValue())
            {
                if (bits.intValue() != 1)
                {
                    log.error("Do not know how to deal with JBIG2 with more than 1 bit");
                    return;
                }
                // Redraw the 8-bit image into a packed 1-bit-per-pixel image.
                BufferedImage packedImage = new BufferedImage(bi.getWidth(), bi.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
                packedImage.getGraphics().drawImage(bi, 0, 0, null);
                bi = packedImage;
            }
            DataBuffer dBuf = bi.getData().getDataBuffer();
            if ( dBuf.getDataType() == DataBuffer.TYPE_BYTE )
            {
                result.write( ( ( DataBufferByte ) dBuf ).getData() );
            }
            else
            {
                log.error( "Image data buffer not of type byte but type " + dBuf.getDataType() );
            }
        }
        else
        {
            log.error( "Something went wrong when decoding the JBIG2 encoded datastream.");
        }
    }

    // ugly. Should use some sort of stream merge ...
    protected static InputStream JBIG2StreamMerge(InputStream globals, InputStream body) throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        // Copy the globals stream first, then the body.
        int read = globals.read(buf);
        while (read != -1)
        {
            out.write(buf, 0, read);
            read = globals.read(buf);
        }
        read = body.read(buf);
        while (read != -1)
        {
            out.write(buf, 0, read);
            read = body.read(buf);
        }
        out.close();
        return new ByteArrayInputStream(out.toByteArray());
    }

> PDF Scan from Xerox WorkCentre 5030 renders as all black
> --------------------------------------------------------
>
>                 Key: PDFBOX-1067
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1067
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: PDModel
>    Affects Versions: 1.6.0
>         Environment: Tested on MacOS X 10.6.7, Ubuntu 10.10, Windows 7
>            Reporter: Sarah Kelley
>              Labels: JBIG2
>         Attachments: ItDoesntWorkScan.jbig2, ItDoesntWorkScan.pdf, sakelley_pdf_rendering_problem.zip
>
>
> The file "ItDoesntWorkScan.pdf" renders to an empty
> black page. This file is a copy of "ItDoesntWorkPrinted.pdf"
> that has been printed on paper, and then scanned with
> a Xerox WorkCentre 5030 scanner, which then emails a pdf file
> back to the user.
> Tested On:
> - Mac OS 10.6
> - Windows 7
> - Ubuntu 10.10
> Unfortunately, the WorkCentre 5030 doesn't appear to have
> many user-settable options for scanning to PDF, so we weren't
> really able to try scanning with settings other than the defaults.
> Will attach pdf and code to demonstrate.
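
On the "should use some sort of stream merge" note attached to the JBIG2StreamMerge helper in this comment: one possible alternative is the JDK's java.io.SequenceInputStream, which reads one stream to its end and then continues with the next, without first copying both into a byte array. The sketch below is only an illustration of that idea. The class name StreamMergeSketch and the method name merge are hypothetical and not part of the patch, and it has not been tried against the JBIG2 ImageIO plugin.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;

// Hypothetical helper, not part of the patch: shows concatenating two
// streams lazily instead of buffering both into a byte array.
final class StreamMergeSketch {

    // Returns a stream that yields every byte of 'globals', then every byte of 'body'.
    static InputStream merge(InputStream globals, InputStream body) {
        return new SequenceInputStream(globals, body);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in the patch these would be the JBIG2Globals stream
        // and the compressed image data.
        InputStream globals = new ByteArrayInputStream(new byte[] {1, 2, 3});
        InputStream body = new ByteArrayInputStream(new byte[] {4, 5});

        StringBuilder seen = new StringBuilder();
        try (InputStream merged = merge(globals, body)) {
            int b;
            while ((b = merged.read()) != -1) {
                seen.append(b).append(' ');
            }
        }
        System.out.println(seen.toString().trim());
    }
}
```

One difference from the copying version: SequenceInputStream closes each underlying stream once it has been read to the end, so the caller should not expect to reuse either input afterwards.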