PDFDebugger is working fine, so the issue must be with how I'm using the library or how I'm extracting the globals stream...
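For anyone who wants to repeat the byte comparison described in this message, here is a minimal sketch of one way to fingerprint the two streams so they can be compared against what another tool extracts. `StreamFingerprint` and its method names are hypothetical helpers, not part of PDFBox or the JBIG2 plugin, and `Arrays.mismatch` assumes Java 9 or later.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper for comparing extracted stream bytes; not a PDFBox API.
final class StreamFingerprint {

    /** Returns the SHA-256 digest of the given bytes as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns the index of the first byte that differs, or -1 if both arrays
     * have the same length and content. If one array is a prefix of the other,
     * the length of the shorter array is returned.
     */
    static int firstDifference(byte[] a, byte[] b) {
        return Arrays.mismatch(a, b);
    }

    /** One-line summary suitable for pasting into a mail or bug report. */
    static String describe(String label, byte[] data) {
        return label + ": " + data.length + " bytes, sha256=" + sha256Hex(data);
    }
}
```

Printing `StreamFingerprint.describe("globals", globalBytes)` and `StreamFingerprint.describe("image", imageBytes)` from both code paths gives two short lines that either match or do not, which is easier to share than a hex dump.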
I checked the globals stream contents that I'm extracting and compared them to the globals in PDFDebugger, and the bytes are identical. I also checked the image content stream, and its bytes are identical as well.

I even changed my code to be identical to yours:

    JBIG2ImageReader reader = (JBIG2ImageReader) ImageIO.getImageReadersByFormatName("JBIG2").next();
    JBIG2Globals globals = reader.processGlobals(ImageIO.createImageInputStream(new ByteArrayInputStream(globalBytes)));
    reader.setGlobals(globals);
    reader.setInput(ImageIO.createImageInputStream(new ByteArrayInputStream(imageBytes)));
    return reader.read(0, reader.getDefaultReadParam());

and it still fails. But PDFDebugger works fine.

So it would seem that the way PDFBox invokes JBIG2ImageReader is not the above? Could that be right?

- K

Kevin Day
trumpet
p| 480.961.6003 x1002
e| ke...@trumpetinc.com
www.trumpetinc.com <http://trumpetinc.com/>
LinkedIn <https://www.linkedin.com/company/trumpet-inc.> | Trumpet Blog <http://trumpetinc.com/blog/> | Twitter <https://twitter.com/trumpetinc>

On Fri, Sep 20, 2019 at 9:28 PM Tilman Hausherr <thaush...@t-online.de> wrote:

> I wonder if the PDF can be displayed with PDFDebugger. If no => bug. If
> yes, then you should debug this to see what calls are done, and whether
> you have the same data input. Your calls seem to be OK, they look
> similar to those I did when I debugged something in the jbig2 reader
> (link is before it went to Apache, don't open issues on github):
> https://github.com/levigo/jbig2-imageio/issues/21
>
> Tilman
>
> On 20.09.2019 at 22:23, Kevin Day wrote:
> > I am trying to use JBIG2ImageReader to parse JBIG2 data from a PDF (the
> > image stream and globals are being provided - we are not using PDFBox to
> > parse the PDF itself). Please let me know if I should be using a different
> > communication avenue for JBIG2 specific questions.
> >
> > Here's what I'm trying to do:
> >
> >     JBIG2ImageReader jbig2Reader = new JBIG2ImageReader(new JBIG2ImageReaderSpi());
> >
> >     byte[] globalBytes = // raw bytes from PDF DECODEPARAMS, JBIG2GLOBALS
> >
> >     ImageInputStream globalsInputStream = new DefaultInputStreamFactory().getInputStream(new ByteArrayInputStream(globalBytes));
> >
> >     JBIG2Globals globals = jbig2Reader.processGlobals(globalsInputStream);
> >     jbig2Reader.setGlobals(globals);
> >
> >     byte[] imageBytes = // raw JBIG2 image stream bytes from PDF
> >     ImageInputStream imageInputStream = new DefaultInputStreamFactory().getInputStream(new ByteArrayInputStream(image.getImageAsBytes()));
> >     jbig2Reader.setInput(imageInputStream);
> >
> >     return jbig2Reader.read(0);
> >
> > When I do this, I get a null pointer exception:
> >
> > Exception in thread "main" java.lang.RuntimeException: Can't instantiate segment class
> >     at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:420)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.createNormalPage(JBIG2Page.java:202)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.createPage(JBIG2Page.java:168)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.composePageBitmap(JBIG2Page.java:157)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.getBitmap(JBIG2Page.java:133)
> >     at org.apache.pdfbox.jbig2.JBIG2ImageReader.read(JBIG2ImageReader.java:249)
> >     at javax.imageio.ImageReader.read(ImageReader.java:939)
> >
> > ....
> >
> > Caused by: java.lang.NullPointerException
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.initSymbols(TextRegion.java:1010)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.getSymbols(TextRegion.java:273)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.parseHeader(TextRegion.java:154)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.init(TextRegion.java:1128)
> >     at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:413)
> >     ... 19 more
> >
> > The SegmentHeader array in TextRegion looks like this:
> >
> > (org.apache.pdfbox.jbig2.SegmentHeader[]) [null,
> >
> > #SegmentNr: 377
> >  SegmentType: 0
> >  PageAssociation: 1
> >  Referred-to segments: none
> > ]
> >
> > Note that the first element is null. I'm not sure why this is (maybe it's
> > not a valid JBIG2 data stream??). This file opens and displays fine in PDF
> > viewers, so I'm assuming it must be something that I'm doing wrong.
> >
> > Any pointers?
> >
> > - K
> >
> > Kevin Day
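
The null entry in the quoted SegmentHeader array sits where a referred-to segment would normally be, so one thing that may be worth checking is whether every segment a text region refers to is actually defined in either the globals or the image stream. The sketch below walks the segment headers of both byte arrays and reports unresolved references. It is only a diagnostic sketch under stated assumptions: `Jbig2SegmentLister` is a hypothetical helper (not part of PDFBox or the JBIG2 plugin), it assumes the embedded organization used in PDF streams (no file header, each segment header directly followed by its data), and the header layout is my reading of the segment header syntax in the JBIG2 specification.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical diagnostic helper; assumes embedded-organization JBIG2 data.
final class Jbig2SegmentLister {

    static final class Segment {
        final long number;            // segment number
        final int type;               // e.g. 0 = symbol dictionary, 6 = immediate text region
        final List<Long> referredTo;  // segment numbers this segment depends on

        Segment(long number, int type, List<Long> referredTo) {
            this.number = number;
            this.type = type;
            this.referredTo = referredTo;
        }
    }

    /** Parses segment headers only, skipping over each segment's data. */
    static List<Segment> list(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream); // big-endian by default
        List<Segment> segments = new ArrayList<>();
        while (buf.hasRemaining()) {
            long number = buf.getInt() & 0xFFFFFFFFL;
            int flags = buf.get() & 0xFF;
            int type = flags & 0x3F;
            boolean widePageField = (flags & 0x40) != 0;

            // Short form: the referred-to count lives in the top 3 bits.
            int first = buf.get() & 0xFF;
            int count = first >> 5;
            if (count == 7) {
                // Long form: 29-bit count, then ceil((count + 1) / 8) retain-flag bytes.
                count = ((first & 0x1F) << 24) | ((buf.get() & 0xFF) << 16)
                        | ((buf.get() & 0xFF) << 8) | (buf.get() & 0xFF);
                skip(buf, (count + 8) / 8);
            }

            // The width of each reference depends on this segment's own number.
            int refSize = number <= 256 ? 1 : (number <= 65536 ? 2 : 4);
            List<Long> refs = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                long ref = 0;
                for (int j = 0; j < refSize; j++) {
                    ref = (ref << 8) | (buf.get() & 0xFF);
                }
                refs.add(ref);
            }

            skip(buf, widePageField ? 4 : 1); // page association field
            long dataLength = buf.getInt() & 0xFFFFFFFFL;
            segments.add(new Segment(number, type, refs));
            if (dataLength == 0xFFFFFFFFL) {
                break; // unknown data length: cannot skip ahead reliably
            }
            skip(buf, (int) dataLength);
        }
        return segments;
    }

    /** Lists references from either stream that no segment in either stream defines. */
    static List<String> missingReferences(byte[] globals, byte[] image) {
        List<Segment> all = new ArrayList<>(list(globals));
        all.addAll(list(image));
        Set<Long> defined = new HashSet<>();
        for (Segment s : all) {
            defined.add(s.number);
        }
        List<String> missing = new ArrayList<>();
        for (Segment s : all) {
            for (long ref : s.referredTo) {
                if (!defined.contains(ref)) {
                    missing.add("segment " + s.number + " (type " + s.type
                            + ") refers to missing segment " + ref);
                }
            }
        }
        return missing;
    }

    private static void skip(ByteBuffer buf, int n) {
        buf.position(buf.position() + n); // throws if the stream is truncated
    }
}
```

If `missingReferences(globalBytes, imageBytes)` comes back empty, the data itself is probably consistent and the problem is more likely in how the globals are handed to the reader. If it reports a missing segment, the extraction of one of the two streams would be the first thing to look at.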