PDFDebugger displays the file fine, so the issue must be with how I'm using
the library, or how I'm extracting the globals stream...

I checked the contents of the globals stream that I'm extracting and compared
them to the globals in PDFDebugger, and the bytes are identical.

I also checked the image content stream, and it has identical bytes as well.
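
A comparison like this can be scripted rather than done by eye. The following
is only a sketch: the class and method names (StreamCompare, firstDifference,
sha256Hex) are hypothetical and not part of PDFBox or the JBIG2 plugin, and it
assumes both byte arrays have already been dumped from the two sources. It uses
nothing beyond the JDK.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for checking that two extracted streams really match.
final class StreamCompare {

    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // Same prefix: identical only if the lengths also match.
        return a.length == b.length ? -1 : n;
    }

    /** Returns the SHA-256 digest of the data as lowercase hex, handy for logging. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

A result of -1 from firstDifference means the two arrays match in both content
and length; any other value is the offset to inspect in a hex dump.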


I even changed my code to be identical to yours:

    JBIG2ImageReader reader = (JBIG2ImageReader)
            ImageIO.getImageReadersByFormatName("JBIG2").next();
    JBIG2Globals globals = reader.processGlobals(
            ImageIO.createImageInputStream(new ByteArrayInputStream(globalBytes)));
    reader.setGlobals(globals);
    reader.setInput(
            ImageIO.createImageInputStream(new ByteArrayInputStream(imageBytes)));
    return reader.read(0, reader.getDefaultReadParam());

and it still fails.

But PDFDebugger works fine.


So it seems that PDFBox does not invoke JBIG2ImageReader the way shown above.
Could that be right?
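
One experiment that would test this: instead of calling processGlobals and
setGlobals, feed the reader a single stream made of the globals bytes followed
by the image bytes. This is an untested sketch. That PDFBox's own filter works
this way is an assumption here, not something confirmed in this thread, and it
should be checked against the PDFBox source. The class and method names
(ConcatenatedJbig2, embeddedStream, decode) are hypothetical; only standard
javax.imageio and java.io calls are used, and a JBIG2 ImageIO plugin must be on
the classpath for decode to find a reader.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper: decode with globals prepended to the image data.
final class ConcatenatedJbig2 {

    /** Returns one stream: the globals bytes (if any) followed by the image bytes. */
    static InputStream embeddedStream(byte[] globalBytes, byte[] imageBytes) {
        InputStream image = new ByteArrayInputStream(imageBytes);
        if (globalBytes == null || globalBytes.length == 0) {
            return image;
        }
        return new SequenceInputStream(new ByteArrayInputStream(globalBytes), image);
    }

    /** Decodes the first image using whatever JBIG2 reader ImageIO has registered. */
    static BufferedImage decode(byte[] globalBytes, byte[] imageBytes) throws IOException {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("JBIG2");
        if (!readers.hasNext()) {
            throw new IOException("No JBIG2 ImageIO plugin found on the classpath");
        }
        ImageReader reader = readers.next();
        try (ImageInputStream in =
                ImageIO.createImageInputStream(embeddedStream(globalBytes, imageBytes))) {
            reader.setInput(in);
            return reader.read(0, reader.getDefaultReadParam());
        } finally {
            reader.dispose();
        }
    }
}
```

If decode succeeds on the same globalBytes and imageBytes that fail with the
setGlobals approach, the difference lies in how the globals reach the reader
rather than in the extracted bytes.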

- K


Kevin Day

trumpet
p| 480.961.6003 x1002
e| ke...@trumpetinc.com
www.trumpetinc.com <http://trumpetinc.com/>

LinkedIn <https://www.linkedin.com/company/trumpet-inc.>| Trumpet Blog
<http://trumpetinc.com/blog/>| Twitter  <https://twitter.com/trumpetinc>


On Fri, Sep 20, 2019 at 9:28 PM Tilman Hausherr <thaush...@t-online.de>
wrote:

> I wonder if the PDF can be displayed with PDFDebugger. If no => bug. If
> yes, then you should debug this to see what calls are done, and whether
> you have the same data input. Your calls seem to be OK, they look
> similar to those I did when I debugged something in the jbig2 reader
> (link is before it went to Apache, don't open issues on github):
> https://github.com/levigo/jbig2-imageio/issues/21
>
> Tilman
>
> Am 20.09.2019 um 22:23 schrieb Kevin Day:
> > I am trying to use JBIG2ImageReader to parse JBIG2 data from a PDF (the
> > image stream and globals are being provided - we are not using PDFBox to
> > parse the PDF itself).  Please let me know if I should be using a
> > different communication avenue for JBIG2-specific questions.
> >
> >
> > Here's what I'm trying to do:
> >
> >     JBIG2ImageReader jbig2Reader =
> >             new JBIG2ImageReader(new JBIG2ImageReaderSpi());
> >
> >     byte[] globalBytes = // raw bytes from PDF DECODEPARAMS, JBIG2GLOBALS
> >
> >     ImageInputStream globalsInputStream = new DefaultInputStreamFactory()
> >             .getInputStream(new ByteArrayInputStream(globalBytes));
> >
> >     JBIG2Globals globals = jbig2Reader.processGlobals(globalsInputStream);
> >     jbig2Reader.setGlobals(globals);
> >
> >     byte[] imageBytes = // raw JBIG2 image stream bytes from PDF
> >     ImageInputStream imageInputStream = new DefaultInputStreamFactory()
> >             .getInputStream(new ByteArrayInputStream(image.getImageAsBytes()));
> >     jbig2Reader.setInput(imageInputStream);
> >
> >     return jbig2Reader.read(0);
> >
> >
> > When I do this, I get a null pointer exception:
> >
> > Exception in thread "main" java.lang.RuntimeException: Can't instantiate segment class
> >     at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:420)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.createNormalPage(JBIG2Page.java:202)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.createPage(JBIG2Page.java:168)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.composePageBitmap(JBIG2Page.java:157)
> >     at org.apache.pdfbox.jbig2.JBIG2Page.getBitmap(JBIG2Page.java:133)
> >     at org.apache.pdfbox.jbig2.JBIG2ImageReader.read(JBIG2ImageReader.java:249)
> >     at javax.imageio.ImageReader.read(ImageReader.java:939)
> >
> > ....
> >
> > Caused by: java.lang.NullPointerException
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.initSymbols(TextRegion.java:1010)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.getSymbols(TextRegion.java:273)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.parseHeader(TextRegion.java:154)
> >     at org.apache.pdfbox.jbig2.segments.TextRegion.init(TextRegion.java:1128)
> >     at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:413)
> >     ... 19 more
> >
> > The SegmentHeader array in TextRegion looks like this:
> >
> >   (org.apache.pdfbox.jbig2.SegmentHeader[]) [null,
> >
> > #SegmentNr: 377
> > SegmentType: 0
> > PageAssociation: 1
> > Referred-to segments: none
> > ]
> >
> >
> >
> > Note that the first element is null.  I'm not sure why this is (maybe
> > it's not a valid JBIG2 data stream??).  This file opens and displays fine
> > in PDF viewers, so I'm assuming it must be something that I'm doing wrong.
> >
> >
> > Any pointers?
> >
> > - K
> >
>
