On 23.09.2019 at 23:40, Kevin Day wrote:
PDFDebugger is working fine, so the issue must be with how I'm using the
library, or how I'm extracting the globals stream...
I checked the globals stream contents that I'm extracting and compared to
the globals in PDFDebugger, and they are identical bytes.
I also checked the image content stream, and it has identical bytes as well.
I even changed my code to be identical to yours:
    JBIG2ImageReader reader = (JBIG2ImageReader)
            ImageIO.getImageReadersByFormatName("JBIG2").next();
    JBIG2Globals globals = reader.processGlobals(
            ImageIO.createImageInputStream(new ByteArrayInputStream(globalBytes)));
    reader.setGlobals(globals);
    reader.setInput(
            ImageIO.createImageInputStream(new ByteArrayInputStream(imageBytes)));
    return reader.read(0, reader.getDefaultReadParam());
and it still fails.
But PDFDebugger works fine.
So it would seem like the way that PDFBox invokes JBIG2ImageReader is not
the above? Could that be right??
That is true, we're using the reader in a plugin-independent way, as
shown in the source of JBIG2Filter.java:
    // encoded is the input stream of the main image (without the globals)
    InputStream source = encoded;
    // if there are globals, they are prepended to the image data
    source = new SequenceInputStream(
            ((COSStream) globals).createInputStream(), encoded);
    ...
    ImageInputStream iis = ImageIO.createImageInputStream(source);
    reader.setInput(iis);
    image = reader.read(0, irp);
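The concatenation approach described above can be sketched as a standalone helper. This is only a minimal sketch, not the PDFBox implementation: the class name Jbig2EmbeddedDecode and the methods combine and decode are hypothetical names chosen for illustration, and it assumes a JBIG2 ImageIO plugin is on the classpath and that both byte arrays are the raw stream contents taken from the PDF.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of PDFBox or the JBIG2 plugin.
class Jbig2EmbeddedDecode {

    // Prepends the globals bytes (if any) to the image bytes as one stream.
    static InputStream combine(byte[] globalBytes, byte[] imageBytes) {
        InputStream encoded = new ByteArrayInputStream(imageBytes);
        if (globalBytes == null) {
            return encoded; // image without globals
        }
        return new SequenceInputStream(new ByteArrayInputStream(globalBytes), encoded);
    }

    // Decodes page 0 through whatever reader ImageIO finds for "JBIG2".
    static BufferedImage decode(byte[] globalBytes, byte[] imageBytes) throws IOException {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("JBIG2");
        if (!readers.hasNext()) {
            throw new IOException("No ImageIO reader registered for JBIG2");
        }
        ImageReader reader = readers.next();
        try (ImageInputStream iis =
                ImageIO.createImageInputStream(combine(globalBytes, imageBytes))) {
            reader.setInput(iis);
            return reader.read(0, reader.getDefaultReadParam());
        } finally {
            reader.dispose();
        }
    }
}
```

Because only the generic ImageReader API is used here, nothing in the sketch calls processGlobals or setGlobals; the reader sees the globals segments and the image segments as one continuous input.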
Tilman
- K
Kevin Day
*trumpet**p| *480.961.6003 x1002
*e| *ke...@trumpetinc.com
*www.trumpetinc.com <http://trumpetinc.com/>*
LinkedIn <https://www.linkedin.com/company/trumpet-inc.>| Trumpet Blog
<http://trumpetinc.com/blog/>| Twitter <https://twitter.com/trumpetinc>
On Fri, Sep 20, 2019 at 9:28 PM Tilman Hausherr <thaush...@t-online.de>
wrote:
I wonder if the PDF can be displayed with PDFDebugger. If no => bug. If
yes, then you should debug this to see what calls are done, and whether
you have the same data input. Your calls seem to be OK, they look
similar to those I did when I debugged something in the jbig2 reader
(link is before it went to Apache, don't open issues on github):
https://github.com/levigo/jbig2-imageio/issues/21
Tilman
On 20.09.2019 at 22:23, Kevin Day wrote:
I am trying to use JBIG2ImageReader to parse JBIG2 data from a PDF (the
image stream and globals are being provided - we are not using PDFBox to
parse the PDF itself). Please let me know if I should be using a different
communication avenue for JBIG2-specific questions.
Here's what I'm trying to do:
    JBIG2ImageReader jbig2Reader = new JBIG2ImageReader(new JBIG2ImageReaderSpi());

    byte[] globalBytes = // raw bytes from PDF DECODEPARAMS, JBIG2GLOBALS
    ImageInputStream globalsInputStream = new DefaultInputStreamFactory().getInputStream(
            new ByteArrayInputStream(globalBytes));
    JBIG2Globals globals = jbig2Reader.processGlobals(globalsInputStream);
    jbig2Reader.setGlobals(globals);

    byte[] imageBytes = // raw JBIG2 image stream bytes from PDF
    ImageInputStream imageInputStream = new DefaultInputStreamFactory().getInputStream(
            new ByteArrayInputStream(image.getImageAsBytes()));
    jbig2Reader.setInput(imageInputStream);

    return jbig2Reader.read(0);
When I do this, I get a null pointer exception:
Exception in thread "main" java.lang.RuntimeException: Can't instantiate segment class
    at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:420)
    at org.apache.pdfbox.jbig2.JBIG2Page.createNormalPage(JBIG2Page.java:202)
    at org.apache.pdfbox.jbig2.JBIG2Page.createPage(JBIG2Page.java:168)
    at org.apache.pdfbox.jbig2.JBIG2Page.composePageBitmap(JBIG2Page.java:157)
    at org.apache.pdfbox.jbig2.JBIG2Page.getBitmap(JBIG2Page.java:133)
    at org.apache.pdfbox.jbig2.JBIG2ImageReader.read(JBIG2ImageReader.java:249)
    at javax.imageio.ImageReader.read(ImageReader.java:939)
    ....
Caused by: java.lang.NullPointerException
    at org.apache.pdfbox.jbig2.segments.TextRegion.initSymbols(TextRegion.java:1010)
    at org.apache.pdfbox.jbig2.segments.TextRegion.getSymbols(TextRegion.java:273)
    at org.apache.pdfbox.jbig2.segments.TextRegion.parseHeader(TextRegion.java:154)
    at org.apache.pdfbox.jbig2.segments.TextRegion.init(TextRegion.java:1128)
    at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:413)
    ... 19 more
The SegmentHeader array in TextRegion looks like this:
(org.apache.pdfbox.jbig2.SegmentHeader[]) [null,
#SegmentNr: 377
SegmentType: 0
PageAssociation: 1
Referred-to segments: none
]
Note that the first element is null. I'm not sure why this is (maybe it's
not a valid JBIG2 data stream??). This file opens and displays fine in PDF
viewers, so I'm assuming it must be something that I'm doing wrong.
Any pointers?
- K
Kevin Day
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org