I've used the Extract Images Command Line Tool to get the images.

Erik

-----Original Message-----
From: Tilman Hausherr [mailto:[email protected]] 
Sent: Tuesday, 8 November 2016 09:16
To: [email protected]
Subject: Re: Issues with MRC Compressed using JBIG2-image

What methods did you use to get the images?

What I did was look at the rendering, and it looks the same as in Adobe Reader.

I also looked at the images with PDFDebugger, which shows the images with 
the mask applied. The second image is at
Root/Pages/Kids/[0]/Resources/XObject/Im002
and it shows colored text. The image is DCT encoded. The mask is black-and-white 
text that is JBIG2 encoded.
http://imgur.com/a/2ofjD
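
To make "with the mask applied" concrete, here is a minimal sketch using only the JDK, not PDFBox. It assumes the colour image and the 1-bit mask have the same pixel dimensions (in a real MRC file they often do not), and the class name, method name and the paintWhereBlack flag are made up for illustration. Which mask value means "paint" depends on the PDF, so it is a parameter here.

```java
import java.awt.image.BufferedImage;

public class MaskSketch {

    /**
     * Returns an ARGB copy of the colour image in which only the pixels
     * selected by the mask stay opaque; all others become transparent.
     * Assumption: both images have identical width and height.
     */
    static BufferedImage applyMask(BufferedImage colour, BufferedImage mask,
                                   boolean paintWhereBlack) {
        int w = colour.getWidth();
        int h = colour.getHeight();
        if (mask.getWidth() != w || mask.getHeight() != h) {
            throw new IllegalArgumentException("sketch assumes equal dimensions");
        }
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Treat a mask pixel as "black" when its RGB part is zero.
                boolean black = (mask.getRGB(x, y) & 0xFFFFFF) == 0;
                boolean paint = (black == paintWhereBlack);
                int rgb = colour.getRGB(x, y) & 0xFFFFFF;
                // Opaque where painted, alpha 0 everywhere else.
                out.setRGB(x, y, paint ? (0xFF000000 | rgb) : rgb);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two red pixels; the mask marks only the first one as text.
        BufferedImage colour = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        colour.setRGB(0, 0, 0xFF0000);
        colour.setRGB(1, 0, 0xFF0000);
        BufferedImage mask = new BufferedImage(2, 1, BufferedImage.TYPE_BYTE_BINARY);
        mask.setRGB(0, 0, 0x000000); // black: text pixel
        mask.setRGB(1, 0, 0xFFFFFF); // white: background
        BufferedImage out = applyMask(colour, mask, true);
        System.out.println("alpha of text pixel: " + (out.getRGB(0, 0) >>> 24));
        System.out.println("alpha of background pixel: " + (out.getRGB(1, 0) >>> 24));
    }
}
```

If an extracted image shows only the colour layer, the equivalent of this masking step has been skipped somewhere.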

What do you get?

Is the JBIG2 decoder in your class path? For PDFDebugger, you need to do
this:

java -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider -cp "pdfbox-app-XXXX.jar;lib/*" org.apache.pdfbox.tools.PDFBox PDFReader filename

The subdirectory "lib" contains the additional jars.
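
A quick way to check the class path question is to ask ImageIO whether any reader is registered for the format. This is a minimal sketch using only the JDK; it assumes the JBIG2 plugin registers itself under the format name "JBIG2" and that PDFBox finds the decoder through the same ImageIO lookup. The class and method names are made up for illustration.

```java
import javax.imageio.ImageIO;

public class Jbig2Check {

    /** Returns true if ImageIO has at least one reader for the given format name. */
    static boolean hasReader(String formatName) {
        // Pick up plugins from jars that are on the class path.
        ImageIO.scanForPlugins();
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Assumption: the decoder plugin uses the format name "JBIG2".
        System.out.println("JBIG2 reader available: " + hasReader("JBIG2"));
    }
}
```

Run it with the same -cp value as the command above; if it reports false, the additional jars are probably not being picked up.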

Tilman

On 08.11.2016 at 08:31, Zeiske, Erik (DualStudy) wrote:
> Hello Tilman,
>
> You solved the NPE, but there is something else wrong with the extracted 
> images. The PDF contains 3 images and 2 masks for two of those images. 
> (The PDF is compressed as shown here: 
> https://www.abbyy.com/en-us/ocr-sdk-embedded/pdf-mrc/. The foreground is the 
> second image of the PDF and uses the JBIG2 image as a mask to get the 
> coloured text. The third image and its mask are for the watermark of the PDF 
> and are extracted perfectly fine.) The library doesn't apply the mask 
> correctly to the second image. The resulting image should be only the text 
> with its colour, but the result is only the colour without the mask applied.
> I hope this makes sense.
>
> Erik.
>
> -----Original Message-----
> From: Tilman Hausherr [mailto:[email protected]]
> Sent: Monday, 7 November 2016 18:27
> To: [email protected]
> Subject: Re: Issues with MRC Compressed using JBIG2-image
>
> Hello Erik,
>
> I've opened
> https://issues.apache.org/jira/browse/PDFBOX-3558
> and fixed the cause of the NPE in the sources. I have not fully understood 
> your text, or maybe I misunderstood something, and part of it may now be moot; 
> can you please test with a snapshot whether the rendering is what you want? 
> The build will be there within a few hours.
> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.4-SNAPSHOT/
>
> Tilman
>
> On 07.11.2016 at 08:06, Zeiske, Erik (DualStudy) wrote:
>> Here is a Dropbox link to download the PDF:
>> https://www.dropbox.com/s/q1t58ov6vybu3k7/scan300_1-6.pdf?dl=0
>> I am using version 2.0.3 of PDFBox.
>>
>> -----Original Message-----
>> From: Tilman Hausherr [mailto:[email protected]]
>> Sent: Thursday, 3 November 2016 18:07
>> To: [email protected]
>> Subject: Re: Issues with MRC Compressed using JBIG2-image
>>
>> On 03.11.2016 at 09:58, Zeiske, Erik (DualStudy) wrote:
>>> Hello everybody,
>>>
>>> I have an issue with PDFBox and the handling of an MRC-compressed PDF.
>>>
>>> The issue is related to the JBIG2 compression used in the PDF. If I 
>>> try to extract the different images used in the attached PDF, the 
>>> library throws a NullPointerException because the bits are not 
>>> defined in the JBIG2 filter. I think this is because the PDF 
>>> defines no "Bits per Component" for the JBIG2 image. If I 
>>> define the bits in the Java code, the program runs without an 
>>> error, but it doesn't apply the JBIG2 mask properly to the 
>>> foreground colour image of the PDF. To work around this I tried to 
>>> extract the mask into a file, but it seems the mask image is the same 
>>> as the foreground image.
>>> I couldn't find the reason for this and I don't think it is related 
>>> to the PDF itself.
>>>
>>> The PDF I was using is attached to this e-mail.
>>>
>> Please upload the file to a sharehoster; PDF attachments are not 
>> allowed. Please also tell us which version you are using and what
>>
>> Tilman
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
