[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

Emmeran Seehuber (JIRA) Sun, 08 Apr 2018 10:29:01 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429822#comment-16429822
 ]


Emmeran Seehuber commented on PDFBOX-4184:
------------------------------------------

Oh yes, you are right. And I totally overlooked that the getRGB() call used 
there always converts to sRGB ...

I already do colorspace tagging in 
[https://github.com/rototor/pdfbox-graphics2d/blob/master/src/main/java/de/rototor/pdfbox/graphics2d/PdfBoxGraphics2DLosslessImageEncoder.java]
 
{code:java}
/*
 * Do we have a color profile we need to embed?
 */
if (bi.getColorModel().getColorSpace() instanceof ICC_ColorSpace) {
    ICC_Profile profile = ((ICC_ColorSpace) bi.getColorModel().getColorSpace()).getProfile();
    /*
     * Only tag a profile if it is not the default sRGB profile.
     */
    if (((ICC_ColorSpace) bi.getColorModel().getColorSpace()).getProfile() != ICC_Profile
            .getInstance(ColorSpace.CS_sRGB)) {

        SoftReference<PDICCBased> pdProfileRef = profileMap.get(new ProfileSoftReference(profile));

        PDICCBased pdProfile = pdProfileRef == null ? null : pdProfileRef.get();
        if (pdProfile == null) {
            pdProfile = new PDICCBased(document);
            OutputStream outputStream = pdProfile.getPDStream()
                    .createOutputStream(COSName.FLATE_DECODE);
            outputStream.write(profile.getData());
            outputStream.close();
            pdProfile.getPDStream().getCOSObject().setInt(COSName.N, profile.getNumComponents());
            profileMap.put(new ProfileSoftReference(profile), new SoftReference<PDICCBased>(pdProfile));
        }
        imageXObject.setColorSpace(pdProfile);
    }
}
{code}
which is of course stupid if the colors always get converted to sRGB.... It's not 
only stupid, but also wrong: the pixel data is then sRGB but tagged with a 
different profile, which causes color shifts ... argh....
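
To make the mismatch concrete, here is a minimal sketch using only java.awt 
(no PDFBox). The class and method names are made up for illustration. It 
checks whether an image carries a non-sRGB ICC color space and reads the 
samples straight from the raster, which, unlike getRGB(), leaves the 16-bit 
values and the original color space untouched:

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;

// Hypothetical helper class, not part of PDFBox or pdfbox-graphics2d.
class ProfileTaggingCheck {

    /** True if the image carries an ICC color space other than the built-in sRGB one. */
    static boolean hasNonSRGBProfile(BufferedImage bi) {
        ColorSpace cs = bi.getColorModel().getColorSpace();
        return cs instanceof ICC_ColorSpace && !cs.isCS_sRGB();
    }

    /** Reads the unconverted samples of one pixel straight from the raster. */
    static int[] rawSamples(BufferedImage bi, int x, int y) {
        return bi.getRaster().getPixel(x, y, (int[]) null);
    }

    /** Builds a 16-bit-per-component RGB image in one of the built-in color spaces. */
    static BufferedImage create16BitImage(int colorSpaceId, int w, int h) {
        ColorSpace cs = ColorSpace.getInstance(colorSpaceId);
        ComponentColorModel cm = new ComponentColorModel(cs, false, false,
                Transparency.OPAQUE, DataBuffer.TYPE_USHORT);
        WritableRaster raster = cm.createCompatibleWritableRaster(w, h);
        return new BufferedImage(cm, raster, false, null);
    }

    public static void main(String[] args) {
        BufferedImage img = create16BitImage(ColorSpace.CS_LINEAR_RGB, 1, 1);
        img.getRaster().setPixel(0, 0, new int[] { 0xABCD, 0x1234, 0xFFFF });

        int[] s = rawSamples(img, 0, 0);
        System.out.println("needs profile: " + hasNonSRGBProfile(img));
        System.out.printf("raw samples:   %04X %04X %04X%n", s[0], s[1], s[2]);
        // getRGB() packs the pixel into 8-bit ARGB in the default sRGB color space.
        System.out.printf("getRGB():      %08X%n", img.getRGB(0, 0));
    }
}
```

So an encoder that wants to tag the profile has to write the raster samples, 
not the getRGB() result.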

So at the moment PDFBox is not usable for any "real" prepress stuff, as the 
sRGB colorspace is way too small. (At the moment I still use iText 2.1 for my 
prepress stuff, but I want to get rid of it in the long term....)

sRGB as used at the moment in the LosslessFactory is fine for web / display 
only PDFs. But for prepress not so much .... Hmm, I should really try to find 
some time to implement an "ImageEncoderFactory" and implement all the different 
encodings correctly (which are mostly 8-bit and 16-bit images, everything with 
less bit depth is likely fine with getRGB() as now - and of course not only 
encode RGB but also encode CMYK...).  (No, I won't use any code of iText; they 
have tons of special hacks to e.g. reuse already encoded PNG data etc. which I 
think are not worth the effort and way too complex / too much code).

I have a factory with an API like this in mind: (everything with method 
chaining)
{code:java}
ImageEncoder myEncoder = ImageEncoderFactory.newBuilder(pdDocument)

// Lossy / JPEG quality 0.9
.jpeg(0.9)

// or lossless
.lossless()
// Lossless compression the fast way with a not so great compression ratio, like at the moment
.fastCompression()
// Lossless compression the slow way with maximum possible compression ratio (using predictors etc.)
.slowCompression()
// Set conversion to sRGB 8-bit. Default would be to always use the color space / ICC profile of the image.
.toSRGB()

// and finally
.build();

PDImage pdImg = myEncoder.encode(img);
PDImage pdImg2 = myEncoder.encode(img2);
// ... reuse myEncoder as much as possible, but not multithreaded
{code}
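A rough sketch of how the builder part could be structured, in plain Java and 
without the PDFBox types. Everything here (ImageEncoderOptions, Mode, the 
defaults, the quality range check) is a hypothetical illustration of the 
method-chaining idea and not existing PDFBox API:

```java
// Hypothetical sketch: only the option handling of the proposed builder.
final class ImageEncoderOptions {

    enum Mode { JPEG, LOSSLESS_FAST, LOSSLESS_SLOW }

    final Mode mode;
    final double jpegQuality;
    final boolean convertToSRGB;

    private ImageEncoderOptions(Mode mode, double jpegQuality, boolean convertToSRGB) {
        this.mode = mode;
        this.jpegQuality = jpegQuality;
        this.convertToSRGB = convertToSRGB;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    static final class Builder {
        // Assumed defaults: fast lossless, keep the image's own color space.
        private Mode mode = Mode.LOSSLESS_FAST;
        private double jpegQuality = 0.9;
        private boolean convertToSRGB = false;

        Builder jpeg(double quality) {
            if (quality <= 0.0 || quality > 1.0) {
                throw new IllegalArgumentException("quality must be in (0, 1]");
            }
            this.mode = Mode.JPEG;
            this.jpegQuality = quality;
            return this;
        }

        Builder lossless() {
            this.mode = Mode.LOSSLESS_FAST;
            return this;
        }

        Builder fastCompression() {
            this.mode = Mode.LOSSLESS_FAST;
            return this;
        }

        Builder slowCompression() {
            this.mode = Mode.LOSSLESS_SLOW;
            return this;
        }

        Builder toSRGB() {
            this.convertToSRGB = true;
            return this;
        }

        /** The result is immutable, so one instance can be reused for many images. */
        ImageEncoderOptions build() {
            return new ImageEncoderOptions(mode, jpegQuality, convertToSRGB);
        }
    }
}
```

The real encoder would take such an options object plus the PDDocument and do 
the actual encoding in encode().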
What do you think?

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-4184
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4184
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 2.0.9
>            Reporter: Emmeran Seehuber
>            Priority: Minor
>             Fix For: 2.0.10, 3.0.0 PDFBox
>
>         Attachments: pdfbox_support_16bit_image_write.patch, 
> png16-arrow-bad-no-smask.pdf, png16-arrow-bad.pdf, 
> png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf
>
>
> The attached patch adds support to write 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvements when writing lossless images, as 
> the images are currently not efficiently encoded. I.e. you could use PNG 
> encodings to get a better compression. (By adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff speed vs compression ratio. 
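
As an illustration of the predictor idea mentioned in the description: with 
/Predictor 15 every image row in the Flate stream starts with a PNG filter 
type byte, and the DecodeParms also need matching /Colors, /BitsPerComponent 
and /Columns entries. The sketch below is plain Java, independent of PDFBox, 
with hypothetical names; it implements only the PNG "Sub" filter (type 1), 
whereas a real encoder would probably pick the best filter per row:

```java
import java.util.zip.Deflater;

// Hypothetical helper, not part of PDFBox.
final class PngSubPredictor {

    /**
     * Applies the PNG "Sub" filter to every row. Each output row is one byte
     * longer than the input row because it starts with the filter type byte.
     * bytesPerPixel is e.g. 6 for 16-bit RGB.
     */
    static byte[] filter(byte[] raw, int rowBytes, int bytesPerPixel) {
        int rows = raw.length / rowBytes;
        byte[] out = new byte[rows * (rowBytes + 1)];
        for (int r = 0; r < rows; r++) {
            int in = r * rowBytes;
            int o = r * (rowBytes + 1);
            out[o] = 1; // filter type 1 = Sub
            for (int i = 0; i < rowBytes; i++) {
                int left = i >= bytesPerPixel ? raw[in + i - bytesPerPixel] & 0xFF : 0;
                out[o + 1 + i] = (byte) ((raw[in + i] & 0xFF) - left);
            }
        }
        return out;
    }

    /** Reverses filter(); this is what a PDF reader does when decoding. */
    static byte[] unfilter(byte[] filtered, int rowBytes, int bytesPerPixel) {
        int rows = filtered.length / (rowBytes + 1);
        byte[] raw = new byte[rows * rowBytes];
        for (int r = 0; r < rows; r++) {
            int in = r * (rowBytes + 1) + 1; // skip the filter type byte
            int o = r * rowBytes;
            for (int i = 0; i < rowBytes; i++) {
                int left = i >= bytesPerPixel ? raw[o + i - bytesPerPixel] & 0xFF : 0;
                raw[o + i] = (byte) ((filtered[in + i] & 0xFF) + left);
            }
        }
        return raw;
    }

    /** Size of the data after Deflate, to compare filtered against unfiltered input. */
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }
}
```

Comparing deflatedSize(raw) with deflatedSize(filter(raw, ...)) on real image 
data shows the speed versus compression ratio trade-off for a given image.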



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
