[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470976#comment-16470976
 ] 

Emmeran Seehuber commented on PDFBOX-4184:
------------------------------------------

As the topic of image encoding has come up again in OpenHTMLToPDF (see 
[https://github.com/danfickle/openhtmltopdf/issues/212]), I reworked the 16 bit 
predictor based encoding I had lying around and extended it to support most 
BufferedImage formats and CMYK images. I originally wrote it for use with 
iText some time ago. See [^lossless_predictor_based_imageencoding.patch]

It implements image encoding with a PNG predictor. Depending on the image being 
encoded, this results in massive space savings compared to simple image 
encoding without a predictor. Images with extended color profiles also work. 
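
To illustrate why a predictor helps, here is a minimal sketch of the PNG "Sub" 
filter (filter type 1) applied to one row, plus a helper that measures the 
deflated size. This is not the code from the patch; the class and method names 
(PredictorSketch, subFilter, deflatedSize) are made up for this example. On 
smooth data such as a 16 bit gradient, the filtered row consists mostly of 
repeated small differences, which deflate handles far better than the raw 
sample values.

```java
import java.util.zip.Deflater;

// Hypothetical helper class for illustration only, not part of PDFBox or the patch.
class PredictorSketch {

    // PNG "Sub" filter: store each byte as the difference (mod 256) to the
    // byte one pixel to its left. bytesPerPixel is e.g. 2 for 16 bit gray.
    static byte[] subFilter(byte[] row, int bytesPerPixel) {
        byte[] out = new byte[row.length];
        for (int i = 0; i < row.length; i++) {
            int left = i >= bytesPerPixel ? (row[i - bytesPerPixel] & 0xFF) : 0;
            out[i] = (byte) ((row[i] & 0xFF) - left);
        }
        return out;
    }

    // Number of bytes the data occupies after zlib/deflate compression.
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[data.length + 64];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }
}
```

Comparing deflatedSize(row) with deflatedSize(subFilter(row, 2)) for a 16 bit 
grayscale gradient row shows the effect; the actual savings depend heavily on 
the image content.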

To test the CMYK support I need a CMYK profile. Any one would do. For a quick 
test I used a profile from here:  
[http://download.adobe.com/pub/adobe/iccprofiles/win/AdobeICCProfiles.zip]

I have no idea if we are allowed to include this profile in the test resources. 
It's missing from the patch; you must copy it from the download archive. I think 
we might also be allowed to use a profile from 
[http://www.eci.org/en/downloads]. But they did not publish any license 
information :(

I have not done any performance tests yet, but the predictor encoding should be 
faster than the existing encoding, as it tries to be more friendly to the cache 
(e.g. writing a row directly into a zip stream).
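
The row-wise idea can be sketched roughly as follows. This is an illustration 
only, not the code from the patch: RowStreamSketch and encode are hypothetical 
names, the rows are assumed to all have the same length, and the PNG "Up" 
filter (type 2) is used just because it is short. Each row is filtered into a 
small reusable line buffer and written straight into a DeflaterOutputStream, 
so only the current and the previous row are touched at any time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper class for illustration only, not part of PDFBox or the patch.
class RowStreamSketch {

    // Encodes equally sized rows as PNG-style filtered lines inside a zlib stream.
    static byte[] encode(byte[][] rows) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream zip = new DeflaterOutputStream(sink)) {
            int rowLength = rows.length == 0 ? 0 : rows[0].length;
            byte[] previous = new byte[rowLength]; // row above the first row is all zero
            byte[] line = new byte[rowLength + 1]; // reused for every row
            for (byte[] row : rows) {
                line[0] = 2; // PNG filter type 2 = Up
                for (int i = 0; i < rowLength; i++) {
                    // difference (mod 256) to the byte directly above
                    line[i + 1] = (byte) (row[i] - previous[i]);
                }
                zip.write(line, 0, line.length); // row goes directly into the zip stream
                previous = row;
            }
        }
        return sink.toByteArray();
    }
}
```

A real encoder would read the rows from the image raster instead of taking a 
byte[][], and might choose the filter type per row.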

Please review this patch. Do I need to sign a CLA?

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-4184
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4184
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 2.0.9
>            Reporter: Emmeran Seehuber
>            Priority: Minor
>             Fix For: 2.0.10, 3.0.0 PDFBox
>
>         Attachments: lossless_predictor_based_imageencoding.patch, 
> pdfbox_support_16bit_image_write.patch, png16-arrow-bad-no-smask.pdf, 
> png16-arrow-bad.pdf, png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf
>
>
> The attached patch adds support to write 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvement when writing lossless images, as 
> the images are currently not efficiently encoded. E.g. you could use PNG 
> encodings to get a better compression (by adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff between speed and compression ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
