[ https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475470#comment-16475470 ]

Emmeran Seehuber commented on PDFBOX-4184:
------------------------------------------

Just got an idea in the shower ...
{code:java}
Benchmark                                   (zipLevel)   Mode  Cnt    Score   Error  Units
LosslessFactoryBenchmark.predictor                   3  thrpt    5  168.186 ± 1.884  ops/s
LosslessFactoryBenchmark.predictor                   6  thrpt    5  109.865 ± 2.022  ops/s
LosslessFactoryBenchmark.predictor                   9  thrpt    5   20.382 ± 0.432  ops/s
LosslessFactoryBenchmark.predictorBig                3  thrpt    5    2.617 ± 0.047  ops/s
LosslessFactoryBenchmark.predictorBig                6  thrpt    5    2.211 ± 0.029  ops/s
LosslessFactoryBenchmark.predictorBig                9  thrpt    5    1.627 ± 0.039  ops/s
LosslessFactoryBenchmark.predictorBigBytes           3  thrpt    5    2.219 ± 0.055  ops/s
LosslessFactoryBenchmark.predictorBigBytes           6  thrpt    5    1.880 ± 0.057  ops/s
LosslessFactoryBenchmark.predictorBigBytes           9  thrpt    5    1.454 ± 0.025  ops/s
LosslessFactoryBenchmark.rgbOnly                     3  thrpt    5  247.996 ± 7.758  ops/s
LosslessFactoryBenchmark.rgbOnly                     6  thrpt    5  128.242 ± 3.246  ops/s
LosslessFactoryBenchmark.rgbOnly                     9  thrpt    5   14.259 ± 0.339  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  3  thrpt    5    8.113 ± 0.290  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  6  thrpt    5    3.317 ± 0.059  ops/s
LosslessFactoryBenchmark.rgbOnlyBig                  9  thrpt    5    1.308 ± 0.025  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             3  thrpt    5    3.506 ± 0.066  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             6  thrpt    5    2.149 ± 0.070  ops/s
LosslessFactoryBenchmark.rgbOnlyBigBytes             9  thrpt    5    1.081 ± 0.019  ops/s
{code}
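Now the predictor is always faster at zip level 9 (e.g. 20.4 vs. 14.3 ops/s 
for the small image). It is still slower at the other zip levels, mostly by 
a modest margin; the largest gap is rgbOnlyBig at level 3 (8.1 vs. 2.6 ops/s). 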
[^lossless_predictor_based_imageencoding_v4.patch]

I would be fine with this, so no API change would be needed.
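
The comparison can be checked directly from the scores in the table above. The following is a small sketch that divides each predictor score by the matching rgbOnly score; the class name, array layout and variant labels are made up for illustration and are not part of the benchmark code. A ratio above 1 means the predictor variant had the higher throughput.

```java
import java.util.Locale;

final class BenchmarkRatioSketch {
    // Zip levels used in the benchmark, in column order of the arrays below.
    static final int[] LEVELS = {3, 6, 9};

    // Labels for the three benchmark variants (illustrative names only).
    static final String[] VARIANTS = {"small", "Big", "BigBytes"};

    // Scores in ops/s copied from the JMH table: one row per variant, one column per zip level.
    static final double[][] PREDICTOR = {
        {168.186, 109.865, 20.382}, // predictor
        {2.617, 2.211, 1.627},      // predictorBig
        {2.219, 1.880, 1.454},      // predictorBigBytes
    };
    static final double[][] RGB_ONLY = {
        {247.996, 128.242, 14.259}, // rgbOnly
        {8.113, 3.317, 1.308},      // rgbOnlyBig
        {3.506, 2.149, 1.081},      // rgbOnlyBigBytes
    };

    /** Throughput of the predictor variant relative to rgbOnly; above 1 means the predictor is faster. */
    static double ratio(int variant, int levelIndex) {
        return PREDICTOR[variant][levelIndex] / RGB_ONLY[variant][levelIndex];
    }

    public static void main(String[] args) {
        for (int v = 0; v < VARIANTS.length; v++) {
            for (int l = 0; l < LEVELS.length; l++) {
                System.out.printf(Locale.ROOT, "%-8s level %d: predictor/rgbOnly = %.2f%n",
                        VARIANTS[v], LEVELS[l], ratio(v, l));
            }
        }
    }
}
```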

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-4184
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4184
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 2.0.9
>            Reporter: Emmeran Seehuber
>            Priority: Minor
>             Fix For: 2.0.10, 3.0.0 PDFBox
>
>         Attachments: LoadGovdocs.java, 
> lossless_predictor_based_imageencoding.patch, 
> lossless_predictor_based_imageencoding_v2.patch, 
> lossless_predictor_based_imageencoding_v3.patch, 
> lossless_predictor_based_imageencoding_v4.patch, 
> pdfbox_support_16bit_image_write.patch, png16-arrow-bad-no-smask.pdf, 
> png16-arrow-bad.pdf, png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf
>
>
> The attached patch adds support for writing 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvement when writing lossless images, as 
> the images are currently not encoded efficiently. I.e. you could use PNG 
> encodings to get better compression (by adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff between speed and compression ratio. 
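
The predictor idea from the issue description can be illustrated without PDFBox. The following is a minimal sketch, not the code from the attached patches: it applies the PNG Sub filter (one of the per-row filters a PNG predictor stream may use) to 8 bit RGB rows and deflates the result with java.util.zip.Deflater. The class name, the helper names and the synthetic gradient image are assumptions made for illustration; a real encoder would pick a filter per row and would also have to handle 16 bit samples.

```java
import java.util.zip.Deflater;

final class PngSubPredictorSketch {

    /** Applies the PNG Sub filter (type 1) to each row and prefixes the row with its filter-type byte. */
    static byte[] filterSub(byte[] raw, int width, int height, int bytesPerPixel) {
        int stride = width * bytesPerPixel;
        byte[] out = new byte[height * (stride + 1)];
        int o = 0;
        for (int y = 0; y < height; y++) {
            out[o++] = 1; // filter type 1 = Sub
            int rowStart = y * stride;
            for (int x = 0; x < stride; x++) {
                int cur = raw[rowStart + x] & 0xFF;
                // Same component of the pixel to the left; 0 for the first pixel of a row.
                int left = x >= bytesPerPixel ? raw[rowStart + x - bytesPerPixel] & 0xFF : 0;
                out[o++] = (byte) (cur - left); // difference modulo 256
            }
        }
        return out;
    }

    /** Reverses filterSub, which shows that the filter step is lossless. */
    static byte[] unfilterSub(byte[] filtered, int width, int height, int bytesPerPixel) {
        int stride = width * bytesPerPixel;
        byte[] raw = new byte[height * stride];
        for (int y = 0; y < height; y++) {
            int in = y * (stride + 1) + 1; // skip the filter-type byte
            int rowStart = y * stride;
            for (int x = 0; x < stride; x++) {
                int left = x >= bytesPerPixel ? raw[rowStart + x - bytesPerPixel] & 0xFF : 0;
                raw[rowStart + x] = (byte) ((filtered[in + x] & 0xFF) + left);
            }
        }
        return raw;
    }

    /** Returns the number of bytes produced by deflating data at the given zip level. */
    static int deflatedSize(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    /** Synthetic 8 bit RGB gradient used only as test data for this sketch. */
    static byte[] gradient(int width, int height) {
        byte[] raw = new byte[width * height * 3];
        int i = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                raw[i++] = (byte) (x + y);
                raw[i++] = (byte) (2 * x + y);
                raw[i++] = (byte) (x + 3 * y);
            }
        }
        return raw;
    }

    public static void main(String[] args) {
        int width = 200, height = 100;
        byte[] raw = gradient(width, height);
        byte[] filtered = filterSub(raw, width, height, 3);
        for (int level : new int[] {3, 6, 9}) {
            System.out.println("level " + level
                    + ": raw -> " + deflatedSize(raw, level) + " bytes"
                    + ", Sub-filtered -> " + deflatedSize(filtered, level) + " bytes");
        }
    }
}
```

In a PDF, a stream filtered this way would additionally need /DecodeParms entries (/Predictor, /Colors, /BitsPerComponent, /Columns) so that a reader can undo the filter. How much the filter helps depends heavily on the image content; the gradient here is a best case.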


