[
https://issues.apache.org/jira/browse/PDFBOX-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960439#comment-13960439
]
John Hewson edited comment on PDFBOX-2007 at 4/4/14 9:35 PM:
-------------------------------------------------------------
{quote}
a few weeks ago I noticed that one optimization potential in getRGBImage()
would be to check whether the source domain = destination domain = 0...255.
This would save some of the math.
{quote}
This was my first thought too, because it is getRGBImage() itself which is slow.
I've added a fast path in [revision
1584914|https://github.com/apache/pdfbox/commit/258acece60d1b1d3cc4a14df37d3fcf1159ec61b]
for 8-bit images which don't use non-default Decode arrays or color masks,
i.e. the majority of images. On my machine I'm getting around 2.8s for the
trunk now vs 2.6s for version 1.8 :)
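To illustrate the idea (this is not the code from the revision above), here is a minimal sketch of such a fast path. All class and method names are hypothetical, and it assumes a default Decode range of [0, 1] per component, as for DeviceGray and DeviceRGB. When the samples are 8-bit, no color mask is present and Decode is the default, the source domain already equals the destination domain 0...255, so the per-sample interpolation can be skipped.

```java
import java.util.Arrays;

/** Hypothetical sketch of a Decode fast path; names are illustrative, not PDFBox API. */
final class DecodeFastPathSketch {

    /** True when every component's Decode range is the assumed default [0, 1]. */
    static boolean isDefaultDecode(float[] decode, int numComponents) {
        if (decode == null) {
            return true; // no Decode entry: the default applies
        }
        if (decode.length != numComponents * 2) {
            return false;
        }
        for (int i = 0; i < numComponents; i++) {
            if (decode[2 * i] != 0f || decode[2 * i + 1] != 1f) {
                return false;
            }
        }
        return true;
    }

    /** Maps raw 8-bit samples to 0..255 output, skipping the math when it would be a no-op. */
    static byte[] toOutputSamples(byte[] raw, int bitsPerComponent, float[] decode,
                                  int numComponents, boolean hasColorMask) {
        if (decode != null && decode.length != numComponents * 2) {
            throw new IllegalArgumentException("Decode needs two entries per component");
        }
        if (bitsPerComponent == 8 && !hasColorMask && isDefaultDecode(decode, numComponents)) {
            // Fast path: source domain 0..255 equals destination domain 0..255.
            return Arrays.copyOf(raw, raw.length);
        }
        if (bitsPerComponent != 8) {
            throw new UnsupportedOperationException("sketch handles 8-bit samples only");
        }
        // General path: interpolate each sample through its component's Decode range.
        // Color-mask handling is deliberately left out of this sketch.
        byte[] out = new byte[raw.length];
        for (int i = 0; i < raw.length; i++) {
            int c = i % numComponents;
            float dMin = decode == null ? 0f : decode[2 * c];
            float dMax = decode == null ? 1f : decode[2 * c + 1];
            float value = dMin + (raw[i] & 0xFF) * (dMax - dMin) / 255f;
            out[i] = (byte) Math.round(value * 255f);
        }
        return out;
    }
}
```

With an inverted Decode array such as {1, 0} the general path is still taken, so only images with unusual settings pay for the interpolation.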
Update: I should add that getRGBImage() is now very fast, under 1% of the
execution time. The remaining difference in performance between 1.8 and 2.0 is
now due to fromBGRtoRGB at 5.7% of execution time.
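For readers unfamiliar with this kind of step: the body of fromBGRtoRGB is not shown here, so the following is only a guess at the sort of work its name suggests, namely reordering the channels of packed 3-byte pixels. The class and method names are hypothetical. Because every pixel is touched once, the cost grows linearly with image size, which would be consistent with it showing up in a profile.

```java
/** Hypothetical sketch of a BGR-to-RGB channel swap; not the PDFBox implementation. */
final class ChannelSwapSketch {

    /** Swaps the first and third byte of each packed 3-byte pixel, in place. */
    static void swapBgrToRgb(byte[] pixels) {
        if (pixels.length % 3 != 0) {
            throw new IllegalArgumentException("expected packed 3-byte pixels");
        }
        for (int i = 0; i < pixels.length; i += 3) {
            byte blue = pixels[i];
            pixels[i] = pixels[i + 2]; // red moves to the front
            pixels[i + 2] = blue;      // blue moves to the back
        }
    }
}
```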
> Performance regression since PDFRenderer
> ----------------------------------------
>
> Key: PDFBOX-2007
> URL: https://issues.apache.org/jira/browse/PDFBOX-2007
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Affects Versions: 2.0.0
> Reporter: François Bernier
> Labels: perfomance, regression
> Attachments: PDFBOX-2007.pdf
>
>
> Hi,
> I have the following toy project where I use PDFBox:
> https://github.com/fbernier/taz-clj
> I've been using snapshot versions of PDFBox for quite a while. Since the
> recent move from RenderUtil#convertToImage to PDFRenderer#renderImage
> (this commit:
> https://github.com/fbernier/taz-clj/commit/47917d494f2a9a0999da7f36827c45145d4bb42c),
> there has been a significant performance regression. If I change the PDFBox
> dependency to 1.8.x, everything is good. Here are my benchmarks:
> PDFBox 1.8.x:
> Running 1m test @ http://127.0.0.1:8080/testing.pdf?page=1
>   4 threads and 4 connections
>   Thread Stats   Avg        Stdev      Max        +/- Stdev
>     Latency      208.98ms   58.27ms    391.43ms   52.08%
>     Req/Sec      4.63       1.73       8.00       62.88%
>   1224 requests in 1.00m, 72.34MB read
> Requests/sec: 20.40
> Transfer/sec: 1.21MB
> PDFBox 2.0.0:
> Running 1m test @ http://127.0.0.1:8080/testing.pdf?page=1
>   4 threads and 4 connections
>   Thread Stats   Avg        Stdev      Max        +/- Stdev
>     Latency      920.25ms   378.94ms   2.76s      91.38%
>     Req/Sec      0.80       0.40       1.00       80.17%
>   275 requests in 1.00m, 15.85MB read
> Requests/sec: 4.58
> Transfer/sec: 270.41KB
> I have not looked any further than this and have no more data to give you
> (yet).
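To put a number on the regression reported above, the figures can be compared directly. This is a small sketch with a hypothetical class name; the inputs are the Requests/sec and average latency values from the two benchmark runs.

```java
/** Hypothetical helper that compares the two benchmark runs quoted in the report. */
final class RegressionRatioSketch {

    /** Ratio of the first value to the second, e.g. old throughput over new. */
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        double throughputDrop = ratio(20.40, 4.58);     // Requests/sec, 1.8.x vs 2.0.0
        double latencyIncrease = ratio(920.25, 208.98); // avg latency in ms, 2.0.0 vs 1.8.x
        System.out.printf("throughput: %.2fx lower, latency: %.2fx higher%n",
                throughputDrop, latencyIncrease);
    }
}
```

Both ratios come out at roughly 4.4x to 4.5x, so throughput and latency tell the same story.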
--
This message was sent by Atlassian JIRA
(v6.2#6252)