[
https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048206#comment-14048206
]
John Hewson edited comment on PDFBOX-2126 at 6/30/14 9:47 PM:
--------------------------------------------------------------
OK, I took a look at Petr's optimisations and did some profiling of setClip.
The main performance issue was that processTextPosition is called for each
glyph, and in PageDrawer this calls setClip. However, the clipping path can't
actually change between the BT and ET operators, so we were calling setClip
needlessly 15,000 times or so. I added a new beginText() method to
PDFStreamEngine and overrode it in PageDrawer to set up the graphics for the
text. That took the time from 7 sec on my machine to about 4.8 sec. While
making these changes I removed the creation of TextPosition instances from
PDFStreamEngine and moved it into a new PDFTextStreamEngine class, because
we're interested in glyphs, not in text runs. This fits very well with the
refactoring I'm already doing in PDFBOX-2149.
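
To illustrate the idea of hoisting per-glyph setup into a hook that runs once
per text object, here is a minimal sketch. It is not the code from the commit:
SketchStreamEngine, SketchPageDrawer, the token list and the clipCalls counter
are all hypothetical stand-ins, and the counter merely represents the
expensive Graphics2D#setClip() call.

```java
import java.util.List;

// Hypothetical stand-in for a content-stream engine; NOT the PDFBox class.
abstract class SketchStreamEngine {
    // Walks a simplified token list: "BT", "ET", or anything else,
    // which is treated as a single glyph to show.
    final void process(List<String> tokens) {
        for (String token : tokens) {
            if (token.equals("BT")) {
                beginText();      // hook: called once per text object
            } else if (token.equals("ET")) {
                endText();
            } else {
                showGlyph(token); // called once per glyph
            }
        }
    }

    protected void beginText() { }

    protected void endText() { }

    protected abstract void showGlyph(String glyph);
}

// Hypothetical renderer; the counters exist only to make the effect visible.
class SketchPageDrawer extends SketchStreamEngine {
    int clipCalls = 0;
    int glyphsDrawn = 0;

    @Override
    protected void beginText() {
        // Assumption: the clip cannot change until the matching ET,
        // so setting it here once is enough for every glyph in between.
        clipCalls++; // stands in for the expensive Graphics2D#setClip()
    }

    @Override
    protected void showGlyph(String glyph) {
        glyphsDrawn++; // no clip work per glyph any more
    }
}
```

With this shape the number of clip set-ups grows with the number of BT/ET
pairs rather than with the number of glyphs.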
Next I applied Petr's optimisations: not cloning the clipping path in
PDGraphicsState, and checking in PageDrawer whether the clip actually needs to
be set. I called the method that performs this check setClip(). It saves
another 0.5 sec or so on my machine.
I made these changes in [r1606936|http://svn.apache.org/r1606936].
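
A rough sketch of the "only set the clip when it has changed" check follows.
It assumes the clipping path object is never cloned or mutated, so that a
reference comparison is enough, and that callers reset the cache by hand
whenever the Graphics2D transform changes. ClipCache and its methods are
hypothetical names for illustration, not the code in PageDrawer.

```java
import java.awt.Graphics2D;
import java.awt.Shape;

// Hypothetical helper, NOT part of PDFBox.
class ClipCache {
    // Sentinel meaning "nothing cached"; distinct from a null clip.
    private static final Object NONE = new Object();

    private Object lastClip = NONE;
    private int applied = 0;

    // Calls Graphics2D#setClip() only if this is a different object
    // from the one applied last time (identity, not equals()).
    void apply(Graphics2D graphics, Shape clip) {
        if (clip != lastClip) {
            graphics.setClip(clip);
            lastClip = clip;
            applied++;
        }
    }

    // Must be called by hand after the Graphics2D transform changes,
    // because the cache does not detect that case itself.
    void invalidate() {
        lastClip = NONE;
    }

    // Number of times setClip() was really invoked.
    int appliedCount() {
        return applied;
    }
}
```

Note that an equal but distinct Shape still triggers a new setClip() call,
which is why the approach depends on the path object not being cloned.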
> Optimize clipping
> -----------------
>
> Key: PDFBOX-2126
> URL: https://issues.apache.org/jira/browse/PDFBOX-2126
> Project: PDFBox
> Issue Type: Improvement
> Components: Rendering
> Affects Versions: 2.0.0
> Reporter: Petr Slaby
> Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf
>
>
> As already stated in a TODO comment in PageDrawer, the call to
> Graphics2D#setClip() is time- and memory-consuming. The attached patch
> optimizes clipping by calling Graphics2D#setClip() only if the clipping path
> has changed. The effect depends on the document; e.g. the attached one
> renders in 10.5 s without the optimization and in 5.5 s with it.
> The clipping has to be re-applied whenever the transform in Graphics2D
> changes. This is not explicitly checked for; rather, the implementation
> depends on the cached value being reset manually. Currently this is only
> needed in one place, when processing annotations (AcroForms). Also, the
> implementation relies on the clipping path object stored in PDGraphicsState
> never changing, so that a comparison using == can be used. This works fine,
> but needs a bit of awareness in future changes. To make the design
> cleaner, the clipping path could be made private to PDGraphicsState and thus
> really "immutable" from outside.