Jim, Andrea & java2d members, I am happy to announce an updated Pisces patch that is faster again:
- Patched Pisces vs OpenJDK Pisces (ref): ~2.5 to 4.5 times faster

  score small:
    1T  20   248,04%   247,90%   464,65%   248,04%  *253,49%*  232,64%  207,77%
    2T  40   276,49%   276,09%  1317,15%   279,32%  *308,52%*  251,96%  288,31%
    4T  80   295,18%   295,49%   629,06%   298,08%  *316,24%*  269,51%  181,64%

  score big:
    1T  20   356,13%   356,44%  1862,18%   356,47%  *360,04%*  345,63%  360,26%
    2T  40   413,56%   414,14%   350,96%   414,06%  *411,88%*  412,23%  385,51%
    4T  80   458,96%   459,48%   941,17%   459,68%  *467,40%*  425,12%  450,10%

- Patched Pisces vs Oracle JDK 8 (ductus): ~equal (1T), ~60% faster (2T),
  ~2 to 3 times faster (4T)

  score small:
    1T  20    94,02%    93,58%    61,96%    93,53%   *92,77%*   93,69%  128,83%
    2T  40   138,06%   137,95%   763,67%   140,09%  *157,44%*  102,14%  183,03%
    4T  80   179,10%   179,17%   494,78%   182,03%  *198,80%*  119,86%  176,89%

  score big:
    1T  20   122,67%   122,69%   112,98%   122,69%  *122,67%*  122,70%  122,23%
    2T  40   173,02%   173,17%   335,41%   173,50%  *178,99%*  160,51%  175,63%
    4T  80   325,52%   326,50%   574,24%   326,59%  *330,57%*  226,20%  321,69%

JAVA_OPTS="-server -XX:+PrintCommandLineFlags -XX:-PrintFlagsFinal -XX:-TieredCompilation"
JAVA_TUNING="-Xms128m -Xmx128m"

Full results:
http://jmmc.fr/~bourgesl/share/java2d-pisces/compareRef_Patch_2.ods
http://jmmc.fr/~bourgesl/share/java2d-pisces/patch_opt_05_05_20s.log
http://jmmc.fr/~bourgesl/share/java2d-pisces/ductus_tests_10s.log
http://jmmc.fr/~bourgesl/share/java2d-pisces/ref_test_long.log

Here is the updated Pisces patch:
http://jmmc.fr/~bourgesl/share/java2d-pisces/webrev-4/

Changes:
- PiscesCache: use rowAAStride[32][x0; x1; alpha sum(x)] to consume the alpha
  data directly instead of encoding / decoding RLE data
- fixed PiscesTileGenerator.getAlpha() to read rowAAStride directly and
  efficiently
- Renderer: the edges array is split into edges [CURX, SLOPE] / edgesInt
  [NEXT, YMAX, OR] to avoid float / int conversions (see the sketch after
  this list)
- added "monitors", i.e. custom CPU / statistics probes, to gather usage
  statistics and CPU timings with minimal overhead: enable them with the
  PiscesConst.doMonitors flag
- minor tweaks
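To illustrate that change, here is a minimal, self-contained sketch (my own
illustration, not the patch code: the class name, field offsets and initial
capacity are assumptions) of keeping each edge's float fields and int fields
in two parallel arrays so the hot loops never cast between float and int:

    import java.util.Arrays;

    // Sketch of the edges / edgesInt split (illustrative only, not the patch):
    // each edge stores its float fields (current X, slope) in a float[] and
    // its int fields (next edge index, yMax, orientation) in an int[], so
    // stepping an edge or walking a bucket list never converts float <-> int.
    final class EdgeArrays {
        // offsets of the float fields of one edge
        static final int CURX = 0, SLOPE = 1, FLOAT_FIELDS = 2;
        // offsets of the int fields of one edge
        static final int NEXT = 0, YMAX = 1, OR = 2, INT_FIELDS = 3;

        private float[] edges    = new float[FLOAT_FIELDS * 1024];
        private int[]   edgesInt = new int[INT_FIELDS * 1024];
        private int     numEdges = 0;

        /** Adds an edge and returns its index. */
        int addEdge(float curX, float slope, int next, int yMax, int orientation) {
            final int e = numEdges++;
            if ((e + 1) * FLOAT_FIELDS > edges.length) {
                // grow both arrays in lockstep so one index addresses both parts
                edges    = Arrays.copyOf(edges,    edges.length * 2);
                edgesInt = Arrays.copyOf(edgesInt, edgesInt.length * 2);
            }
            edges[e * FLOAT_FIELDS + CURX]  = curX;
            edges[e * FLOAT_FIELDS + SLOPE] = slope;
            edgesInt[e * INT_FIELDS + NEXT] = next;
            edgesInt[e * INT_FIELDS + YMAX] = yMax;
            edgesInt[e * INT_FIELDS + OR]   = orientation;
            return e;
        }

        /** Advances the current X of edge e by one scanline: float math only. */
        float stepEdge(int e) {
            return edges[e * FLOAT_FIELDS + CURX] += edges[e * FLOAT_FIELDS + SLOPE];
        }

        /** Follows the bucket linked list from edge e: int reads only. */
        int nextEdge(int e) {
            return edgesInt[e * INT_FIELDS + NEXT];
        }
    }

Since the two arrays always grow together, a single edge index addresses the
float part and the int part of the same edge.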
Remaining tasks:
- basic clipping algorithm to handle trivial shape or line rejection when no
  affine transform, or only a simple one (scaling), is in use
- enhance curve / round caps and joins handling to take the spatial resolution
  into account: for example, round caps covering less than 2 AA pixels are
  visually useless and counter-productive (CPU cost); to be discussed
- cleanup / indentation (still in progress)

Jim, I found a few bugs / mistakes related to bbox + 1 (PiscesCache) and the
alpha array (+ 1). I agree that pixel coordinates / edges / crossings should
be converted to integers in a uniform manner (consistent and more accurate);
to be discussed, and still to be fixed.

Finally, I updated MapBench & MapDisplay:
http://jmmc.fr/~bourgesl/share/java2d-pisces/MapBench/

New features:
- each test runs for at least 5 s (configurable as the first CLI argument) to
  ensure enough runs to compute accurate averages and statistics
- scaling / translation tests (affineTransform); clipping tests in progress
- monitors are flushed after each test

Jim, a few comments below:

2013/5/4 Jim Graham <james.gra...@oracle.com>

>> I am perplexed and I am going to check the Pisces code against the
>> approach you gave.
>
> If for no other reason than to make sure that there aren't two parts of
> the system trying to communicate with different philosophies. You don't
> want a caller to hand a closed interval to a method which treats the
> values as half-open, for instance. If the rounding is "different, but
> consistent", then I think we can leave it for now and treat it as a
> future refinement to check if it makes any practical difference and
> correct. But, if it shows up a left-hand-not-talking-to-right-hand bug,
> then that would be good to fix sooner rather than later.

As said before, the minor bugs are:
- alpha array (Renderer) handling seems to go past its upper limit: I need to
  clear it up to pix_to + 1 + 1 (pix_to inclusive)!
- edge / crossing coordinate rounding: fix the bias to 0.5, i.e. ceil(x - 0.5)

> I think it is OK to focus on your current task of performance and memory
> turmoil, but I wanted to give you the proper background to try to
> understand what you were reading primarily, and possibly to get you
> interested in cleaning up the code as you went as a secondary
> consideration.

Agreed. Could you explain a bit how the renderer's scanline algorithm handles
crossings in the next() and _endRendering() methods?

>> If every coordinate has already been biased by the -0.5 then ceil is
>> just the tail end of the rounding equation I gave above.
>>
>> That's not the case => buggy: x1, y1 and x2, y2 are directly the point
>> coordinates as float values.
>
> Then using the ceil() on both is still consistent with half-open
> intervals, it just has a different interpretation of where the sampling
> cut-off lies within the subpixel sample. When you determine where the
> "crossings" lie, then it would be proper to do the same ceil(y +/- some
> offset) operation to compute the first crossing that is included and the
> first crossing that is excluded.

Ok.

> In this case it appears that the offset is just 0.0 which doesn't really
> meet my expectations, but is a minor issue. These crossings then become
> a half-open interval of scanline indices in which to compute the value.

To be fixed soon.
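To make the rounding convention concrete, here is a small sketch (again my own
illustration, not the Pisces code: the method names and example values are
assumptions) of deriving a half-open range of scanline indices from a float
y-range with the ceil(y - 0.5) form discussed above, so that every scanline is
sampled at its sub-pixel center and the same operation yields both the first
included and the first excluded crossing:

    // Half-open crossing ranges with a 0.5 sampling bias (sketch only).
    final class CrossingRange {

        /** Index of the first scanline whose center (i + 0.5) lies at or above y0. */
        static int firstCrossing(float y0) {
            return (int) Math.ceil(y0 - 0.5f);
        }

        /** Exclusive end: first scanline whose center lies at or above y1. */
        static int crossingEnd(float y1) {
            return (int) Math.ceil(y1 - 0.5f);
        }

        public static void main(String[] args) {
            // An edge covering y in [1.2, 3.7) crosses the sample centers 1.5,
            // 2.5 and 3.5, i.e. the half-open index range [1, 4).
            final float y0 = 1.2f, y1 = 3.7f;
            for (int i = firstCrossing(y0); i < crossingEnd(y1); i++) {
                System.out.println("scanline " + i + " sampled at y = " + (i + 0.5f));
            }
        }
    }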
>> I think rounding errors can lead to pixel / shape rasterization
>> deformations ... ?
>
> As long as the test is "y < _edges[ptr+YMAX]" then that is consistent
> with a half-open interval sampled at the top of every sub-pixel region,
> isn't it?

Ok.

> I agree with the half-open part of it, but would have preferred a
> "center of sub-pixel" offset for the actual sampling.

Agreed, again.

>> I am a bit embarrassed to verify the maths performed in
>> ScanLineIterator.next(), which uses the edges and edgeBucketCounts
>> arrays ... could you have a look?
>> Apparently, it uses the following for loop that respects the semi-open
>> interval philosophy:
>>     for (int i = 0, ecur, j, k; i < count; i++) {
>>     ...
>
> I'll come back to that at a later time, but it sounds like you are
> starting to get a handle on the design here.

Thanks.

>>     boolean endRendering() {
>>         // TODO: perform shape clipping to avoid dealing with segments
>>         // out of the bounding box
>>
>>         // Ensure shape edges are within bbox:
>>         if (edgeMinX > edgeMaxX || edgeMaxX < 0f) {
>>             return false; // undefined X bounds or negative Xmax
>>         }
>>         if (edgeMinY > edgeMaxY || edgeMaxY < 0f) {
>>             return false; // undefined Y bounds or negative Ymax
>>         }
>>
>> I'd use min >= max since if min==max then I think nothing gets
>> generated as a result of all edges having both the in and out crossings
>> on the same coordinate. Also, why not test against the clip bounds
>> instead? The code after that will clip the edgeMinMaxXY values against
>> the boundsMinMax values. If you do this kind of test after that
>> clipping is done (on spminmaxxy) then you can discover if all of the
>> coordinates are outside the clip or the region of interest.
>>
>> I tried here to perform a few "fast" checks before doing the float to
>> int conversions (costly because millions of them are performed): I
>> think it can still be improved: the edgeMinX > edgeMaxX test only
>> ensures that edgeMinX is defined and that both are positive!
>
> endRendering is called once per shape. I don't think moving tests above
> its conversions to int will affect our throughput compared to
> calculations done per-vertex.

Agreed, but having a small gain for each shape is still interesting when
thousands or millions of shapes are rendered!

> I'm going to have to review the rest of this email at a future time, my
> apologies...

Looking forward to hearing from you soon and to more feedback on the latest
changes.

Regards,
Laurent