Re: [OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3

Laurent Bourgès Fri, 10 Jul 2015 14:09:10 -0700

Jim,

Here is some news before I leave for a long weekend:


> It uses FloatMath.ceil(), which internally uses the ceil_int()
> implementation for performance.
> I agree it should use ceil_int() directly to be more explicit.

I did it and ran some tests with endRendering() disabled to evaluate the
complete pipeline, including the addLine cost:
- floating point:
1st map: 35ms/115ms = 30%
complex map: 180ms/780ms = 25%
- fixed point:
1st map: 40ms/120ms
complex map: 200ms/800ms

I can send you detailed results if you want.

As this test only compares the impact of my changes to addLine(), the small
slowdown is mainly due to its increased complexity: many ceil / floor +
math ops.
So improving addLine and curve decimation addresses about 25% of the
rendering time for complex maps.
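
For readers following along, the kind of fast ceil / floor that addLine relies on can be sketched as below. This is only an illustration: the class and method names are hypothetical, it is not the FloatMath source, and it assumes finite inputs that fit in the int range (no NaN or overflow handling).

```java
// Hypothetical sketch, not the actual FloatMath code.
// Assumes finite inputs within the int range.
final class CeilSketch {

    // Ceiling of a float returned as an int, without calling Math.ceil().
    static int ceilInt(final float a) {
        final int intPart = (int) a; // the cast truncates toward zero
        // Truncation already is the ceiling for exact integers
        // and for negative non-integers.
        return (a <= intPart) ? intPart : intPart + 1;
    }

    // Floor of a float returned as an int, same idea mirrored.
    static int floorInt(final float a) {
        final int intPart = (int) a;
        return (a >= intPart) ? intPart : intPart - 1;
    }
}
```

The point of such a variant is to replace a double-precision library call by a cast and a compare; whether it actually wins depends on the JIT and the hardware.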

>> Another technique to try would be to use longs which would involve a
64-bit shift to get the integer part, but there is already a 32-bit shift
to add the error overflow anyway.

I quickly tried getLong/putLong, but packing/unpacking the integers seems
slower, not faster (~ 3%).

I will send you that Renderer variant next week so you can have a look.
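
To make the long-based idea concrete, here is a minimal sketch of packing an integer part and a 32-bit fraction into one long, so that a single 64-bit add carries the fraction overflow into the integer part. The names are hypothetical and this is not the Renderer variant itself, only the arithmetic it builds on.

```java
// Hypothetical sketch of 32.32 packing in a long; not the Renderer code.
final class PackSketch {

    // High 32 bits: integer part; low 32 bits: fraction (treated as unsigned).
    static long pack(final int hi, final int lo) {
        return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
    }

    // Integer part: arithmetic 64-bit shift keeps the sign.
    static int high(final long packed) {
        return (int) (packed >> 32);
    }

    // Fraction: the narrowing cast keeps the low 32 bits.
    static int low(final long packed) {
        return (int) packed;
    }
}
```

For example, adding pack(0, 0xC0000000) (0.75) to pack(1, 0x80000000) (1.5) yields a value whose high() is 2: the carry out of the fraction is handled by the long addition, at the price of the extra shift and cast each time the integer part is read.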

> I may try as a last chance if removing Unsafe usage is not faster.
> I really like this approach as it will remove a lot of code = Unsafe
> usage + OffHeapEdgeArray + dispose / cleaner thread.

> Moreover, hotspot may optimize more such normal array accesses than
> Unsafe calls (intrinsics); however, it may also introduce array bound
> checks ...

I made that variant too: it is about 1% slower than Unsafe, not faster.
However, the code is a lot more readable, and the performance difference is
too small to justify using Unsafe (and I experienced many segfaults while
making changes...).
I also tested cache-line (32 per edge) and page-size (4k) alignment, without
any gain on the Unsafe variant.

Bounds checks are probably causing the minor slowdown, but it is safer.
I will send you a webrev soon so you can see the changes.

A few ideas to discuss:
1/ I now wonder if the gridding = ceil (x/y - 0.5) should be done
differently: why not apply the -0.5 offset to the points before curve
decimation or adding lines? It may save a lot of subtractions:
addLine (x1,y1,x2,y2) implies 4 subtractions, whereas lineTo (x2,y2) only
needs to adjust the last point.
Likewise for curve decimation, shifting the points may help.

- Do you know if breakCurveAndAddLines (quad or cubic) really takes the
supersampling scale into account, so as to generate only the segments
needed and no more?

- I use fixed-point (32.32 + error) as you did, but it may be too precise:
the slope, bumpx and error could be determined directly from the integer
coordinates of the starting / ending points = ceil (x1 - 0.5), ceil (y - 0.5)
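
To illustrate idea 1/, here is a rough sketch comparing the two places where the -0.5 offset can be applied. Everything in it is hypothetical (class, method names, array layout) and it ignores the subpixel scaling done by the real renderer; it only shows that shifting each point once when it enters the path gives the same gridded edges as subtracting four times per edge.

```java
// Hypothetical sketch of idea 1/; not the Renderer code.
final class GridSketch {

    static int ceilInt(final float v) {
        return (int) Math.ceil(v);
    }

    // Per-edge gridding: 4 subtractions for every addLine-like call.
    static int[] gridPerLine(float x1, float y1, float x2, float y2) {
        return new int[] {
            ceilInt(x1 - 0.5f), ceilInt(y1 - 0.5f),
            ceilInt(x2 - 0.5f), ceilInt(y2 - 0.5f)
        };
    }

    // Path-level gridding: each point is shifted once (2 subtractions),
    // then reused as the end of one edge and the start of the next.
    static int[][] gridPath(final float[] xs, final float[] ys) {
        final int[][] edges = new int[xs.length - 1][];
        float px = xs[0] - 0.5f;
        float py = ys[0] - 0.5f;
        for (int i = 1; i < xs.length; i++) {
            final float cx = xs[i] - 0.5f;
            final float cy = ys[i] - 0.5f;
            edges[i - 1] = new int[] {
                ceilInt(px), ceilInt(py), ceilInt(cx), ceilInt(cy)
            };
            px = cx;
            py = cy;
        }
        return edges;
    }
}
```

For a polyline of n points this is 2n subtractions instead of 4(n - 1); whether that is measurable next to the ceil / floor cost is exactly what would need benchmarking.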

Any advice?

Cheers,
Laurent
