Re: [OpenJDK Rasterizer] AWT & gcc 4.8 optimization options

Laurent Bourgès Thu, 21 Jan 2016 00:48:52 -0800

Sergey,

>> So it looks like scalar operations on a vector (4), i.e. vectorization
>> should be applicable.
>
>
> yes, I think so.


I googled a bit and it seems tricky to implement alpha blending with SSE2,
but many projects succeeded by writing SSE2 intrinsics directly!

>> Maybe the conditions (pathA > 0) && (pathA < 0xff) are a bigger penalty
>> as they can not be easily predicted (but may happen often).
>> Sometimes it is faster to perform useless math operations without
>> branching (gpu approach).
>>
>> Do you have other ideas to make it faster? It represents 30% of the
>> ellipse fill test (huge ellipses).
>> I noticed that larger tiles (64x64) are a bit faster (larger tile width
>> / height, fewer JNI calls).
>
>
> I just commented out some of the code inside this method and checked the
> performance. It seems that simple code like:
> inloop->readBytes->decodeRGB->encodeBytes->saveBytes is quite fast. But
> if some branches/multiplications are added after decodeRGB then the code
> becomes really slow (x10 slower on my system). This is expected because we
> perform a huge number of multiplications, but if I try to do the same math
> standalone (without byte decoding) then the result is fast too. So it seems
> that we are slow because of mixing byte reading/branches/multiplication.
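
To make the "useless math without branching" idea concrete, here is a minimal sketch in C. It is not the OpenJDK MaskFill loop: it assumes a single 8-bit channel, a constant source value, and an 8-bit coverage mask, and the names div255, fill_branchy and fill_branchless are hypothetical. Because the division by 255 is exactly rounded, a mask value of 0 leaves dst unchanged and 0xff yields src, so both variants should produce the same output.

```c
#include <stddef.h>
#include <stdint.h>

/* Exactly rounded x / 255 for x in [0, 255*255], using shifts and adds only. */
static inline uint32_t div255(uint32_t x)
{
    uint32_t t = x + 128u;
    return (t + (t >> 8)) >> 8;
}

/* Branchy variant: special-cases fully transparent and fully opaque coverage. */
static void fill_branchy(uint8_t *dst, const uint8_t *mask, uint8_t src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t a = mask[i];
        if (a == 0) {
            continue;               /* nothing to draw */
        }
        if (a == 0xff) {
            dst[i] = src;           /* plain copy */
            continue;
        }
        dst[i] = (uint8_t)div255(src * a + dst[i] * (0xffu - a));
    }
}

/* Branchless variant: always blends, even when the result is trivially
 * dst (a == 0) or src (a == 0xff). */
static void fill_branchless(uint8_t *dst, const uint8_t *mask, uint8_t src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t a = mask[i];
        dst[i] = (uint8_t)div255(src * a + dst[i] * (0xffu - a));
    }
}
```

Whether the branchless loop is actually faster would have to be measured; it mainly gives the compiler a loop body without data-dependent control flow, which is usually a precondition for auto-vectorization.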

For RGBA, it seems possible to:
- compute A+G and R+B together (2×16bits) to double the throughput
- use bit shifts instead of mul / div

Could you try implementing such variants ?

>> Should I try (as I did in the past) to implement the MaskFill in Java to
>> benefit from hotspot optimizations (like Marlin) ?
>
>
> It will be interesting. I remember that someone already tried to do the
> same, but I do not remember the result. Probably Jim can suggest something.

I implemented alpha blending in Java last year (using a custom composite
operator hack):
http://mail.openjdk.java.net/pipermail/2d-dev/2014-August/004751.html

I could try optimizing my Java implementation soon...

Cheers,
Laurent
