Re: [OpenJDK Rasterizer] Marlin #4

2016-12-21 Thread Laurent Bourgès
Hi Sergey, thank you for looking at this problem. I confirm that your simple patch improves the performance on my laptop (Ubuntu 16.04, gcc 5.4 as yours) with an Intel i4700 CPU when I run the ellipse JMH test. - ojdk9 without patch: Benchmark (size) Mode Cnt Score Error Units El

Re: [OpenJDK Rasterizer] Marlin #4

2016-12-21 Thread Sergey Bylokhov
Hi, Laurent. Can you please check the next patch: == diff -r 8a61c000a194 make/lib/Awt2dLibraries.gmk --- a/make/lib/Awt2dLibraries.gmk Tue Dec 20 09:52:14 2016 -0800 +++ b/make/lib/Awt2dLibraries.gmk Wed Dec 21 17:33:36 2016 +0300 @@ -222,6 +222,7 @@ # applies to debug builds.

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-23 Thread Jim Graham
Hi Laurent, On 11/23/15 9:02 AM, Laurent Bourgès wrote: I know that Marlin is slightly slower than ductus for shape size ~ 20: Ductus seems to use 16x16 blocks whereas Marlin uses 32x32 tiles so the new RLE approach is not in use (raw encoding) i.e. lots of zero-fill / array copy operations. Wh
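
As a rough illustration of the raw-versus-RLE trade-off mentioned in this message, here is a minimal sketch of run-length encoding one row of alpha coverage values. It is not Marlin's actual encoding: the class name `AlphaRowRle`, the `(value, runLength)` pair layout and both method names are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical illustration only: not the MarlinCache implementation.
final class AlphaRowRle {

    // Encodes one row of alpha values as (value, runLength) pairs.
    static int[] encode(int[] alphaRow) {
        int[] out = new int[alphaRow.length * 2]; // worst case: every run has length 1
        int n = 0;
        int i = 0;
        while (i < alphaRow.length) {
            int value = alphaRow[i];
            int start = i;
            while (i < alphaRow.length && alphaRow[i] == value) {
                i++;
            }
            out[n++] = value;
            out[n++] = i - start;
        }
        return Arrays.copyOf(out, n);
    }

    // Expands (value, runLength) pairs back into a raw row of the given width.
    static int[] decode(int[] runs, int width) {
        int[] row = new int[width];
        int pos = 0;
        for (int k = 0; k < runs.length; k += 2) {
            Arrays.fill(row, pos, pos + runs[k + 1], runs[k]);
            pos += runs[k + 1];
        }
        return row;
    }
}
```

A row with long constant spans shrinks to a handful of pairs, whereas a short or noisy row can take up to twice the raw size, which is one reason a raw copy may be preferable for small shapes.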

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-23 Thread Jim Graham
Hi Laurent, On 11/23/15 1:35 PM, Laurent Bourgès wrote: It seems you are right: there is a potential remaining failure! I tested my code in CrashTest but it passed because the off-heap array grows exponentially, i.e. the mentioned case never happened! Yes, I believe that the growth algorithms make this a

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-23 Thread Laurent Bourgès
Hi Jim, Sorry I sent the message partially edited by mistake. > The point is that the hard failure is a condition of when we need more than we can provide, not when we "already have" more than we can provide. needSize should cause the hard failure, not the current size. And if needSize is going
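
The point quoted above (fail on the requested size, not on the current size) can be sketched as follows. This is a hedged example, not the ArrayCache code under review: the class `GrowthPolicy`, the method `newSize`, the doubling factor and the `Integer.MAX_VALUE` limit are all assumptions chosen for illustration.

```java
// Hypothetical sketch; the real ArrayCache code is not shown in this thread.
final class GrowthPolicy {

    // Assumed upper bound on the array size.
    static final long MAX_SIZE = Integer.MAX_VALUE;

    // Returns the new capacity for an array of curSize that must hold needSize.
    static long newSize(long curSize, long needSize) {
        if (needSize < 0L || needSize > MAX_SIZE) {
            // Hard failure depends on the requested size, not on the current one.
            throw new ArrayIndexOutOfBoundsException(
                    "array exceeds maximum capacity: " + needSize);
        }
        long grown = curSize * 2L; // assumed growth factor
        if (grown < needSize) {
            grown = needSize;
        }
        if (grown > MAX_SIZE) {
            grown = MAX_SIZE; // clamp the over-allocation instead of failing
        }
        return grown;
    }
}
```

With this shape, an array that is already close to the limit can still grow to exactly the limit, and the exception is raised only when the caller genuinely needs more than can be provided.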

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-23 Thread Laurent Bourgès
Hi Jim, >> I'm code reading now: >> >> ArrayCache.java, line 205 - should that be needSize there? Also, >> should these tests be > or >=? >> >> I wanted to limit the size to 2M (Integer.MAX_VALUE) but it wanted 2 >> passes: first, return 2M, then if more needed, fail ! >> If prefer us

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-23 Thread Laurent Bourgès
Jim, Here are a few answers to several questions from last Friday's sprint: > I ran a bunch of tests on 4.2 and saw no issues and the performance looked > good. There were still some things that Ductus was faster on, but I just > did a brief run of a few tests I cobbled together so I don't know

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Laurent Bourgès
Jim, I am ok with your changes to use the new Unsafe and all the others too. PS: I recently added the enableLogs flag to make Marlin silent, mimicking Phil's approach to disabling stdout logs 6 months ago. I am going to sleep now. Good luck & good night, Laurent Le 20 nov. 2015 23:53, "Jim Graham" a

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Phil Race
OK. So it is fine as you have it. -phil. On 11/20/2015 01:11 PM, Jim Graham wrote: On 11/20/15 12:53 PM, Phil Race wrote: On 11/20/2015 08:50 AM, Jim Graham wrote: Here is the webrev for the remaining pre-integration tasks below. This includes: - turning off Marlin logging static final b

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Jim Graham
On 11/20/15 12:53 PM, Phil Race wrote: On 11/20/2015 08:50 AM, Jim Graham wrote: Here is the webrev for the remaining pre-integration tasks below. This includes: - turning off Marlin logging static final boolean enableLogs = false; 34 // enable Logger 35 static final boolean u

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Laurent Bourgès
Jim, It seems ok. Laurent On 20 Nov 2015 17:50, "Jim Graham" wrote: > > Here is the webrev for the remaining pre-integration tasks below. This includes: > > - turning off Marlin logging > - switching to Marlin as the default renderer > - adding a flag to print out which renderer is used on

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Jim Graham
Here is the webrev for the remaining pre-integration tasks below. This includes: - turning off Marlin logging - switching to Marlin as the default renderer - adding a flag to print out which renderer is used on startup webrev: http://cr.openjdk.java.net/~flar/Marlin/Defaults/webrev.00/ Please

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Jim Graham
I will be pushing this code cleanup changeset momentarily followed by one more push to accomplish all of the changing of the various defaults... Second code cleanup task - switching to jdk.internal.misc.Unsafe... webrev: http://cr.openjdk.java.net/~flar/Marlin/NewUnsafeClass.0/

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Phil Race
On 11/20/2015 08:50 AM, Jim Graham wrote: Here is the webrev for the remaining pre-integration tasks below. This includes: - turning off Marlin logging static final boolean enableLogs = false; 34 // enable Logger 35 static final boolean useLogger = enableLogs && MarlinProperties

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Jim Graham
I will be pushing this momentarily along with a few other pushes to accomplish the list of pre-integration changes I mentioned below... First cleanup push is using the blessed modifier order script. webrev: http://cr.openjdk.java.net/~flar/Marlin/BlessedModifiers/webrev.4.3/ There should be no

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-20 Thread Jim Graham
Hi Laurent, I ran a bunch of tests on 4.2 and saw no issues and the performance looked good. There were still some things that Ductus was faster on, but I just did a brief run of a few tests I cobbled together so I don't know how representative they are. Marlin beat Ductus on a number of te

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-19 Thread Laurent Bourgès
Jim, > My goal is to be able to publish a webrev to the 2D mailing list by tomorrow so hopefully I'll see a diff soon. Excellent news! It would be nice to have a webrev against the GR forest and also a webrev against 4.2 to streamline reviewing. I sent you a webrev against the GR forest on my

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-19 Thread Laurent Bourgès
One more thing: Did you agree with phil to make Marlin the default renderer for both OpenJDK and Oracle JDK (closed source) to obtain the maximum exposure very early ? Laurent

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-19 Thread Laurent Bourgès
Jim, Here is the new webrev Marlin #4.3: http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.3/ Changes: - MarlinCache: fillRLE only: I clear both arrays (alphaRow...) inline as only a few values are expected => 10% better! - Fixed minor bugs, unused imports and System.out.println calls ... - ad

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-19 Thread Jim Graham
Hi Laurent, My goal is to be able to publish a webrev to the 2D mailing list by tomorrow so hopefully I'll see a diff soon. It would be nice to have a webrev against the GR forest and also a webrev against 4.2 to streamline reviewing. I may do some pushes to the GR-forest on your behalf ove

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-19 Thread Jim Graham
Yes... ...jim On 11/19/15 2:01 PM, Laurent Bourgès wrote: One more thing: Did you agree with phil to make Marlin the default renderer for both OpenJDK and Oracle JDK (closed source) to obtain the maximum exposure very early ? Laurent

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-18 Thread Laurent Bourgès
Hi Jim, Good to see it is moving forward. > I am going to move forward with intent to get this version 4.2 into the client repos as the version we will go into Feature Complete milestone with. Let me know if there is a more recent version I should be looking at. I will publish a new webrev asap

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-18 Thread Jim Graham
Hi Laurent, I am going to move forward with intent to get this version 4.2 into the client repos as the version we will go into Feature Complete milestone with. Let me know if there is a more recent version I should be looking at. I'm about to do some test builds and check performance and ru

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-16 Thread Jim Graham
Hi Laurent, That is my number one focus this week, including doing a test build with it and running my own tests. The big fly in the ointment, though, is that I've recently hosed my JDK build environment for Windows (I normally work on the FX source and only rarely build the JDK sources) and

Re: [OpenJDK Rasterizer] Marlin #4

2015-11-11 Thread Laurent Bourgès
Jim, Just a reminder: could you review the last marlin patch 4.2 (10/19) ? > >> 1. Do you prefer I send you another webrev including my last changes ? > > > > > > Let me look through the latest webrev first. > > Ok. FYI new changes are very small. If you want, I can send you an up-to-date webrev

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-30 Thread Laurent Bourgès
Hi Jim, >> 1. Do you prefer I send you another webrev including my last changes ? > > > Let me look through the latest webrev first. Ok. FYI new changes are very small. I started writing the CrashTest class testing several corner cases with huge images & paths to force growable arrays to resize

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-30 Thread Phil Race
I think making it *the* default would certainly shake out problems better and faster. Internal testing is mostly done on Oracle JDK builds, not OpenJDK builds. And if you want to change this from ductus to marlin *after* feature freeze you will encounter a lot more questions (push back due to inv

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-30 Thread Jim Graham
One question on the integration... On 10/29/15 4:22 AM, Laurent Bourgès wrote: integration into jdk9 forest: Integration Due:2015-11-27 What is the plan according to you ? Do we want to switch the flag to make Marlin the default before then? We should probably at least make it the default Op

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-30 Thread Jim Graham
Hi Laurent, On 10/29/15 4:22 AM, Laurent Bourgès wrote: 1. Do you prefer I send you another webrev including my last changes ? Let me look through the latest webrev first. 2. I will be busy in november so I would like to anticipate Marlin That's unfortunate as now that JavaOne is over I ha

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-29 Thread Laurent Bourgès
Jim, just a reminder on my last patch waiting for review. Meanwhile I improved a bit the array cleanup in the RLE case and got some gains:
Test                         Threads  Ops  Med     Pct95   Avg     StdDev  Min     Max     TotalOps
CircleTests.ser              1        170  61.993  62.304  62.018  0.159   61.687  62.667  170
*EllipseTests-fill-false.ser * *1*

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-19 Thread Laurent Bourgès
Hi Jim, Here is the new webrev: http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.2/ I added the OffHeapArray class used by Renderer and now by MarlinCache to store rowAAChunk data. Moreover I performed other small optimizations (heuristics, Renderer.addLine() split in 2 methods) and many ben

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-12 Thread Jim Graham
Hi Laurent, These are great results! And they are much easier to read with the tables (which seem to get lost in my reply, oops!). If it is just the dashing results I can believe that as Ductus does a pretty good job of minimizing the number of segments in its stroked output paths. The los

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-12 Thread Laurent Bourgès
Hi Jim, Below is the webrev I prepared last Saturday night. However, I have made progress since then: I inlined a few methods and now use Unsafe for rowAAChunk storage (to save a few percent by avoiding bounds checks). *So please, just have a look to see the new hybrid approach but do not make a full revi

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-09 Thread Jim Graham
Hi Laurent, I've been looking at it a little lately. One thing that occurred to me is that the 2 strategies - RLE vs uncompressed - might be easier to follow and manage if they were broken out into separate classes:
MarlinCache
 +--- MarlinRLECache
 +--- MarlinUncompressedCache
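
One possible reading of the split suggested in this message, as a minimal sketch. The three class names come from the message itself, but reading the tree as inheritance, the `storedLength` method, the `forWidth` factory and its `rleMinWidth` threshold are all assumptions for illustration, not the design that was eventually adopted.

```java
// Hypothetical sketch of the suggested split; method names are assumptions.
abstract class MarlinCache {

    // Number of int slots this strategy would need to store one alpha row.
    abstract int storedLength(int[] alphaRow);

    // Assumed heuristic: use RLE only for rows at least rleMinWidth wide.
    static MarlinCache forWidth(int width, int rleMinWidth) {
        return (width >= rleMinWidth)
                ? new MarlinRLECache()
                : new MarlinUncompressedCache();
    }
}

final class MarlinRLECache extends MarlinCache {
    @Override
    int storedLength(int[] alphaRow) {
        int runs = 0;
        for (int i = 0; i < alphaRow.length; i++) {
            if (i == 0 || alphaRow[i] != alphaRow[i - 1]) {
                runs++; // a new run starts whenever the value changes
            }
        }
        return 2 * runs; // one (value, length) pair per run
    }
}

final class MarlinUncompressedCache extends MarlinCache {
    @Override
    int storedLength(int[] alphaRow) {
        return alphaRow.length; // raw copy: one slot per pixel
    }
}
```

Keeping the choice of strategy in a single factory method means the calling code never has to branch on the encoding itself.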

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-09 Thread Laurent Bourgès
Jim, FYI I worked on improving Marlin patch #4 to send you asap a new webrev: it will be simpler with only 2 variants (uncompressed and RLE+blockFlags). I also fixed a bug in the array cache related to crossings (ptrLen != ptrEnd) in Renderer.endRendering () and wrote a test that makes all cached

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-06 Thread Jim Graham
We should also be wary of compiler options that are a win on one processor family and a loss on another. Anything that schedules instructions may be specific to a particular generation of CPUs, for instance. Or for i5 vs i7 vs M(obile)... ...jim On 10/2/15 9:10 AM, L

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-02 Thread Laurent Bourgès
Sergey, thanks for the information: I tried your gcc options on my Ubuntu 14.04 (gcc v4.8.4) and it is actually >> slightly faster: 10% on my fill ellipse test (450ms vs 490ms). >> > > I tested with your JMH test, and the difference became bigger on 1400 size. > Interesting; I will try too. > >> Do y

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-02 Thread Sergey Bylokhov
On 02.10.15 10:57, Laurent Bourgès wrote: Sergey, I tried your gcc options on my Ubuntu 14.04 (gcc v4.8.4) and it is actually slightly faster: 10% on my fill ellipse test (450ms vs 490ms). I tested with your JMH test, and the difference became bigger on 1400 size. Do you know which gcc compiler an

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-02 Thread Laurent Bourgès
Sergey, I tried your gcc options on my Ubuntu 14.04 (gcc v4.8.4) and it is actually slightly faster: 10% on my fill ellipse test (450ms vs 490ms). Do you know which gcc compiler and options are used to build JavaSE EA? Moreover, the Linux distribution may define default options. I will try to figure out

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Laurent Bourgès
Jim, Thanks for your ideas below: > Some thoughts - we record some info on each scanline - mostly about the new edges that are added. Perhaps we could keep deltas of how many edges come and go per scanline and then sum them up at the start to figure out if any scanline has a lot of crossings? I

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Jim Graham
Hi Sergey, Is this a new patch that replaces the -march flag (and has the same performance benefits)? Are those options compatible with all of our supported platforms? If so, then we should file a bug and get that into 9... ...jim On 10/1/15 10:10 AM, Sergey Bylokhov

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Sergey Bylokhov
My patch below: >< diff -r 2b680924a73f make/lib/Awt2dLibraries.gmk --- a/make/lib/Awt2dLibraries.gmk Wed Sep 16 18:34:38 2015 +0300 +++ b/make/lib/Awt2dLibraries.gmk Thu Oct 01 17:06:38 2015 +0300 @@ -243,7 +243,7 @@ EXCLUDES := $(LIBAWT_EXCLUDES), \ EXCL

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Phil Race
oh and ps .. we are of course also dependent on what the version of gcc in use knows about the CPU. The important one is whatever RE use. All of this is tricky to get right and maintain and is one reason why the adaptive JIT in hotspot can be a good thing. -phil. On 10/01/2015 09:22 AM, Phil Ra

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Phil Race
The results of -march=native are more interesting/useful if we also know what exact CPU this was compiled on. Obviously we can't use that option directly in JDK builds but it may be that we could make some judgement about whether it is acceptable to trade slow-downs on some rarer CPUs for speed-u

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Laurent Bourgès
Sergey, Thanks for running the tests. What are the units of your results? I guess it is time, so lower values are better. How did you hack the gcc options in the openjdk build scripts? I could try on my local build too. Bye, Laurent On 1 Oct 2015 17:39, "Sergey Bylokhov" wrote:

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Sergey Bylokhov
I got results below after I made a hack and added -march=native to the libawt library: "EllipseFill.fillEllipse 1400" 8u60_RE: 6,540 9_dev: 8,457 9_hack: 6,276 So we have a window for tweaking. - sergey.bylok...@oracle.com wrote: > Hello, > I built both version of jdk8 and jdk9 on my local

Re: [OpenJDK Rasterizer] Marlin #4

2015-10-01 Thread Sergey Bylokhov
Hello, I built both jdk8 and jdk9 on my local system and compared the preprocessor output and the generated assembly: both are identical. But jdk8u60 from RE is actually 20% faster than my version of jdk8. It seems that the difference is in the compiler and/or in some compiler (gcc) options

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-24 Thread Laurent Bourgès
Jim, > As far as why the software loops are slower... > > Did any command line options change for compiling IntArgbPre.c? Touch the file and rebuild and verify if the compiler options are the same (and that both builds use the same compiler)... To avoid all possible side effects, I deliberately

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-24 Thread Jim Graham
As far as why the software loops are slower... Did any command line options change for compiling IntArgbPre.c? Touch the file and rebuild and verify if the compiler options are the same (and that both builds use the same compiler)... ...jim

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-24 Thread Jim Graham
Hi Laurent, You are looking at the wrong loop. It's tough to explain... vis_*.c are only ever compiled or used on Solaris. They convince the compiler to emit Sparc's version of MMX instructions. They are not even compiled on any other build except for Solaris. You were probably confused b

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-24 Thread Laurent Bourgès
Sergey, I managed to create a new benchmark with JMH + perfasm profiler: http://cr.openjdk.java.net/~lbourges/jmh/ellipse_fill/ See MyBenchMark.java that fills an ellipse with radius in {"100", "500", "900", "1400"} I tested with both Oracle JDK8 and Oracle JDK9 EA b81 ie using the ductus render

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-23 Thread Sergey Bylokhov
On 22.09.15 0:15, Laurent Bourgès wrote: Conclusion: The new patch seems promising as it is very close to ductus performance. Filling ellipse seems slower on OpenJDK9 (492 / 437 = 12% slower) ! Any MaskFill changes ? For such checks I suggest to use JMH + "prof perfasm". It will provide reall

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-23 Thread Jim Graham
Some thoughts - we record some info on each scanline - mostly about the new edges that are added. Perhaps we could keep deltas of how many edges come and go per scanline and then sum them up at the start to figure out if any scanline has a lot of crossings? One slight optimization in the non-
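
The delta idea in this message can be sketched as a prefix sum over per-scanline edge counts. This is only an illustration under stated assumptions: the class `ScanlineCrossings`, the half-open `[yStart, yEnd)` edge representation and both method names are hypothetical and are not taken from the Renderer code.

```java
// Hypothetical sketch of the per-scanline delta idea; not Renderer code.
final class ScanlineCrossings {

    // Builds deltas[y] = (edges starting at y) - (edges ending at y),
    // for edges given as half-open ranges {yStart, yEnd} with yEnd <= height.
    static int[] buildDeltas(int[][] edges, int height) {
        int[] deltas = new int[height + 1];
        for (int[] e : edges) {
            deltas[e[0]]++;
            deltas[e[1]]--;
        }
        return deltas;
    }

    // Sums the deltas scanline by scanline and returns the largest
    // number of simultaneously active edges (i.e. crossings).
    static int maxActiveEdges(int[] deltas) {
        int active = 0;
        int max = 0;
        for (int d : deltas) {
            active += d;
            if (active > max) {
                max = active;
            }
        }
        return max;
    }
}
```

For example, three edges covering scanlines [0,4), [1,3) and [2,5) overlap on scanline 2, so the maximum is 3. A single linear pass like this could run before rendering starts to decide whether any scanline has many crossings.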

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-23 Thread Jim Graham
Hi Laurent, On 9/21/15 2:15 PM, Laurent Bourgès wrote: Here is a summary showing only my ellipse draw / fill tests (radius = 1 to 2000): As you can see below, the table is still mangled, but due to fewer columns I was able to piece things together. Marlin 0.7.0 on JDK1.8.60: Test

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-21 Thread Laurent Bourgès
Jim, I would like your point of view on the new algorithms to store alpha values efficiently and some advice on heuristics / metrics to make the adaptive approach more efficient / robust. I hope to have some spare time soon to spend on improving this patch... Here are a few more explanations that

Re: [OpenJDK Rasterizer] Marlin #4

2015-09-17 Thread Jim Graham
Hi Laurent, Sorry it took me so long to get around to this... MarlinConst.java, line 86 - "+2 explain"? MarlinProperties.java - indentation on && continuations should be 4 spaces and/or line up the operands (as in: return isEnableRLE() && isSomethingElse...; OR return isEna

[OpenJDK Rasterizer] Marlin #4

2015-09-10 Thread Laurent Bourgès
Jim, Here is the first webrev improving copyAARow() on large shapes (pixel loops): http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.0/ Note: I also incorporated a few changes related to forcing cleanup when a runtime exception happens within pathTo(): see MarlinRenderingEngine, Stroker, D