Hi Sergey,
thank you for looking at this problem.
I confirm that your simple patch improves the performance on my laptop
(Ubuntu 16.04, gcc 5.4 like yours, Intel i4700 CPU) when I run the ellipse
JMH test.
- ojdk9 without patch:
Benchmark (size) Mode Cnt Score Error Units
El
Hi, Laurent.
Can you please check the next patch:
==
diff -r 8a61c000a194 make/lib/Awt2dLibraries.gmk
--- a/make/lib/Awt2dLibraries.gmk Tue Dec 20 09:52:14 2016 -0800
+++ b/make/lib/Awt2dLibraries.gmk Wed Dec 21 17:33:36 2016 +0300
@@ -222,6 +222,7 @@
# applies to debug builds.
Hi Laurent,
On 11/23/15 9:02 AM, Laurent Bourgès wrote:
I know that Marlin is slightly slower than Ductus for shape sizes ~ 20:
Ductus seems to use 16x16 blocks whereas Marlin uses 32x32 tiles, so the
new RLE approach is not in use (raw encoding), i.e. lots of zero-fill /
array copy operations.
Wh
Hi Laurent,
On 11/23/15 1:35 PM, Laurent Bourgès wrote:
It seems you are right: there is a potential remaining failure !
I tested my code in CrashTest, but it passed because the off-heap array grows
exponentially, i.e. the mentioned case never happened!
Yes, I believe that the growth algorithms make this a
Hi Jim,
Sorry I sent the message partially edited by mistake.
> The point is that the hard failure is a condition of when we need more
than we can provide, not when we "already have" more than we can provide.
needSize should cause the hard failure, not the current size. And if
needSize is going
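
[Editor's sketch] The rule quoted above (fail on the size that is needed, not on the size already held) could look roughly like the following. This is a minimal illustration under assumptions, not the actual ArrayCache code: the class name GrowthSketch, the method newSize, and the doubling policy are all hypothetical.

```java
// Hypothetical sketch only; not the real ArrayCache implementation.
final class GrowthSketch {
    // Assumed hard limit for a Java array-backed buffer.
    static final int MAX_SIZE = Integer.MAX_VALUE;

    /**
     * Returns a new capacity that is at least needSize, doubling curSize
     * when that is larger. The hard failure depends only on needSize:
     * already holding MAX_SIZE elements is not an error by itself.
     */
    static int newSize(int curSize, long needSize) {
        if (needSize < 0 || needSize > MAX_SIZE) {
            throw new ArrayIndexOutOfBoundsException(
                "needed size exceeds maximum capacity: " + needSize);
        }
        // Grow exponentially, but never below what is needed.
        long grown = Math.max(2L * curSize, needSize);
        // Clamp the growth (not the need) to the maximum.
        return (int) Math.min(grown, MAX_SIZE);
    }
}
```

With this shape, a request that fits is always served in one pass, and only a request larger than MAX_SIZE throws.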
Hi Jim,
>> I'm code reading now:
>>
>> ArrayCache.java, line 205 - should that be needSize there? Also,
>> should these tests be > or >=?
>>
>> I wanted to limit the size to 2G (Integer.MAX_VALUE), but that required 2
>> passes: first, return 2G, then fail if more is needed!
>> If prefer us
Jim,
Here are a few answers to several questions from last Friday's sprint:
> I ran a bunch of tests on 4.2 and saw no issues and the performance looked
> good. There were still some things that Ductus was faster on, but I just
> did a brief run of a few tests I cobbled together so I don't know
Jim,
I am ok with your changes to use the new Unsafe and all others also.
PS: I recently added the enableLogs flag to make Marlin silent, mimicking
Phil's approach to disabling stdout logs 6 months ago.
I am going to sleep now.
Good luck & good night,
Laurent
On 20 Nov 2015 23:53, "Jim Graham" wrote
OK. So it is fine as you have it.
-phil.
On 11/20/2015 01:11 PM, Jim Graham wrote:
On 11/20/15 12:53 PM, Phil Race wrote:
On 11/20/2015 08:50 AM, Jim Graham wrote:
Here is the webrev for the remaining pre-integration tasks below. This
includes:
- turning off Marlin logging
static final b
On 11/20/15 12:53 PM, Phil Race wrote:
On 11/20/2015 08:50 AM, Jim Graham wrote:
Here is the webrev for the remaining pre-integration tasks below. This
includes:
- turning off Marlin logging
static final boolean enableLogs = false;
34 // enable Logger
35 static final boolean u
Jim,
It seems ok.
Laurent
On 20 Nov 2015 17:50, "Jim Graham" wrote:
>
> Here is the webrev for the remaining pre-integration tasks below. This
includes:
>
> - turning off Marlin logging
> - switching to Marlin as the default renderer
> - adding a flag to print out which renderer is used on
Here is the webrev for the remaining pre-integration tasks below. This
includes:
- turning off Marlin logging
- switching to Marlin as the default renderer
- adding a flag to print out which renderer is used on startup
webrev: http://cr.openjdk.java.net/~flar/Marlin/Defaults/webrev.00/
Please
I will be pushing this code cleanup changeset momentarily followed by
one more push to accomplish all of the changing of the various defaults...
Second code cleanup task - switching to jdk.internal.misc.Unsafe...
webrev: http://cr.openjdk.java.net/~flar/Marlin/NewUnsafeClass.0/
On 11/20/2015 08:50 AM, Jim Graham wrote:
Here is the webrev for the remaining pre-integration tasks below.
This includes:
- turning off Marlin logging
static final boolean enableLogs = false;
34 // enable Logger
35 static final boolean useLogger = enableLogs &&
MarlinProperties
I will be pushing this momentarily along with a few other pushes to
accomplish the list of pre-integration changes I mentioned below...
First cleanup push is using the blessed modifier order script.
webrev: http://cr.openjdk.java.net/~flar/Marlin/BlessedModifiers/webrev.4.3/
There should be no
Hi Laurent,
I ran a bunch of tests on 4.2 and saw no issues and the performance
looked good. There were still some things that Ductus was faster on,
but I just did a brief run of a few tests I cobbled together so I don't
know how representative they are. Marlin beat Ductus on a number of
te
Jim,
> My goal is to be able to publish a webrev to the 2D mailing list by
tomorrow so hopefully I'll see a diff soon.
Excellent news !
It would be nice to have a webrev against the GR forest and also a webrev
against 4.2 to streamline reviewing.
I sent you a webrev against the GR forest on my
One more thing:
Did you agree with Phil to make Marlin the default renderer for both
OpenJDK and Oracle JDK (closed source) to obtain maximum exposure very
early?
Laurent
Jim,
Here is the new webrev Marlin #4.3:
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.3/
Changes:
- MarlinCache: fillRLE only: I clear both arrays (alphaRow...) inline as
only a few values are expected => 10% better!
- Fixed minor bugs, unused imports and System.out.println calls ...
- ad
Hi Laurent,
My goal is to be able to publish a webrev to the 2D mailing list by
tomorrow so hopefully I'll see a diff soon. It would be nice to have a
webrev against the GR forest and also a webrev against 4.2 to streamline
reviewing.
I may do some pushes to the GR-forest on your behalf ove
Yes...
...jim
On 11/19/15 2:01 PM, Laurent Bourgès wrote:
One more thing:
Did you agree with Phil to make Marlin the default renderer for both
OpenJDK and Oracle JDK (closed source) to obtain maximum exposure
very early?
Laurent
Hi Jim,
Good to see it is moving forward.
> I am going to move forward with intent to get this version 4.2 into the
client repos as the version we will go into Feature Complete milestone
with. Let me know if there is a more recent version I should be looking at.
I will publish a new webrev asap
Hi Laurent,
I am going to move forward with intent to get this version 4.2 into the
client repos as the version we will go into Feature Complete milestone
with. Let me know if there is a more recent version I should be looking at.
I'm about to do some test builds and check performance and ru
Hi Laurent,
That is my number one focus this week, including doing a test build with
it and running my own tests.
The big fly in the ointment, though, is that I've recently hosed my JDK
build environment for Windows (I normally work on the FX source and only
rarely build the JDK sources) and
Jim,
Just a reminder: could you review the latest Marlin patch 4.2 (10/19)?
> >> 1. Do you prefer I send you another webrev including my last changes ?
> >
> >
> > Let me look through the latest webrev first.
>
> Ok. FYI new changes are very small.
If you want, I can send you an up-to-date webrev
Hi Jim,
>> 1. Do you prefer I send you another webrev including my last changes ?
>
>
> Let me look through the latest webrev first.
Ok. FYI new changes are very small.
I started writing the CrashTest class, which tests several corner cases with
huge images & paths to force growable arrays to resize
I think making it *the* default would certainly shake out problems
better and faster.
Internal testing is mostly done on Oracle JDK builds, not OpenJDK builds.
And if you want to change this from ductus to marlin *after* feature freeze
you will encounter a lot more questions (push back due to inv
One question on the integration...
On 10/29/15 4:22 AM, Laurent Bourgès wrote:
integration into jdk9 forest:
Integration Due:2015-11-27
What is the plan according to you ?
Do we want to switch the flag to make Marlin the default before then?
We should probably at least make it the default Op
Hi Laurent,
On 10/29/15 4:22 AM, Laurent Bourgès wrote:
1. Do you prefer I send you another webrev including my last changes ?
Let me look through the latest webrev first.
2. I will be busy in November, so I would like to anticipate Marlin
That's unfortunate as now that JavaOne is over I ha
Jim,
Just a reminder about my last patch, which is awaiting review.
Meanwhile I slightly improved the array cleanup in the RLE case and got some
gains:
Test                         Threads  Ops  Med     Pct95   Avg     StdDev  Min     Max     TotalOps
CircleTests.ser              1        170  61.993  62.304  62.018  0.159   61.687  62.667  170
EllipseTests-fill-false.ser  1
Hi Jim,
Here is the new webrev:
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.2/
I added the OffHeapArray class used by Renderer and now by MarlinCache to
store rowAAChunk data.
Moreover I performed other small optimizations (heuristics,
Renderer.addLine() split in 2 methods) and many ben
Hi Laurent,
These are great results! And they are much easier to read with the
tables (which seem to get lost in my reply, oops!).
If it is just the dashing results I can believe that as Ductus does a
pretty good job of minimizing the number of segments in its stroked
output paths. The los
Hi Jim,
Below is the webrev I prepared last Saturday night.
However, I have made progress since then: I inlined a few methods and now use Unsafe
for rowAAChunk storage (to save a few percent by avoiding bounds checks).
*So please, just have a look to see the new hybrid approach but do not make
a full revi
Hi Laurent,
I've been looking at it a little lately.
One thing that occurred to me is that the 2 strategies - RLE vs
uncompressed - might be easier to follow and manage if they were broken
out into separate classes:
MarlinCache
+--- MarlinRLECache
+--- MarlinUncompressedCache
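
[Editor's sketch] One way the proposed split could be laid out in Java is shown below. Only the three class names come from the message; the encodeRow method, the forWidth factory, the (value, runLength) pair encoding, and the width threshold of 32 are assumptions made purely for illustration and do not describe the real Marlin classes.

```java
import java.util.Arrays;

// Hypothetical sketch: class names follow the hierarchy above, bodies are invented.
abstract class MarlinCache {
    /** Encodes one row of alpha coverage values and returns the encoded form. */
    abstract int[] encodeRow(int[] alphaRow);

    /** Picks a strategy from the row width; the threshold is illustrative only. */
    static MarlinCache forWidth(int width) {
        return (width > 32) ? new MarlinRLECache() : new MarlinUncompressedCache();
    }
}

final class MarlinRLECache extends MarlinCache {
    /** Run-length encodes the row as (value, runLength) pairs. */
    @Override
    int[] encodeRow(int[] alphaRow) {
        int[] out = new int[2 * alphaRow.length];
        int n = 0;
        int i = 0;
        while (i < alphaRow.length) {
            int value = alphaRow[i];
            int run = 1;
            while (i + run < alphaRow.length && alphaRow[i + run] == value) {
                run++;
            }
            out[n++] = value;
            out[n++] = run;
            i += run;
        }
        return Arrays.copyOf(out, n);
    }
}

final class MarlinUncompressedCache extends MarlinCache {
    /** Stores the row as-is (raw encoding). */
    @Override
    int[] encodeRow(int[] alphaRow) {
        return alphaRow.clone();
    }
}
```

Keeping each strategy in its own subclass means the choice is made once per shape, and each encoding loop can be read and tuned on its own.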
Jim,
FYI I worked on improving Marlin patch #4 to send you asap a new webrev: it
will be simpler with only 2 variants (uncompressed and RLE+blockFlags).
I also fixed a bug in the array cache related to crossings (ptrLen !=
ptrEnd) in Renderer.endRendering () and wrote a test that makes all cached
We should also be wary of compiler options that are a win on one
processor family and a loss on another. Anything that schedules
instructions may be specific to a particular generation of CPUs, for
instance. Or for i5 vs i7 vs M(obile)...
...jim
On 10/2/15 9:10 AM, L
Sergey,
thanks for the information:
I tried your gcc options on my Ubuntu 14.04 (v4.8.4) and it is actually
>> slightly faster: 10% on my fill ellipse test (450ms vs 490ms).
>>
>
> I tested with your JMH test, and the difference became bigger at size 1400.
>
Interesting; I will try too.
>
>> Do y
On 02.10.15 10:57, Laurent Bourgès wrote:
Sergey,
I tried your gcc options on my Ubuntu 14.04 (v4.8.4) and it is actually
slightly faster: 10% on my fill ellipse test (450ms vs 490ms).
I tested with your JMH test, and the difference became bigger at size 1400.
Do you know which gcc compiler an
Sergey,
I tried your gcc options on my Ubuntu 14.04 (v4.8.4) and it is actually
slightly faster: 10% on my fill ellipse test (450ms vs 490ms).
Do you know which gcc compiler and options are used to build JavaSE EA?
Moreover, the linux distrib may define default options.
I will try to figure out
Jim,
Thanks for your ideas below:
> Some thoughts - we record some info on each scanline - mostly about the
new edges that are added. Perhaps we could keep deltas of how many edges
come and go per scanline and then sum them up at the start to figure out if
any scanline has a lot of crossings?
I
Hi Sergey,
Is this a new patch that replaces the -march flag (and has the same
performance benefits)? Are those options compatible with all of our
supported platforms?
If so, then we should file a bug and get that into 9...
...jim
On 10/1/15 10:10 AM, Sergey Bylokhov
My patch below:
><
diff -r 2b680924a73f make/lib/Awt2dLibraries.gmk
--- a/make/lib/Awt2dLibraries.gmk Wed Sep 16 18:34:38 2015 +0300
+++ b/make/lib/Awt2dLibraries.gmk Thu Oct 01 17:06:38 2015 +0300
@@ -243,7 +243,7 @@
EXCLUDES := $(LIBAWT_EXCLUDES), \
EXCL
oh and ps .. we are of course also dependent on what the version of gcc
in use knows about the CPU. The important one is whatever RE use.
All of this is tricky to get right and maintain and is one reason why
the adaptive JIT in hotspot can be a good thing.
-phil.
On 10/01/2015 09:22 AM, Phil Ra
The results of -march=native are more interesting/useful if
we also know what exact CPU this was compiled on.
Obviously we can't use that option directly in JDK builds but
it may be that we could make some judgement about whether
it is acceptable to trade slow-downs on some rarer CPUs for
speed-u
Sergey,
Thanks for running the tests.
What are the units of your results?
I guess they are times: lower values are better.
How did you hack the gcc options in the openjdk build scripts ?
I could try on my local build too.
Bye,
Laurent
On 1 Oct 2015 17:39, "Sergey Bylokhov" wrote:
I got results below after I made a hack and added -march=native to the libawt
library:
"EllipseFill.fillEllipse 1400"
8u60_RE: 6,540
9_dev: 8,457
9_hack: 6,276
So we have a window for tweaking.
- sergey.bylok...@oracle.com wrote:
> Hello,
> I built both version of jdk8 and jdk9 on my local
Hello,
I built both jdk8 and jdk9 on my local system and compared the preprocessor
output and the generated assembly; both are identical. But jdk8u60 from
RE is actually 20% faster than my build of jdk8. It seems that the difference is
in the compiler and/or in some compiler (gcc) options
Jim,
> As far as why the software loops are slower...
>
> Did any command line options change for compiling IntArgbPre.c? Touch
the file and rebuild and verify if the compiler options are the same (and
that both builds use the same compiler)...
To avoid all possible side effects, I deliberately
As far as why the software loops are slower...
Did any command line options change for compiling IntArgbPre.c? Touch
the file and rebuild and verify if the compiler options are the same
(and that both builds use the same compiler)...
...jim
Hi Laurent,
You are looking at the wrong loop. It's tough to explain...
vis_*.c are only ever compiled or used on Solaris. They convince the
compiler to emit Sparc's version of MMX instructions. They are not even
compiled on any other build except for Solaris.
You were probably confused b
Sergey,
I managed to create a new benchmark with JMH + perfasm profiler:
http://cr.openjdk.java.net/~lbourges/jmh/ellipse_fill/
See MyBenchMark.java that fills an ellipse with radius in {"100", "500",
"900", "1400"}
I tested with both Oracle JDK8 and Oracle JDK9 EA b81 ie using the ductus
render
On 22.09.15 0:15, Laurent Bourgès wrote:
Conclusion:
The new patch seems promising as it is very close to ductus performance.
Filling ellipse seems slower on OpenJDK9 (492 / 437 = 12% slower) ! Any
MaskFill changes ?
For such checks I suggest using JMH + "prof perfasm". It will provide
reall
Some thoughts - we record some info on each scanline - mostly about the
new edges that are added. Perhaps we could keep deltas of how many
edges come and go per scanline and then sum them up at the start to
figure out if any scanline has a lot of crossings?
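
[Editor's sketch] The per-scanline delta idea in the paragraph above can be illustrated with a running sum. This is a hedged sketch, not Marlin's Renderer code: the class CrossingEstimate, its two methods, and the use of the active edge count as a stand-in for the number of crossings are assumptions.

```java
// Hypothetical sketch of the "edge deltas per scanline" idea; names are invented.
final class CrossingEstimate {
    /**
     * Records an edge that is active on scanlines [yStart, yEnd).
     * edgeDelta must have at least yEnd + 1 entries.
     */
    static void addEdge(int[] edgeDelta, int yStart, int yEnd) {
        edgeDelta[yStart]++; // one more edge becomes active here
        edgeDelta[yEnd]--;   // and leaves again here
    }

    /**
     * Sums the deltas top to bottom and returns the largest number of
     * simultaneously active edges, a cheap proxy for crossings per scanline.
     */
    static int maxActiveEdges(int[] edgeDelta) {
        int active = 0;
        int max = 0;
        for (int delta : edgeDelta) {
            active += delta;
            if (active > max) {
                max = active;
            }
        }
        return max;
    }
}
```

Because the deltas are recorded while edges are added, the single summing pass can run before rendering starts and its result can drive the choice of strategy.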
One slight optimization in the non-
Hi Laurent,
On 9/21/15 2:15 PM, Laurent Bourgès wrote:
Here is a summary showing only my ellipse draw / fill tests (radius = 1
to 2000):
As you can see below, the table is still mangled, but due to fewer
columns I was able to piece things together.
Marlin 0.7.0 on JDK1.8.60:
Test
Jim,
I would like your point of view on the new algorithms to store alpha values
efficiently and some advice on heuristics / metrics to make the adaptive
approach more efficient / robust.
I hope to have some spare time soon to spend on improving this patch...
Here are a few more explanations that
Hi Laurent,
Sorry it took me so long to get around to this...
MarlinConst.java, line 86 - "+2 explain"?
MarlinProperties.java - indentation on && continuations should be 4
spaces and/or line up the operands (as in:
return isEnableRLE() &&
isSomethingElse...;
OR
return isEna
Jim,
Here is the first webrev improving copyAARow() on large shapes (pixel
loops):
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s4.0/
Note: I also incorporated a few changes that force cleanup when a
runtime exception happens within pathTo(): see MarlinRenderingEngine,
Stroker, D