Hi,
in a separate thread Jonathan provided a load test, data and styles for a
map he's publishing.
The thread started as a comparison between PNG encoders but, at least as
far as I can tell, there is no significant optimization that can be made
at the PNG encoder level.

However, I've been toying with the data set and found some interesting
results regardless (grab some popcorn).

The map is set up as a group of 15 different layers, with some scale
dependency.
At scale denominators of 10 and 5 million not all the layers are shown,
but one thing is evident: a few layers (woodlands, lakes, and the portion
of urban areas shown at this level) contain a good number of small but
detailed polygons (in the order of 20k+)... this makes the map miss the
reference "no more than 1000 elements in the map" mark by a factor of 20,
which is probably what makes it interesting :-p

We have code that simplifies geometries before drawing them, but these
polygons are small enough that another optimization, geared towards
drawing features smaller than a pixel, should have kicked in. It did not,
and that affected drawing times. So I've fixed it (the fixes are already
in and will be part of the next stable release).
To get the full benefit the styles must not use partial transparency, and
I had to fix one of the styles accordingly (I believe it was lakes, not
100% sure).
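
As a rough illustration of the sub-pixel idea (a sketch, not the actual
renderer code): assuming the OGC standard rendering pixel size of 0.28 mm
and a CRS in meters, one pixel at a scale denominator of 10 million covers
about 2800 m on the ground, so a polygon whose bounding box is smaller
than that in both directions cannot show any detail. The class and method
names below are made up for the example.

```java
public class SubPixelCheck {
    // Assumption: OGC standardized rendering pixel size (0.28 mm), in meters
    static final double PIXEL_SIZE_M = 0.00028;

    /** Ground size of one pixel, assuming a CRS whose unit is the meter. */
    public static double groundUnitsPerPixel(double scaleDenominator) {
        return scaleDenominator * PIXEL_SIZE_M;
    }

    /** True when the bounding box is smaller than one pixel in both directions. */
    public static boolean isSubPixel(double width, double height, double unitsPerPixel) {
        return width < unitsPerPixel && height < unitsPerPixel;
    }

    public static void main(String[] args) {
        double unitsPerPixel = groundUnitsPerPixel(10_000_000);
        // hypothetical 500 m x 900 m pond: fits inside a single pixel
        System.out.println(isSubPixel(500, 900, unitsPerPixel));
        // hypothetical 500 m x 5000 m lake: spans more than one pixel
        System.out.println(isSubPixel(500, 5000, unitsPerPixel));
    }
}
```

A renderer can paint a feature that passes this check as a single pixel
and skip the full geometry path.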

To have some comparisons, I'm reporting the numbers found in the other
thread without any change. The report contains the JVM and build details,
and then the average response time and throughput. For those that did not
follow: this is a group of 15 layers, made of shapefiles, the output is a
1272x1261 pixel full color PNG, 10 concurrent clients are hitting
GeoServer, and the benchmark is done on a Core i7 820 CPU. All benchmarks
are using the PNGJ encoder.
For those not familiar with benchmarking: the Oracle JDK 7 has a renderer
that does not scale up, so it is worth setting up a number of load
balanced GeoServer instances in separate JVMs, whilst OpenJDK has one
that's quite a bit slower but has no scalability issues, so one normally
sets up a single instance instead.

OpenJDK 7: <not available>, 3.3r/s
JDK 7: 2867ms, 3.9r/s
JDK 7, three instances load balanced with HA proxy: 1248ms, 7.9r/s

The above shows the benchmark is heavily bottlenecked by the drawing speed
of the rendering subsystem (so much that the JDK 7 renderer, Ductus, can
outperform the OpenJDK 7 one, Pisces, even with 10 concurrent requests).

After fixing the "small polygons" optimization bug the results are:

OpenJDK 7: 2507ms, 4r/s
JDK 7: 2459ms, 4r/s
JDK 7, three instances load balanced with HA proxy: 1175ms, 8.4r/s

So, this optimization improves the rendering time of OpenJDK 7 and puts it
on par with the closed source JDK, and also improves the latter.

Then again, Jonathan has this data in Oracle... I did not want to venture
there: loading data there is a pain, the styles need fixing because of the
uppercase attributes, and honestly, why waste an open source developer's
spare time on closed source databases anyway?
So I loaded all the data in PostGIS instead and ran the tests again (with
the connection pool locked at 10 connections):

OpenJDK 7: 2630ms, 3.8r/s
JDK 7: 2422ms, 4.1r/s
JDK 7, three instances load balanced with HA proxy: 1301ms, 7.6r/s

Hum... a bit slower. And then I remembered what Paul Ramsey used to say
about Mapnik using ST_Simplify on the geometries to get a boost (reference,
http://blog.cartodb.com/post/20163722809/speeding-up-tiles-rendering).
This had already been tried in GeoServer without much of a benefit in
previous benchmarks (we already do the simplification on the Java side),
but... we did not have a map with so many little polygons. So why not give
it a kick? All one needs to do is add the following to the PostGIS dialect:

    @Override
    public void encodeGeometryColumnSimplified(GeometryDescriptor gatt, String prefix,
            int srid, StringBuffer sql, Double distance) {
        // geography columns are simplified as they are, geometry ones are forced to 2D first
        boolean geography = "geography".equals(gatt.getUserData().get(
                JDBCDataStore.JDBC_NATIVE_TYPENAME));

        if (geography) {
            sql.append("encode(ST_AsBinary(ST_Simplify(");
            encodeColumnName(prefix, gatt.getLocalName(), sql);
            sql.append(", " + distance + ")),'base64')");
        } else {
            sql.append("encode(ST_AsBinary(ST_Simplify(ST_Force_2D(");
            encodeColumnName(prefix, gatt.getLocalName(), sql);
            sql.append("), " + distance + ")),'base64')");
        }
    }

    @Override
    protected void addSupportedHints(Set<Hints.Key> hints) {
        // advertise that the store can apply the simplification itself
        hints.add(Hints.GEOMETRY_SIMPLIFICATION);
    }
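
To make it easier to see what ends up in the query, here is a standalone
sketch that builds the same SQL fragment outside of GeoTools. The
encodeColumnName helper is a simplified stand-in (it just double-quotes
the identifiers, which is an assumption about the real dialect method),
and the column name and distance are made-up values.

```java
public class SimplifySqlSketch {
    // Simplified stand-in for the dialect's encodeColumnName (assumption):
    // double-quotes the optional prefix and the column name
    static void encodeColumnName(String prefix, String name, StringBuffer sql) {
        if (prefix != null) {
            sql.append('"').append(prefix).append("\".");
        }
        sql.append('"').append(name).append('"');
    }

    /** Builds the select-list fragment for a simplified geometry column. */
    public static String simplifiedGeometry(String prefix, String column,
            boolean geography, Double distance) {
        StringBuffer sql = new StringBuffer();
        sql.append("encode(ST_AsBinary(ST_Simplify(");
        if (!geography) {
            sql.append("ST_Force_2D("); // geometry columns are flattened to 2D first
        }
        encodeColumnName(prefix, column, sql);
        if (!geography) {
            sql.append(")");
        }
        sql.append(", ").append(distance).append(")),'base64')");
        return sql.toString();
    }

    public static void main(String[] args) {
        // hypothetical column name and simplification distance
        System.out.println(simplifiedGeometry(null, "the_geom", false, 500.0));
        // prints: encode(ST_AsBinary(ST_Simplify(ST_Force_2D("the_geom"), 500.0)),'base64')
    }
}
```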

And here are the results:

OpenJDK 7: 2014 ms, 4.9 r/s
JDK 7: 1694ms, 5.8r/s
JDK 7, three instances load balanced with HA proxy: 1046 ms, 9.4 r/s
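
To isolate what ST_Simplify alone contributes, the throughput figures
above can be set against the plain PostGIS run reported earlier in this
mail. A small sketch of that arithmetic, using only numbers quoted here
(the class and method names are made up):

```java
public class ThroughputGain {
    /** Relative throughput gain, in percent, going from before to after. */
    public static double gainPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        String[] setups = {"OpenJDK 7", "JDK 7", "JDK 7, three instances"};
        double[] plainPostgis = {3.8, 4.1, 7.6}; // r/s, PostGIS without ST_Simplify
        double[] simplified = {4.9, 5.8, 9.4};   // r/s, PostGIS with ST_Simplify
        for (int i = 0; i < setups.length; i++) {
            System.out.printf("%s: %+.1f%%%n", setups[i],
                    gainPercent(plainPostgis[i], simplified[i]));
        }
    }
}
```

On these figures the gain from ST_Simplify alone is about 29% for
OpenJDK 7, 41% for the single JDK 7 instance and 24% for the three
instances.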

Holy cow, on the single JVM setup that's roughly a 50% throughput gain over
the original numbers (3.3 to 4.9 r/s on OpenJDK 7, 3.9 to 5.8 r/s on
JDK 7), and the gain with 3 JVMs is nothing to throw away either...
Now, this is PostGIS specific (there is no plain simplification in Oracle,
only the topology preserving one is available, which is supposedly more
expensive than not doing it, at least it is in PostGIS), and we need to
verify what the overhead is when the geometries do not need simplification.
That said, how do you see this being enabled?
* always on, which could be a good option if the case where simplification
  is not really needed shows no regression
* enabled by a store parameter
* enabled with an SLD vendor parameter (to be used only when zoomed out)

Now... another thing that I suspected made MapServer and Mapnik competitive
is that they load the data from the database in a single kick, instead of
paging through it like we do.
The approach is a double edged sword:
* on the bright side, a single communication with the db, and no need for
  the dbms to allocate and manage a server side cursor
* on the dark side, no freaking way to control how much data you're loading
  into memory, so the OOM risk is there (I guess with a cgi/fastcgi approach
  you just don't care, if one instance goes boom it just gets replaced
  automatically, the worst thing that can happen is a swap storm)

Code wise the change is small: in the PostGIS dialect we just disable the
use of transactions while reading, which breaks PostgreSQL's ability to use
server side cursors, forcing it to return everything in a single kick
instead:

    @Override
    public boolean isAutoCommitQuery() {
        return true;
    }
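
For context, the PostgreSQL JDBC driver only pages results through a server
side cursor when the connection is not in autocommit mode, the statement is
forward-only and the fetch size is greater than zero (a multi-statement
query also disables it, which is not modelled here). The sketch below is
just a simplified model of those documented conditions, not driver code,
and the class and method names are made up.

```java
import java.sql.ResultSet;

public class CursorModeSketch {
    /**
     * Simplified model: true when the driver would page through a server
     * side cursor, false when it would load the whole result set at once.
     */
    public static boolean usesServerSideCursor(boolean autoCommit, int fetchSize,
            int resultSetType) {
        return !autoCommit
                && fetchSize > 0
                && resultSetType == ResultSet.TYPE_FORWARD_ONLY;
    }

    public static void main(String[] args) {
        // reading inside a transaction with fetch size 1000: paged reads
        System.out.println(usesServerSideCursor(false, 1000, ResultSet.TYPE_FORWARD_ONLY));
        // autocommit reads, as forced by isAutoCommitQuery(): one single load
        System.out.println(usesServerSideCursor(true, 1000, ResultSet.TYPE_FORWARD_ONLY));
    }
}
```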

Let's see the results:

OpenJDK 7: 1726 ms, 5.8 r/s
JDK 7: 1688ms, 5.9r/s
JDK 7, three instances load balanced with HA proxy: 998 ms, 9.9 r/s

The setups that benefit the most are the CPU bottlenecked ones, which means
first OpenJDK, and then the three instances of JDK 7, both of which are
using close to 100% CPU, whilst the stand alone JDK 7 instance uses
something like 50% CPU (the JVM wide lock in the Ductus renderer is killing
scalability).
The issue is, this approach is not really usable "always on". Options I see
(suggestions welcome):
* test again and see if just increasing the fetch size helps (it's now set
  to 1000 by default)
* add a hint that only the WMS renderer sets, which will enable this mode...
  this assumes the styles are always set up to provide reasonable scale
  dependencies... and we should add some config to disable the usage of
  styles that are not associated with the layer
* other suggestions?
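
On the fetch size option, a back-of-the-envelope sketch of how many batches
a paged read needs. It only counts batches and ignores any extra fetch the
driver may issue to detect the end of the data; the 20000 row count is a
made-up figure in line with the "20k+" polygons mentioned earlier, and the
names are hypothetical.

```java
public class FetchRoundTrips {
    /** Number of batches needed to read all rows; a fetch size <= 0 means one load. */
    public static long batches(long rows, int fetchSize) {
        if (fetchSize <= 0) {
            return 1; // everything is returned in a single kick
        }
        return Math.max(1, (rows + fetchSize - 1) / fetchSize); // ceiling division
    }

    public static void main(String[] args) {
        long rows = 20_000; // hypothetical layer size
        System.out.println(batches(rows, 1_000));  // current default fetch size
        System.out.println(batches(rows, 10_000)); // a larger fetch size
        System.out.println(batches(rows, 0));      // no paging at all
    }
}
```

With these inputs the three cases need 20, 2 and 1 batches respectively,
which is why a larger fetch size might recover part of the gain while
keeping memory use bounded.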

Now, for a final test: we know that the OpenJDK renderer is slow, but
Laurent Bourges has been providing Oracle with some patches to make it
faster, which have not been accepted so far.
However... I've made some tests and now have a jar that one can drop into a
regular JDK 7 install, which contains just some of Laurent's improvements,
and enable by setting some JVM parameters. How does this one fare? These
are the results for a single OpenJDK 7 JVM, with all of the other GeoServer
patches mentioned above included as well:

OpenJDK 7 + optimized Pisces renderer: 1091 ms, 9.0 r/s

Not too bad uh? :-)
The plan is to keep working on it a bit more before going public and
releasing the jar for everybody to play with.

Cheers
Andrea

-- 
*== GeoSolutions will be closed for seasonal holidays from 23/12/2013 to
06/01/2014 ==*

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054  Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39  339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it
