Re: [Geotools-devel] Retiring shapefile renderer (long mail)

Jesse Eichar Mon, 06 Apr 2009 00:26:58 -0700

On Sun, Apr 5, 2009 at 12:13 PM, Andrea Aime <[email protected]> wrote:


> Hi,
> so, as the subject says, I would like to spend some time
> doing the work necessary to retire the shapefile renderer.
>
> The reason for this is simple: maintenance overhead and
> debugging headaches. I would like to avoid having to
> fix bugs twice, or find that some bug shows up only
> on shapefiles, or only on other data sources.
>
> Yet, in order to retire the shapefile renderer I need
> to get streaming renderer to the same level of performance,
> that is, eliminate the very reason shapefile renderer
> was born (get better rendering performance out of shapefiles).
>
> The basic idea is that the set of optimizations the shapefile
> renderer has may be either ported to streaming, or turned
> into hints that a datastore can handle in order to better
> support fast rendering.
>
> I've split the mail in two: a discussion of the
> shapefile renderer speedups I know of, and a plan on
> how to move the speedups into the normal rendering chain.
> gt2 devs who are not interested in the speedups
> themselves, can you at least give me some feedback on
> the plan part?
>
> Discussion of the shapefile renderer speedups
> -------------------------------------------------------
>
> The shapefile renderer basically accesses the data in
> a very direct manner, skipping completely the datastore
> API, however, in the end it still builds SimpleFeature
> objects for the rest of the "upper level" rendering
> code to use. Here is a list of optimizations I'm aware
> of that the StreamingRenderer + IndexedShapefileDataStore
> combination doesn't have:
> a) use of LiteCoordinateSequence instead of Coordinate[]
> b) generalize while reading the coordinates from the disk
> c) direct bbox generalization
> d) true loose bbox behaviour
> e) feature skipping based on pixel map
> f) doing all transformations (reprojection and world to
>   screen transformation) inside the data reading code
>
> Am I missing any other optimization?
>
> Let's elaborate on the ones that I know of.
>
> (a) Use of LiteCoordinateSequence instead of Coordinate[]
> is a matter of the datastore respecting the coordinate sequence
> factory hint that the streaming renderer is already providing,
> and that atm is used by some of the jdbc based datastores.
> No big deal here, in fact I already have a patch ready to
> give the datastore support for this hint:
> http://jira.codehaus.org/browse/GEOT-2424
> (anyone care to review?)
>
> (b) and (c) are about generalizing data while reading it.
> The hints we're discussing in the separate generalization
> thread could be used to support this. I'm however
> a little skeptical about this point.
> (b) simply means doing inside the datastore the job of the
> Decimator class (skip points that are too close to each
> other to show up as distinct entities on the screen).
> The datastore still has to read all the coordinates
> and perform the generalization in memory, so I
> doubt that doing this inside the reader can be any faster
> than doing it in the Decimator. Jesse, what do you think?


For shapefile it might be slightly faster in the case where the renderer
requests features in projection X but the native projection is Y. In this
case, decimation in the renderer results in many more transformations, which
are potentially expensive.

For other datastores like PostGIS (if PostGIS adds a generalization
function), the performance gain from generalizing in the datastore rather
than in the renderer can be huge.
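
To make the reprojection point concrete, here is a minimal sketch. It is not the actual GeoTools Decimator: the class name, the packed x/y array layout and the span parameters are all assumptions for illustration. It drops points that fall within one span of the last kept point, so if it runs in the reader, in native units, only the surviving points need to be reprojected afterwards.

```java
import java.util.Arrays;

/** Hypothetical sketch of distance-based decimation; not the GeoTools Decimator class. */
final class DecimationSketch {

    /**
     * Drops every interior point that lies within (spanX, spanY) of the last kept point.
     * Ordinates are packed as x0, y0, x1, y1, ... The first and last points are always kept.
     */
    static double[] decimate(double[] xy, double spanX, double spanY) {
        int n = xy.length / 2;
        if (n <= 2) {
            return xy.clone();
        }
        double[] out = new double[xy.length];
        out[0] = xy[0];
        out[1] = xy[1];
        int kept = 1;
        for (int i = 1; i < n - 1; i++) {
            // Distance from the candidate point to the last point that was kept
            double dx = Math.abs(xy[2 * i] - out[2 * (kept - 1)]);
            double dy = Math.abs(xy[2 * i + 1] - out[2 * (kept - 1) + 1]);
            if (dx >= spanX || dy >= spanY) {
                out[2 * kept] = xy[2 * i];
                out[2 * kept + 1] = xy[2 * i + 1];
                kept++;
            }
        }
        // Always keep the end point so the line still reaches its destination
        out[2 * kept] = xy[2 * (n - 1)];
        out[2 * kept + 1] = xy[2 * (n - 1) + 1];
        kept++;
        return Arrays.copyOf(out, 2 * kept);
    }
}
```

The spans would have to be derived from the pixel size expressed in the native CRS, which is only an approximation when the reprojection is not linear.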


>
> (c) is different though. The idea is that if the geometry
> is so small on the screen that it will be smaller than a
> pixel, then we can avoid doing a point by point generalization,
> and simply replace it with a simple 2-point line, or
> a 3-point triangle, that represents it.
> This comes in handy in the shapefile reader because we
> have the opportunity to read the bbox before actually reading
> all of the coordinates, so we can really skip reading
> the geometry in that case.
>
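
A rough sketch of (c), assuming the reader has already pulled the record's bbox out of the record header and knows the pixel size in data units. The class and method names are made up for illustration; this is not the shapefile renderer code.

```java
/** Hypothetical sketch of optimization (c); not the shapefile renderer code. */
final class BboxGeneralizationSketch {

    /** True when the bbox is smaller than one pixel in both directions (all values in data units). */
    static boolean fitsInOnePixel(double minX, double minY, double maxX, double maxY,
                                  double pixelWidth, double pixelHeight) {
        return (maxX - minX) < pixelWidth && (maxY - minY) < pixelHeight;
    }

    /** Two-point stand-in for a line: the bbox diagonal, packed as x0, y0, x1, y1. */
    static double[] lineStandIn(double minX, double minY, double maxX, double maxY) {
        return new double[] {minX, minY, maxX, maxY};
    }

    /** Three-point triangle stand-in for a polygon, closed by repeating the first point. */
    static double[] triangleStandIn(double minX, double minY, double maxX, double maxY) {
        return new double[] {minX, minY, maxX, minY, maxX, maxY, minX, minY};
    }
}
```

When fitsInOnePixel returns true, the reader could hand back one of the stand-ins and skip over the record's coordinates without parsing them.
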
> (d) is about skipping the bbox filter evaluation in memory,
> and in particular, skipping the bbox full intersection behaviour
> mandated by the OGC (a bbox in OGC terms is just a shorthand
> syntax for intersects, but for rendering a bbox vs bbox comparison
> is more than enough).
> Since we're accessing the spatial index, we can skip the
> bbox completely. IndexedDataStore is not doing that, and
> it's performing those expensive intersections for all
> features whose bbox is not fully contained in the rendering area.
> This can be skinned in a number of ways:
> - have streaming renderer build a subclass of bbox that
>  only does the loose evaluation when used in memory
> - have a gt2 wide bbox subclass that does the same, allowing
>  everyone to recognize it. Downside, how do we build it
>  given the FilterFactory is defined at the GeoApi level?
> - pass down a query hint that all bboxes should be evaluated
>  in a loose way. Nice, too bad it might interfere with
>  some bbox in the SLD itself that people wanted to be
>  evaluated the hard way (think dynamic SLD used to show
>  selection in WMS)
>

That is a hard one. Considering the option of a gt-wide bbox subclass that the
DS can recognize, I want to analyze the benefits of that option. Is the
majority of the performance bottleneck with PostGIS, WFS, etc. the
network streaming? Or is the reduced cost of processing in the
database/other service worth the cost of introducing a new system-wide class
to be recognized by all?

I honestly don't know. I can see that if PostGIS and GeoServer are on
the same server it could make a significant difference, and that is a use case
that occurs at least occasionally.

Same argument can be made for the hint.
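
For reference, the difference between the loose and the strict evaluation can be sketched with plain java.awt.geom types standing in for JTS geometries. That substitution is an assumption made to keep the example self-contained, and the class and method names are hypothetical.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;

/** Hypothetical sketch contrasting loose and strict bbox evaluation. */
final class LooseBboxSketch {

    /** Loose test: only the feature's bounding box is compared with the rendering area. */
    static boolean looseBbox(Line2D feature, Rectangle2D area) {
        Rectangle2D b = feature.getBounds2D();
        return b.getMinX() <= area.getMaxX() && b.getMaxX() >= area.getMinX()
                && b.getMinY() <= area.getMaxY() && b.getMaxY() >= area.getMinY();
    }

    /** Strict test: the geometry itself has to intersect the rendering area. */
    static boolean strictBbox(Line2D feature, Rectangle2D area) {
        return feature.intersects(area);
    }
}
```

A diagonal segment whose bounding box overlaps a corner of the rendering area passes the loose test even when the segment itself stays outside. For rendering, that only means a feature gets handed to the painter and is clipped away.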




>
> (e) is a rather hard-core optimization: very effective in
> its specific use case, but map degrading.
> Basically the renderer keeps
> a bitfield representing the pixels in a map. Every time
> a feature is bounds optimized away using (c),
> the screen pixel in which the feature is drawn is marked
> in the bitfield. If another feature is bounds optimized
> away and the pixel in which it would fall is already
> "on" in the map, nothing gets drawn at all.
> This mainly speeds up the case in which you're drawing
> a huge shapefile fully, which is the typical case of
> someone loading a huge shapefile for the first time
> and wanting to see it all to get a sense of what he's
> looking at.
> However, the result is _not_ the same when rendering
> with antialiasing on. I've attached the two maps
> obtained with and without that optimization.
>
> Whilst the shapefile renderer generated map is showing
> the general shape darn quick, the streaming renderer
> map is more accurate and shows proper density distribution
> (streaming renderer takes 10 times longer to draw it,
> not sure what part of the difference can be associated
> with the bitfield usage).
>
> I guess this optimization has merit anyways when getting
> a quick overview is more important than getting an
> accurate result, but being a trade off, it should
> be optional.
>

Agreed.  But it is really nice when loading a GB shapefile :)
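
The bitfield idea in (e) boils down to something like the following sketch. The class is hypothetical and uses java.util.BitSet; it is not the actual shapefile renderer code.

```java
import java.util.BitSet;

/** Hypothetical sketch of the pixel map used to skip sub-pixel features. */
final class PixelSkipSketch {
    private final int width;
    private final int height;
    private final BitSet painted;

    PixelSkipSketch(int width, int height) {
        this.width = width;
        this.height = height;
        this.painted = new BitSet(width * height);
    }

    /**
     * Call only for features that were already collapsed to a single pixel.
     * Returns true the first time a pixel is hit, false for off-screen pixels
     * and for pixels that some earlier feature has already marked.
     */
    boolean shouldDraw(int px, int py) {
        if (px < 0 || py < 0 || px >= width || py >= height) {
            return false;
        }
        int index = py * width + px;
        if (painted.get(index)) {
            return false;
        }
        painted.set(index);
        return true;
    }
}
```

Since later features landing on a marked pixel are dropped entirely, the density information that antialiasing would otherwise show is lost, which is why this should stay optional.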


>
> (f) imho does not give any speedup: once you've loaded
> the coordinates into a double[], where you do the
> transformation (inside the reader, or inside the
> renderer) should not make a difference.
>

I think I agree :P


> Plan
> -------------------------------------------------------
> Well, the idea is to come up with a number of patches
> that move the optimizations into the
> StreamingRenderer/ShapefileDataStore couple, and allow
> other datastores to benefit from them as well.
> I can do that on a branch, or I can do it on trunk.
> What do people prefer? My preference would be to
> create a stream of incremental patches to be applied
> on trunk, where I can get early feedback from uDig and
> GeoTools on whether these are breaking any expected
> behaviour.
>


I am ok with a stream of clean patches.  I think there is potential for more
upfront testing that way.

