Martin's "intersects != intersection" point is actually extremely relevant here... the latest version on GitHub does an apples-to-oranges comparison.
The initial version before the "fixed methodology" commit compared PreparedGeometry.intersects(Geometry) with Geometry.*intersects*(Geometry), whereas the latest version compares PreparedGeometry.intersects(Geometry) with Geometry.*intersection*(Geometry). Just wanted to point this out... On Sun, Jun 14, 2015 at 5:51 PM, Martin Davis <[email protected]> wrote: > Interesting and very thorough analysis! It's definitely a surprise that > PreparedGeometry drops below raw Geometry for large query sizes. It's > maybe a bit less of a surprise that small PGs don't provide much > improvement. > > It's going to take me a while to fully digest this, but a few initial > comments: > > - In JTS the intersects and intersection operation are quite different > (very much so in semantics, and currently somewhat less so in > implementation). PreparedGeometry currently provides only the intersects > operation. Support for intersection might be forthcoming at some point, > but it's not there now. You might want to adjust your terminology to > reflect this, to avoid confusion. > > - Increasing vertex count by densifying lines *might* be skewing the > results somewhat, since it might defeat the use of monotone chains to > improve indexing. Not sure how easy this is to mitigate - it's hard to > generate large random polygons! Nevertheless, real-world datasets might > show different performance characteristics (hopefully better!). > > - To figure out what's going on it will probably be necessary to profile > the execution for large PGs. Typically in geometric algorithms performance > problems are due to an O(n^2) operation lurking somewhere in the code. > These sometimes don't show up until very large input sizes. > > Thanks for the post - if this provides better guidance on using PG vs Raw, > this will be very helpful. > > > > On Sun, Jun 14, 2015 at 1:38 PM, Chris Bennight <[email protected]> wrote: > >> >> I put together a quick project to try and understand some of the >> trade-offs for prepared vs. non-prepared geometries (when to use a prepared >> geometry, when not to, etc.) >> >> The readme has a summary of the tests and results, with graphics >> https://github.com/chrisbennight/intersection-test >> >> Probably the best summary is this chart: >> >> https://raw.githubusercontent.com/chrisbennight/intersection-test/master/src/main/resources/difference%20-%20prepared%20vs%20non%20prepared%20-%20chart.png >> >> which shows the difference of (prepared geometry intersection time - >> regular geometry intersection time) as a function of the number of points >> in each geometry. >> >> My big takeaway is that once the target geometry starts getting around >> 10k or more points, using a regular geometry intersection may be faster >> than using a prepared geometry intersection. (Not saying 10k is a magic >> number, just using that as a rough estimate of when I might need to start >> caring) >> >> I think a typically use case of a query window (think typically WMS, WFS, >> etc. BBOX query) with 5 points matching against complex geometries in the >> database might very easily see worse performance by creating a new >> PreparedGeometry instance of that 5 point geometry. >> >> Which is completely fair - just trying to understand the trade-offs and >> heuristics for using it. Other strategies - such as in the case above, >> preparing the target geometry and flipping the intersection test (since in >> testing preparing the geometry was orders of magnitude faster than the >> intersection) are also things I'm interested in. >> >> Probably my big request is a little review from the experts - have I >> missed something obvious, done something horribly wrong, or is this a >> reasonable outcome? I'd also be curious if there are any guidelines for >> how/when people use a prepared geometry. It looks like PostGIS might >> employ some heuristics in what and when to prepared the geometry? >> >> Anyway, just looking for a quick vector check as well as any >> thoughts/guidance. Overall I'd like to come up with a heuristic for when >> to used prepared geometry against arbitrary sets of polygons. >> >> >> ------------------------------------------------------------------------------ >> >> _______________________________________________ >> Jts-topo-suite-user mailing list >> [email protected] >> https://lists.sourceforge.net/lists/listinfo/jts-topo-suite-user >> >> > > > ------------------------------------------------------------------------------ > > _______________________________________________ > Jts-topo-suite-user mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/jts-topo-suite-user > >
------------------------------------------------------------------------------
_______________________________________________ Jts-topo-suite-user mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/jts-topo-suite-user
