Thanks, but I realised later that just eliminating the duplicate intersections is not enough, because sometimes only part of the intersection is doubled and maybe there are other inaccuracies there as well. (Perhaps the line running along tile border somehow falls "in-between" tiles when re-projected by ST_Intersection?) Anyway, my fix was to detect such lines and intersect them with the original polygon instead of the tiled one. Since there are relatively few of them it doesn't affect the overall time noticeably.

Regards,

Evan

On 11/02/2014 08:39, Rémi Cura wrote:
Hey,
geometry equality can be defined in many ways.


For your duplication problem, it is a simple postgres problem :
you want that for any couple (line, poly), you have at most one result (given polygon are convex, which they are if they are squares).

so at the end of you computing you just add a filtering select :
SELECT DISTINCT ON (line_id,poly_id)

If you don't want random result, you should order your result to know which line of the 4 you get (you could order by length, centroid, first point, whatever.)

WITH (your computing)
SELECT DISTINCT ON (line_id, poly_id) , poly, line
FROM your computing
ORDER BY ST_Length(line) ASC

Now if you want to solve this the PostGIS way it would be harder, as you would need a distance between shape.
(like http://postgis.net/docs/ST_HausdorffDistance.html)

Of course you could also use implicit distance, for example by snapping your result line to a given precision and doing a regular test after (WHERE ST_Equals(ST_SnapToGrid(line1, 0.01),ST_SnapToGrid(line2, 0.01))=TRUE )

Cheers,

Rémi-C




2014-02-10 21:33 GMT+01:00 Evan Martin <[email protected] <mailto:[email protected]>>:

    I've discovered a slight problem with the handy "tiled
    intersection" trick suggested earlier: some of my lines run
    exactly along a meridian or along a parallel and so do the tiles,
    so those intersections get counted twice! For example,
    LINESTRING(-18 14.5,-18 15.5) results in the following
    intersections with a particular tiled polygon

    LINESTRING(-17.9999148001148 14.7502863979935,-17.9998863986648
    15.0000002498516)
    LINESTRING(-18.0000851998852 14.7502863979933,-18.0001136013352
    15.0000002498516)

    LINESTRING(-18.0001136013352 15.0000002498516,-18.0000852013123
    15.2502965758817)
    LINESTRING(-17.9998863986648 15.0000002498516,-17.9999147986877
    15.250296575882)

    Could someone suggest the best (fastest while still accurate) way
    to filter out such duplicates? As you can see, they're not exactly
    the same, so ST_Equals() returns false on them.

    Evan

    On 08/07/2013 17:36, Evan Martin wrote:

        Thanks, Steve, that seems to do the trick. Of course the
        results change a bit, so it's a trade-off of accuracy vs.
        speed. I presume the change is because I do the tiling on the
        plane - ST_Intersection(geom, geom). When I tried doing tiling
        on geography the results changed much more (compared to no
        tiling). Would be interesting to understand why that is. Am I
        doing something wrong? I create a grid of 1x1 degree polygons
        and then do this:

            SELECT poly_id, ST_Intersection(poly_border::geometry,
        tile)::geography AS poly_tile
            FROM my_polygon p
            JOIN world_tile_1 t ON ST_Intersects(p.border::geometry,
        t.tile)

        The intersection with lines is then done on geography, as
        before. I only do this for polygons that don't span the
        dateline (which is 99% of them, luckily).

        Evan

        On 06.07.2013 21 <tel:06.07.2013%2021>:19, Stephen Woodbridge
        wrote:

            The standard way of dealing this this is to chop you
            really large polygons into tiles. Or if the multipolygons
            can be split into multiple individual polygons you might
            get better performance.

            google: postgis tiling large polygons

            if you need the distance that the line intersects the
            multiple tiles or multiple split multipolygons you will
            need to sum() and group on the original id of the split
            object.

            -Steve

            On 7/6/2013 1:10 PM, Evan Martin wrote:

                It's not really "many large things vs many large
                things". Most lines are
                < 100 km long (but there are some over 1000 km).
                Here's a percentile
                chart: https://imageshack.us/a/img16/940/w5s.png

                Most of the polygons are also quite small and simple,
                but there are a
                few really large complex ones. From my testing it
                looks like a few of
                the "worst" polygons (multi-polygons, actually) take
                all the time, so
                that 25,000 count was a bit misleading. 96% of them
                have < 100 points,
                but the worst one has > 23,000. I couldn't get the
                area, because
                ST_Area(geog) is returning some ridiculously high
                numbers, but it would
                be millions of sq km.

                On 06.07.2013 5:48, Paul Ramsey wrote:

                    Without seeing your data it's quite hard to say.
                    Many large things vs
                    many large things yields a problem where indexes
                    and so on don't have
                    a lot of leverage on the problem.

                    On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
                    <[email protected]
                    <mailto:[email protected]>> wrote:

                        Hi,

                        I have tables of ~25,000 polygons and ~80,000
                        lines and I want to
                        find which
                        lines intersect which polygons using PostGIS
                        2.1. Both are
                        geographies and
                        can span the dateline. Doing this the simple
                        way using
                        ST_Intersects(geog,
                        geog) takes about 3 hours on my machine and
                        I'd to see if there's a
                        way to
                        speed this up.

                        I already have indexes on the geography
                        columns and one of them is being
                        used (the one on the lines). Each line only
                        has 2 points, but the
                        polygons
                        have anywhere from 4 to 20,000 points and some
                        of them are very
                        large. It
                        would be OK to miss some of the smaller
                        intersections (ie. where the two
                        only just barely intersect), but I wouldn't
                        want the query to return
                        false
                        positives. In fact, ideally, I'd like to find
                        only the lines that
                        "substantially" intersect a polygon, eg. at
                        least x km or x% of the
                        line is
                        in the polygon, but finding any intersections
                        at all would be a start.

                        One trick I tried is
                        ST_SimplifyPreserveTopology. I used that to create
                        simplified version of the polygons (at least
                        those that don't span the
                        dateline) and check those first, then if they
                        intersect then check
                        the real
                        polygons. This seems to work, but the
                        performance gains are marginal
                        compared to the simple approach.

                        Is there another trick I can use to do this
                        faster? I know
                        ST_Intersects()
                        internally calls ST_Distance(), which
                        calculates the distance to a
                        fraction
                        of a metre. I don't need that kind of
                        precision, so surely there's some
                        "shorcut" to be found?

                        Thanks,

                        Evan


    _______________________________________________
    postgis-users mailing list
    [email protected] <mailto:[email protected]>
    http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users




_______________________________________________
postgis-users mailing list
[email protected]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[email protected]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

Reply via email to