Jorge,

I would have some faith in Brent's approach - if you run a query like :

select gid, nice_text_name, st_astext(st_centroid(the_geom)) as centroid
from bad_data_set
where st_isValid(the_geom) is false;

You will quickly get a list in a rough order of the magnitude of your problem, as well as the notices of where the self-intersections are. In QGIS there is a Zoom to Point Plugin so you can just put in the value given in the notice, and then zoom straight in and fix the geometry as best you can (or need!).

I recently had to do this for a smallish data set (but plenty of errors), and once you have your list and have done a few, you may be surprised how quickly you can tidy them up.

cheers

Ben






On 08/05/2009, at 6:39 AM, [email protected] wrote:


HIi

I have tried a few ways of dealing with this, including ST_Simplify_PreserveTopology() to try to fix errors.

I have yet to find a solution that automatically resolves such probles as I think they should be. Using some variation of buffer(buffer(geom,-1),1) instead of buffer(geom,0) sometimes fixes things that the simple buffer of zero has failed to.

What I have done that works for me is to use QGIS to edit the geometries that are not fixed by the above operations.

I run isvalid() limit 1, then zoom into the returned coodinates & manually fix the problems, which so far have always been pretty simple & obvious. I then rerun isvalid() & work through each error, It doesn't take long per error, 200 or so can be fixed in a morning without too much trouble.

Also note that an empty geometry is regarded as an error by some applications, but passes isvalid(), so I also do a check for ST_NumGeomteries()=0 to clean up PostGIS datasets.


HTH,

  Brent Wood

--- On Fri, 5/8/09, Jorge Arévalo <[email protected]> wrote:

From: Jorge Arévalo <[email protected]>
Subject: Re: [postgis-users] Problem with ST_Within, Polygons and Multipolygons To: "PostGIS Users Discussion" <postgis- [email protected]>
Date: Friday, May 8, 2009, 9:44 AM
Hello,

Many thanks for your help Kevin. I can confim one thing:
isvalid only returns the first error. Just today, I realized
on the big amount of errors of my data, as you said. Indeed,
I was reading the same link that you provided me about
geometry validity, and I split my multipolygon into single
polygons to check them. My conclusions are the same as
yours: what a mess!



Oh, and thanks again for your response to my original
question. It helped me a lot... Maybe is it time to ask for
new data, without errors? If not, how should I
"clean" my data? I tried with ST_Buffer(geom, 0.0)
(I read in this link 
http://www.bostongis.com/postgis_extent_expand_buffer_distance.snippet
that is useful to repair data with self-interesections), but
doesn't seem to work for me...



Regards
Jorge



On Thu, May 7, 2009 at 6:58 PM,
Kevin Neufeld <[email protected]>
wrote:


Hi Jorge,



So, I had a quick look at your data ... I'm sorry to
say that you've got a lot of problems with your ocean
polygon.

Paul can correct me since he did the work on isvalidreason,
but I think st_isvalid only returns the first thing it finds
wrong with your input geometry.



I did this and found all kinds of things:



-- Extract the POLYGONs from the MULTI

CREATE TABLE water_polys AS

SELECT (ST_Dump(the_geom)).geom AS the_geom

FROM water;



SELECT ST_IsValidReason(the_geom)

FROM water_polys

WHERE NOT ST_IsValid(the_geom);



So, looking at the individual polygons (of which only the
first error is again reported), I found over 66
self-intersections (figure 8's and so forth) and several
whose boundaries loop back and touch itself.



Also, in response to your original question about why a
point on your airport also intersects the ocean polygon: it
turns out that the island in question is not a hole in your
ocean at all ... but a second polygon on top of your ocean
polygon.  This is why your point query returns true - there
are two surfaces (the island and the water) that intersect
the point (which, by the way, also makes an invalid
multipolygon).





A while back I put together some isvalid/issimple geometry
descriptions with soem pretty pictures:

http://postgis.refractions.net/documentation/manual-svn/ch04.html#OGC_Validity



You may want to have a look through it and clean up your
dataset.  The invalidity will most certainly result in more
unexpected behaviour.



Cheers,

Kevin





Jorge Arévalo wrote:


Hello,



I have additional info. When running the method
"isvalid", I get this warning message:



NOTICE:  Self-intersection at or near point -2749.99
4.7955e+06



What is exactly a "self-intersection"? May this
"self-intersection" be the cause of my problem?



Thanks

Jorge







2009/5/6 Jorge Arévalo <[email protected]
<mailto:[email protected]>>





   Hi,



   Thanks for your response Kevin. I can provide some
data:



   - Screenshot (over geoserver) of the
"water" multipolygon. You can

   see the shape of Spain and the Balearic Islands, and
a hole on the

   left (Portugal). The multipolygon is open, isn't
it? -->

   http://www.nebulared.com/tmp_geo/water_multipolygon.jpg



   - Zoom over Menorca (one of the Balearic Islands). As
you can see,

   the point tested is inside the "airports"
multipolygon, but

   ST_Within and ST_Contains returns that the point is
inside both,

   "water" and "airports"
multipolygons -->

   http://www.nebulared.com/tmp_geo/zoom_in_water_multipolygon.jpg



   - The "ST_AsText" data for the
"airport" multipolygon -->

   http://www.nebulared.com/tmp_geo/airports_multipolygon.txt



   - The "ST_AsText" data for the
"water" multipolygon (rar file, 2.5

   MB of plain text...) -->

   http://www.nebulared.com/tmp_geo/water_multipolygon.rar



   So, how can I check the validity of my multipolygons?
Do someone

   could help me with this?



   Now, the query. My context is the next:

   - I have a table with polygons that I want to check
(to say if they

   are in an airport, in a forest, in water, etc). We
can call this

   table T1. I use the centroid of the polygons to
check, not the

   entire polygon.

   - In other different table, that we'll call T2, I
have all the

   multipolygons ("water",
"airports", etc). The name of the field that

   says if the multipolygon is "water" or
"airport" or anything is

   "clutter"



   With this code, I pretend to loop over T1, checking
if the centroid

   of its polygons are inside of one (and ONLY one) of
the

   multipolygons of T2.  (Is part of a procedure)



   (...)

   BEGIN

     FOR s IN SELECT * FROM T1 LOOP

           select clutter from T2 where

   ST_Within(ST_Centroid(T1.wkb_geometry),
T2.wkb_geometry))

           (...)

     END LOOP;

     RETURN;

   END;



   Most of the times, the select inside the loop,
returns more than one

   result ("water" and other one, normally).
That's my problem



   I have another version of the query, with Java.
Again, I loop over

   the table T1, and check the centroid of its polygons
with all the

   multipolygons (water, airports...) that are in T2.
Then, my query,

   inside the loop, is:



   SELECT clutter FROM T2 WHERE ST_Within(?,
T2.wkb_geometry)



   The '?' is replaced with the centroid of the
polygon in T1, of

   course. Basically, is the same idea of the previous
SQL code, but

   with Java.



   Apart from this, other problem is that ST_Within
takes a long time.

   1 sec per query. And I have like 2 millions of
polygons (queries) to

   check...



   My multipolygons' table (T2) has only 29 rows
(small table with

   large geometries), and I found this

   http://postgis.refractions.net/documentation/manual-1.3/ch05.html.

   Seems to be good for me. But using "SET
enable_seqscan TO off", the

   queries take the same time (~ 1 sec). Maybe the other
approach

   (create an additional column that "caches"
the bbox, and matching

   against this) could help me. Any ideas on how to
create this bbox

   with my huge multipolygons?





   Thanks in advance



   Regards

   Jorge





   On Wed, May 6, 2009 at 4:28 AM, Kevin Neufeld

   <[email protected]
<mailto:[email protected]>>
wrote:



       Jorge Arévalo wrote:



           ... If is useful, I tested the method
"ST_isvalid" with the

           multipolygon and returns
"false". Maybe the multipolygon is

           not closed? I loaded the data from a
shapefile. Is it

           possible to create a
"non-valid" multipolygon? Does PostGIS

           accept this?





       Ah, yes.  Most spatial predicates in PostGIS
assume the input

       geometry is valid.  Suppose you had defined a
POLYGON with a

       hole or inner ring outside of the exterior
ring.  What is the

       area?  Does the question even make sense?
PostGIS allows

       invalid geometries in the database so users can
make full use of

       the PostGIS toolset to do whatever they need
(ie. breakdown a

       polygon to it's constituent linework and
rebuild it back up

       again to a valid polygon)





           Ok, being even more specific. I'm
working with data about

           Spain. I have a HUGE multipolygon that
represents "water"

           (this is, the coasts around Spain and its
islands). Then,

           the "holes" inside this
multypolygon have the shape of

           Spain, Balearic Islands and Canary
Islands. Of course, I

           have more multipolygons that represent
"forests",

           "airports", "cities",
etc, that fit into these holes.



           Really, my problem is with some points
that belong to an

           airport in an island. Using
"ST_Within" and "ST_Contains",

           the result is that these points belong to
the multipolygon

           "airport" and multipolygon
"water" at same time. Obviously,

           the island (and its airport) is
surrounded by water, but the

           airport's points shouldn't be
part of the multopolygon

           "water". And, as I said, when I
apply "ST_isvalid" to the

           multipolygon "water", returns
false. Maybe is not closed?





       Yeah, as mentioned before, a point that is not
on the surface of

       a (multi)polygon (whether in a hole or
completely outside) is

       not considered within the (multi)polygon.  It
does sound like

       your ocean polygon has validity issues.







           Oh, btw, what's the difference
between "ST_Within" and

           "Within". Does
"ST_Within" use index instead of geometry? Am

           I right?



       No, not instead of.  Both use the actual
geometry for testing

       within.  ST_Within will also use the index to
narrow down the

       candidate list first.

       As you can see, the definition of ST_Within is
just a simple SQL

       wrapper that first invokes the index.

       postgis=# select prosrc from pg_proc where
proname = 'st_within';

                     prosrc
         ---------------------------------------

       SELECT $1 && $2 AND _ST_Within($1,$2)

       (1 row)



       http://postgis.refractions.net/documentation/manual-svn/ST_Within.html



       Cheers,

       Kevin






_______________________________________________

       postgis-users mailing list

       [email protected]

       <mailto:[email protected]>

       http://postgis.refractions.net/mailman/listinfo/postgis-users









------------------------------------------------------------------------



_______________________________________________

postgis-users mailing list

[email protected]

http://postgis.refractions.net/mailman/listinfo/postgis-users


_______________________________________________

postgis-users mailing list

[email protected]

http://postgis.refractions.net/mailman/listinfo/postgis-users




-----Inline Attachment Follows-----

_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

--

Ben Madin
REMOTE INFORMATION

t : +61 8 9192 5455
f : +61 8 9192 5535
m : 0448 887 220
Broome   WA   6725

[email protected]



                                                        Out here, it pays to 
know...


_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

Reply via email to