Hello, Of course I have faith, I need it :-). But, to put you on context: I have a table with 29 rows. Each row has a column with a kind of terrain (as a string) and other column with a multipolygon. Each multipolygon has thousands of polygons. For example, the multipolygon "water" has 8872.
Ok, After running ST_IsValid in the entire table, 27 of 29 multipolygons has errors. ST_IsValid only informs about the first error. For these reasons I think that fixing the errors manually could take a really LONG time. Maybe applying ST_Buffer can reduce the amount of errors. I'll try this option, and the Kevin's option too. It's gonna be a hard work... But I'll keep you informed. Regards Jorge On Fri, May 8, 2009 at 2:24 AM, Ben Madin <[email protected]>wrote: > Jorge, > > I would have some faith in Brent's approach - if you run a query like : > > select gid, nice_text_name, st_astext(st_centroid(the_geom)) as centroid > from bad_data_set > where st_isValid(the_geom) is false; > > You will quickly get a list in a rough order of the magnitude of your > problem, as well as the notices of where the self-intersections are. In QGIS > there is a Zoom to Point Plugin so you can just put in the value given in > the notice, and then zoom straight in and fix the geometry as best you can > (or need!). > > I recently had to do this for a smallish data set (but plenty of errors), > and once you have your list and have done a few, you may be surprised how > quickly you can tidy them up. > > cheers > > Ben > > > > > > > > On 08/05/2009, at 6:39 AM, [email protected] wrote: > > >> HIi >> >> I have tried a few ways of dealing with this, including >> ST_Simplify_PreserveTopology() to try to fix errors. >> >> I have yet to find a solution that automatically resolves such probles as >> I think they should be. Using some variation of buffer(buffer(geom,-1),1) >> instead of buffer(geom,0) sometimes fixes things that the simple buffer of >> zero has failed to. >> >> What I have done that works for me is to use QGIS to edit the geometries >> that are not fixed by the above operations. >> >> I run isvalid() limit 1, then zoom into the returned coodinates & manually >> fix the problems, which so far have always been pretty simple & obvious. I >> then rerun isvalid() & work through each error, It doesn't take long per >> error, 200 or so can be fixed in a morning without too much trouble. >> >> Also note that an empty geometry is regarded as an error by some >> applications, but passes isvalid(), so I also do a check for >> ST_NumGeomteries()=0 to clean up PostGIS datasets. >> >> >> HTH, >> >> Brent Wood >> >> --- On Fri, 5/8/09, Jorge Arévalo <[email protected]> wrote: >> >> From: Jorge Arévalo <[email protected]> >>> Subject: Re: [postgis-users] Problem with ST_Within, Polygons and >>> Multipolygons >>> To: "PostGIS Users Discussion" <[email protected]> >>> Date: Friday, May 8, 2009, 9:44 AM >>> Hello, >>> >>> Many thanks for your help Kevin. I can confim one thing: >>> isvalid only returns the first error. Just today, I realized >>> on the big amount of errors of my data, as you said. Indeed, >>> I was reading the same link that you provided me about >>> geometry validity, and I split my multipolygon into single >>> polygons to check them. My conclusions are the same as >>> yours: what a mess! >>> >>> >>> >>> Oh, and thanks again for your response to my original >>> question. It helped me a lot... Maybe is it time to ask for >>> new data, without errors? If not, how should I >>> "clean" my data? I tried with ST_Buffer(geom, 0.0) >>> (I read in this link >>> http://www.bostongis.com/postgis_extent_expand_buffer_distance.snippet >>> that is useful to repair data with self-interesections), but >>> doesn't seem to work for me... >>> >>> >>> >>> Regards >>> Jorge >>> >>> >>> >>> On Thu, May 7, 2009 at 6:58 PM, >>> Kevin Neufeld <[email protected]> >>> wrote: >>> >>> >>> Hi Jorge, >>> >>> >>> >>> So, I had a quick look at your data ... I'm sorry to >>> say that you've got a lot of problems with your ocean >>> polygon. >>> >>> Paul can correct me since he did the work on isvalidreason, >>> but I think st_isvalid only returns the first thing it finds >>> wrong with your input geometry. >>> >>> >>> >>> I did this and found all kinds of things: >>> >>> >>> >>> -- Extract the POLYGONs from the MULTI >>> >>> CREATE TABLE water_polys AS >>> >>> SELECT (ST_Dump(the_geom)).geom AS the_geom >>> >>> FROM water; >>> >>> >>> >>> SELECT ST_IsValidReason(the_geom) >>> >>> FROM water_polys >>> >>> WHERE NOT ST_IsValid(the_geom); >>> >>> >>> >>> So, looking at the individual polygons (of which only the >>> first error is again reported), I found over 66 >>> self-intersections (figure 8's and so forth) and several >>> whose boundaries loop back and touch itself. >>> >>> >>> >>> Also, in response to your original question about why a >>> point on your airport also intersects the ocean polygon: it >>> turns out that the island in question is not a hole in your >>> ocean at all ... but a second polygon on top of your ocean >>> polygon. This is why your point query returns true - there >>> are two surfaces (the island and the water) that intersect >>> the point (which, by the way, also makes an invalid >>> multipolygon). >>> >>> >>> >>> >>> >>> A while back I put together some isvalid/issimple geometry >>> descriptions with soem pretty pictures: >>> >>> >>> http://postgis.refractions.net/documentation/manual-svn/ch04.html#OGC_Validity >>> >>> >>> >>> You may want to have a look through it and clean up your >>> dataset. The invalidity will most certainly result in more >>> unexpected behaviour. >>> >>> >>> >>> Cheers, >>> >>> Kevin >>> >>> >>> >>> >>> >>> Jorge Arévalo wrote: >>> >>> >>> Hello, >>> >>> >>> >>> I have additional info. When running the method >>> "isvalid", I get this warning message: >>> >>> >>> >>> NOTICE: Self-intersection at or near point -2749.99 >>> 4.7955e+06 >>> >>> >>> >>> What is exactly a "self-intersection"? May this >>> "self-intersection" be the cause of my problem? >>> >>> >>> >>> Thanks >>> >>> Jorge >>> >>> >>> >>> >>> >>> >>> >>> 2009/5/6 Jorge Arévalo <[email protected] >>> <mailto:[email protected]>> >>> >>> >>> >>> >>> >>> Hi, >>> >>> >>> >>> Thanks for your response Kevin. I can provide some >>> data: >>> >>> >>> >>> - Screenshot (over geoserver) of the >>> "water" multipolygon. You can >>> >>> see the shape of Spain and the Balearic Islands, and >>> a hole on the >>> >>> left (Portugal). The multipolygon is open, isn't >>> it? --> >>> >>> http://www.nebulared.com/tmp_geo/water_multipolygon.jpg >>> >>> >>> >>> - Zoom over Menorca (one of the Balearic Islands). As >>> you can see, >>> >>> the point tested is inside the "airports" >>> multipolygon, but >>> >>> ST_Within and ST_Contains returns that the point is >>> inside both, >>> >>> "water" and "airports" >>> multipolygons --> >>> >>> http://www.nebulared.com/tmp_geo/zoom_in_water_multipolygon.jpg >>> >>> >>> >>> - The "ST_AsText" data for the >>> "airport" multipolygon --> >>> >>> http://www.nebulared.com/tmp_geo/airports_multipolygon.txt >>> >>> >>> >>> - The "ST_AsText" data for the >>> "water" multipolygon (rar file, 2.5 >>> >>> MB of plain text...) --> >>> >>> http://www.nebulared.com/tmp_geo/water_multipolygon.rar >>> >>> >>> >>> So, how can I check the validity of my multipolygons? >>> Do someone >>> >>> could help me with this? >>> >>> >>> >>> Now, the query. My context is the next: >>> >>> - I have a table with polygons that I want to check >>> (to say if they >>> >>> are in an airport, in a forest, in water, etc). We >>> can call this >>> >>> table T1. I use the centroid of the polygons to >>> check, not the >>> >>> entire polygon. >>> >>> - In other different table, that we'll call T2, I >>> have all the >>> >>> multipolygons ("water", >>> "airports", etc). The name of the field that >>> >>> says if the multipolygon is "water" or >>> "airport" or anything is >>> >>> "clutter" >>> >>> >>> >>> With this code, I pretend to loop over T1, checking >>> if the centroid >>> >>> of its polygons are inside of one (and ONLY one) of >>> the >>> >>> multipolygons of T2. (Is part of a procedure) >>> >>> >>> >>> (...) >>> >>> BEGIN >>> >>> FOR s IN SELECT * FROM T1 LOOP >>> >>> select clutter from T2 where >>> >>> ST_Within(ST_Centroid(T1.wkb_geometry), >>> T2.wkb_geometry)) >>> >>> (...) >>> >>> END LOOP; >>> >>> RETURN; >>> >>> END; >>> >>> >>> >>> Most of the times, the select inside the loop, >>> returns more than one >>> >>> result ("water" and other one, normally). >>> That's my problem >>> >>> >>> >>> I have another version of the query, with Java. >>> Again, I loop over >>> >>> the table T1, and check the centroid of its polygons >>> with all the >>> >>> multipolygons (water, airports...) that are in T2. >>> Then, my query, >>> >>> inside the loop, is: >>> >>> >>> >>> SELECT clutter FROM T2 WHERE ST_Within(?, >>> T2.wkb_geometry) >>> >>> >>> >>> The '?' is replaced with the centroid of the >>> polygon in T1, of >>> >>> course. Basically, is the same idea of the previous >>> SQL code, but >>> >>> with Java. >>> >>> >>> >>> Apart from this, other problem is that ST_Within >>> takes a long time. >>> >>> 1 sec per query. And I have like 2 millions of >>> polygons (queries) to >>> >>> check... >>> >>> >>> >>> My multipolygons' table (T2) has only 29 rows >>> (small table with >>> >>> large geometries), and I found this >>> >>> http://postgis.refractions.net/documentation/manual-1.3/ch05.html. >>> >>> Seems to be good for me. But using "SET >>> enable_seqscan TO off", the >>> >>> queries take the same time (~ 1 sec). Maybe the other >>> approach >>> >>> (create an additional column that "caches" >>> the bbox, and matching >>> >>> against this) could help me. Any ideas on how to >>> create this bbox >>> >>> with my huge multipolygons? >>> >>> >>> >>> >>> >>> Thanks in advance >>> >>> >>> >>> Regards >>> >>> Jorge >>> >>> >>> >>> >>> >>> On Wed, May 6, 2009 at 4:28 AM, Kevin Neufeld >>> >>> <[email protected] >>> <mailto:[email protected]>> >>> wrote: >>> >>> >>> >>> Jorge Arévalo wrote: >>> >>> >>> >>> ... If is useful, I tested the method >>> "ST_isvalid" with the >>> >>> multipolygon and returns >>> "false". Maybe the multipolygon is >>> >>> not closed? I loaded the data from a >>> shapefile. Is it >>> >>> possible to create a >>> "non-valid" multipolygon? Does PostGIS >>> >>> accept this? >>> >>> >>> >>> >>> >>> Ah, yes. Most spatial predicates in PostGIS >>> assume the input >>> >>> geometry is valid. Suppose you had defined a >>> POLYGON with a >>> >>> hole or inner ring outside of the exterior >>> ring. What is the >>> >>> area? Does the question even make sense? >>> PostGIS allows >>> >>> invalid geometries in the database so users can >>> make full use of >>> >>> the PostGIS toolset to do whatever they need >>> (ie. breakdown a >>> >>> polygon to it's constituent linework and >>> rebuild it back up >>> >>> again to a valid polygon) >>> >>> >>> >>> >>> >>> Ok, being even more specific. I'm >>> working with data about >>> >>> Spain. I have a HUGE multipolygon that >>> represents "water" >>> >>> (this is, the coasts around Spain and its >>> islands). Then, >>> >>> the "holes" inside this >>> multypolygon have the shape of >>> >>> Spain, Balearic Islands and Canary >>> Islands. Of course, I >>> >>> have more multipolygons that represent >>> "forests", >>> >>> "airports", "cities", >>> etc, that fit into these holes. >>> >>> >>> >>> Really, my problem is with some points >>> that belong to an >>> >>> airport in an island. Using >>> "ST_Within" and "ST_Contains", >>> >>> the result is that these points belong to >>> the multipolygon >>> >>> "airport" and multipolygon >>> "water" at same time. Obviously, >>> >>> the island (and its airport) is >>> surrounded by water, but the >>> >>> airport's points shouldn't be >>> part of the multopolygon >>> >>> "water". And, as I said, when I >>> apply "ST_isvalid" to the >>> >>> multipolygon "water", returns >>> false. Maybe is not closed? >>> >>> >>> >>> >>> >>> Yeah, as mentioned before, a point that is not >>> on the surface of >>> >>> a (multi)polygon (whether in a hole or >>> completely outside) is >>> >>> not considered within the (multi)polygon. It >>> does sound like >>> >>> your ocean polygon has validity issues. >>> >>> >>> >>> >>> >>> >>> >>> Oh, btw, what's the difference >>> between "ST_Within" and >>> >>> "Within". Does >>> "ST_Within" use index instead of geometry? Am >>> >>> I right? >>> >>> >>> >>> No, not instead of. Both use the actual >>> geometry for testing >>> >>> within. ST_Within will also use the index to >>> narrow down the >>> >>> candidate list first. >>> >>> As you can see, the definition of ST_Within is >>> just a simple SQL >>> >>> wrapper that first invokes the index. >>> >>> postgis=# select prosrc from pg_proc where >>> proname = 'st_within'; >>> >>> prosrc >>> --------------------------------------- >>> >>> SELECT $1 && $2 AND _ST_Within($1,$2) >>> >>> (1 row) >>> >>> >>> >>> >>> http://postgis.refractions.net/documentation/manual-svn/ST_Within.html >>> >>> >>> >>> Cheers, >>> >>> Kevin >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> >>> postgis-users mailing list >>> >>> [email protected] >>> >>> <mailto:[email protected]> >>> >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> ------------------------------------------------------------------------ >>> >>> >>> >>> _______________________________________________ >>> >>> postgis-users mailing list >>> >>> [email protected] >>> >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> _______________________________________________ >>> >>> postgis-users mailing list >>> >>> [email protected] >>> >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> >>> >>> >>> -----Inline Attachment Follows----- >>> >>> _______________________________________________ >>> postgis-users mailing list >>> [email protected] >>> http://postgis.refractions.net/mailman/listinfo/postgis-users >>> >>> _______________________________________________ >> postgis-users mailing list >> [email protected] >> http://postgis.refractions.net/mailman/listinfo/postgis-users >> > > -- > > Ben Madin > REMOTE INFORMATION > > t : +61 8 9192 5455 > f : +61 8 9192 5535 > m : 0448 887 220 > Broome WA 6725 > > [email protected] > > > > Out here, it pays to > know... > > > > _______________________________________________ > postgis-users mailing list > [email protected] > http://postgis.refractions.net/mailman/listinfo/postgis-users >
_______________________________________________ postgis-users mailing list [email protected] http://postgis.refractions.net/mailman/listinfo/postgis-users
