Yep, maybe something like UPDATE ukmajrdbuffer SET the_geom = ST_MakeValid(the_geom) WHERE ST_IsValid(the_geom) = FALSE
Cheers, Rémi-C 2013/11/15 James David Smith <[email protected]> > Hi Remi, > > Thanks for your continued guidance with this big data problem. I > wasn't in the office yesterday so am looking at it again now. I've > just ran the below below query running, and will let you know how many > rows there are once it's finished. > > CREATE TABLE lines_for_each_road AS > WITH all_lines AS ( > SELECT * > FROM all_lines > ) > SELECT row_number() over() AS id--,* > FROM ukmajrdbuffer, all_lines > WHERE ST_Intersects(ukmajrdbuffer.geom, all_lines.the_geom)=TRUE; > > In the meantime however, I have noticed something else that is a > worry. I did this: > > SELECT st_isvalid(geom) FROM ukmajrdbuffer; > > NOTICE: Ring Self-intersection at or near point 367541.09074241668 > 311890.25257437676 > NOTICE: Ring Self-intersection at or near point 209901.50000000186 > 706890.50000000373 > NOTICE: Ring Self-intersection at or near point 444961.46434461698 > 161217.66612910852 > NOTICE: Ring Self-intersection at or near point 559818.64931676909 > 103089.14606037363 > NOTICE: Ring Self-intersection at or near point 420114.82424666081 > 298567.38722311892 > st_isvalid > ------------ > f > f > t > f > f > f > t > t > (8 rows) > > I guess that I should try to fix these geometries first before we > attempt to continue any further right? > > Thanks > > James > > > > > > > > On 13 November 2013 18:04, Rémi Cura <[email protected]> wrote: > > All lines should be in one table, you can add a column to differentiate > > between line going SN and EW, > > this is what the query I wrote does. > > > > Not changing your code much, it gives for the line table : > > > > CREATE TABLE all_lines tablespace jamestabs AS > > ( SELECT st_setsrid(st_makeline(st_makepoint(x, 7780), st_makepoint(x, > > 1226580)),27700) as the_geom, 'SN' AS direction > > > > FROM generate_series(53320::int, 667380::int, 20) as x) > > UNION > > ( SELECT st_setsrid(st_makeline(st_makepoint(53320, y), > > st_makepoint(667380, y)),27700) as the_geom, 'EW' AS direction > > > > FROM generate_series(7780::int, 1226580::int, 20) as y) > > > > then building index on it. > > > > > > "-- Merge the road buffer polygons into one big polygon > > -- Should maybe consider leaving them as they are, but am merging for > now. > > SELECT st_union(geom) as geom INTO union_roads FROM ukmajrdbuffer;" > > You must not do it ! > > As I told you "blabla index blabla". > > > > instead, create an index on table ukmajrdbuffer, and use it without > > unioning. > > > > Then you query is not going to work at all ... (or *very very* slowly) > > "FROM east_to_west, north_to_south, buff_union_roads" : it potentially > > generates 1000*50*1000*50 * number_of_road rows to filter, very bad idea, > > you'll get the same problem as with points... > > > > (please see some example on internet to understand WITH, it is simple , > > useful, and makes query much more clear) > > > > First you could try : > > > > CREATE TABLE lines_for_each_road tablespace jamestabs AS > > WITH all_lines AS ( --this subquerry gets all the lines > > > > SELECT * > > FROM all_lines > > > > ), > > SELECT row_number() over() AS id--,* > > FROM ukmajrdbuffer, all_lines > > WHERE ST_Intersects(ukmajrdbuffer.the_geom, all_lines.the_geom)=TRUE; > > > > This will give you (if you uncomment the "--,*", ), for every road, all > the > > lines intersecting the road > > example : > > road_1 | line_2_EW > > road_1 | line_3_EW > > road_1 | line_78_SN > > road_1 | line_7_EW > > road_2 ... > > > > How long is it (don't uncomment plz), and how many rows do you get > (select > > count(*) from lines_for_each_road )? > > > > If it is not too long, we will go on and "cut" the lines so that for > every > > road, we keep the part of the lines that are inside the road_buffer. > > (I have to leave now, tomorrow) > > > > Cheers, > > Rémi-C > > > > > > > > > > > > 2013/11/13 James David Smith <[email protected]> > >> > >> Hey Remi, > >> > >> I don't quite get what the query you gave does to be honest. But > >> working with some of the ideas and notions you gave me I have put the > >> below together. The final bit of the code is the key query. The tables > >> of NS and EW lines were done very quickly. Nice. But now doing the > >> 'intersects within polygon' is taking some time so I'll see how it > >> gets on when I get in to work tomorrow morning: > >> > >> DROP TABLE IF EXISTS north_to_south; > >> DROP TABLE IF EXISTS east_to_west; > >> > >> -- Make a table of north to south lines > >> CREATE TABLE north_to_south tablespace jamestabs AS > >> ( SELECT st_setsrid(st_makeline(st_makepoint(x, 7780), st_makepoint(x, > >> 1226580)),27700) as the_geom > >> FROM generate_series(53320::int, 667380::int, 20) as x); > >> > >> -- Add an index > >> CREATE INDEX north_to_south_geom_index ON north_to_south USING GIST ( > >> the_geom ); > >> > >> -- Make a table of east to west lines > >> CREATE TABLE east_to_west tablespace jamestabs AS > >> ( SELECT st_setsrid(st_makeline(st_makepoint(53320, y), > >> st_makepoint(667380, y)),27700) as the_geom > >> FROM generate_series(7780::int, 1226580::int, 20) as y); > >> > >> -- Add an index > >> CREATE INDEX east_to_west_geom_index ON east_to_west USING GIST ( > the_geom > >> ); > >> > >> -- Import the Road Buffer file using the pg_loader interface > >> > >> -- Update the geometry field as the loader didn't seem to do this. > >> SELECT UpdateGeometrySRID('ukmajrdbuffer','geom',27700); > >> > >> -- Merge the road buffer polygons into one big polygon > >> -- Should maybe consider leaving them as they are, but am merging for > now. > >> SELECT st_union(geom) as geom INTO union_roads FROM ukmajrdbuffer; > >> > >> -- The unioned roads file isn't valid geom. So I simplify it a little > >> to make it valid. > >> -- We simplify with a 70 metres factor. Geom is now valid. > >> SELECT ST_SimplifyPreserveTopology(geom, 70) AS geom INTO > >> buff_union_roads FROM union_roads; > >> > >> -- Now calculate the points where the lines intersect and also where > >> they are within the polygon. > >> CREATE TABLE gridpoints tablespace jamestabs > >> AS (SELECT ST_Intersection(east_to_west.the_geom, > north_to_south.the_geom) > >> FROM east_to_west, north_to_south, buff_union_roads > >> WHERE st_within(buff_union_roads.geom, > >> ST_Intersection(east_to_west.the_geom, north_to_south.the_geom)) = > >> TRUE); > >> > >> Thanks > >> > >> James > >> > >> > >> > >> > >> On 13 November 2013 13:33, Rémi Cura <[email protected]> wrote: > >> > Using the lines you exactly do the same as with points. > >> > The lines will be spaced exactly by 20 meters, end and start being > >> > exactly > >> > on the 20 meters grid, line being either perfectly veritcla or > perfectly > >> > horizontal. > >> > So intersections are guaranteed to be on the 20 meter grid, because > this > >> > is > >> > the definition of a 20 meter grid =) . > >> > > >> > > >> > Here is a not optimal way to do it : > >> > For instance if UK is enclosed in a bouding box (xmin,ymin,xmax,ymax), > >> > where > >> > xmin-max and ymin-max are rounded to the nearest 20 meters multiple > (ie > >> > x or > >> > y % 20 = 0) > >> > (Please note that this is not clever because we generate too much SN > >> > lines, > >> > GB being only 500 km wide at most, so we would need to filter, but you > >> > don't > >> > care, following would take less than a minute anyway, and you need to > do > >> > it > >> > only once). > >> > > >> > WITH minmax AS ( > >> > SELECT xmin,ymin,xmax,ymax, bbox_of_england > >> > ), > >> > WITH series AS ( > >> > SELECT s > >> > FROM generate_series(0,1000*(1000/20)::bigint,20) AS s --casting to > >> > bigint > >> > to be safe against going over int limit > >> > ), > >> > WITH lines AS ( > >> > SELECT ST_MakeLine > (ST_MakePoint(xmin+s,ymin),ST_MakePoint(xmin+s,ymax)) > >> > AS > >> > lineSN, > >> > ST_MakeLine(ST_MakePoint(xmin, ymin+s), ST_MakePoint(xmax,ymin+s) AS > >> > lineEW > >> > FROM series, minmax > >> > ), > >> > unioned_lines AS ( > >> > > >> > SELECT lineSN AS line, 'SN' as direction > >> > FROM lines,minmax > >> > WHERE ST_Interesects(lineSN,bbox_of_england)= TRUE > >> > > >> > UNION > >> > > >> > SELECT lineEW AS line, 'EW' AS direction > >> > FFROM lines,minmax > >> > WHERE ST_Interesects(lineEW,bbox_of_england)= TRUE > >> > > >> > ) > >> > SELECT row_number() over() AS uid, line, direction > >> > FROM unioned_line > >> > > >> > then creating table and index on geom and on direction (and possibly > on > >> > uid, > >> > but better create a primary key then if you use it) > >> > > >> > Of course I didn't test this query, but you should be able to use it > >> > easily. > >> > > >> > Cheers, > >> > Rémi-C > >> > > >> > > >> > > >> > > >> > 2013/11/13 James David Smith <[email protected]> > >> >> > >> >> Hi Remi, > >> >> > >> >> Apologies. One more thing which I am not sure about. We need the > >> >> final grid of points that we make to be multiples of 20. For example: > >> >> > >> >> 53320, 7780 = Yes > >> >> 53323, 7780 = No > >> >> 53320, 7794 = No > >> >> 53321, 7754 = No > >> >> > >> >> That is why in the original function that we wrote, we put the > numbers > >> >> in as below. > >> >> > >> >> SELECT ST_Collect(st_setsrid(ST_POINT(x,y),$3)) FROM > >> >> generate_series(53320::int, 667380::int,$2) as x > >> >> ,generate_series(7780::int, 1226580::int,$2) as y > >> >> where st_intersects($1,st_setsrid(ST_POINT(x,y),$3))' > >> >> > >> >> We did this because the points in the grid have to 'align' with the > >> >> values of mutliple 20. > >> >> > >> >> If we do the N-S and E-W lines solution that you suggest, I don't > >> >> think that this will work will it? > >> >> > >> >> Thanks > >> >> > >> >> James > >> >> > >> >> > >> >> On 13 November 2013 11:49, James David Smith > >> >> <[email protected]> wrote: > >> >> > Hi Remi, > >> >> > > >> >> > Thanks so much for this detailed response. Your idea about creating > >> >> > the lines and the only storing where they intersect and are within > >> >> > the > >> >> > polygons is a great idea. I'm going to give that a go now. > >> >> > > >> >> > Thanks again, > >> >> > > >> >> > James > >> >> > > >> >> > On 13 November 2013 11:41, Rémi Cura <[email protected]> wrote: > >> >> >> This would be an improvement but still non efficient. > >> >> >> > >> >> >> you have 2 possibilities, supposing that what you want is points > 20 > >> >> >> meter > >> >> >> spaced in all road_buf : > >> >> >> > >> >> >> either you compute for each road_buffer the points inside, one > road > >> >> >> at > >> >> >> a > >> >> >> time ( figuratively ). > >> >> >> > >> >> >> This means you write a function which generates points 20-meter > >> >> >> spaced > >> >> >> in > >> >> >> the bounding box of the road you are working on, and keep those in > >> >> >> the > >> >> >> real > >> >> >> road buffer, and group result points as a multipoints (for > cosmetic > >> >> >> purpose) > >> >> >> . > >> >> >> > >> >> >> You would then use it like this : > >> >> >> > >> >> >> SELECT road_id, points_insidea_road(the_geom) AS > >> >> >> my_points_inside_the_road > >> >> >> FROM road_polygons_table > >> >> >> > >> >> >> You would have as output a line per road with a multipoint > >> >> >> containing > >> >> >> all > >> >> >> the point 20 meter spaced inside the road. > >> >> >> > >> >> >> Or (what you wrote) you generate all points and keep those > >> >> >> intersecting > >> >> >> one > >> >> >> or many road. > >> >> >> > >> >> >> The first one is mandatory, because it avoids to manipulate > >> >> >> (incredibly) big > >> >> >> table of all points spaced by 20 meters for UK (around 500 * 10^6 > >> >> >> points ! ) > >> >> >> Even with indexes it's not a good idea to use such number of rows. > >> >> >> > >> >> >> That's the first point (write a function working for one road, > then > >> >> >> use > >> >> >> it > >> >> >> for all road). > >> >> >> > >> >> >> The second point is the way you compute is very inefficient. If > your > >> >> >> road is > >> >> >> going from south-West to NorthEast, you will generate a very big > >> >> >> number > >> >> >> of > >> >> >> points, and very few will be in the road_buffer. This is > problematic > >> >> >> as > >> >> >> for > >> >> >> a road of only 20 kms, you may generate as many as 100k points and > >> >> >> keep > >> >> >> only > >> >> >> few of them. If you want to process hundreds of ks or roads it > will > >> >> >> become > >> >> >> very problematic. Also you would have to generate points each > time. > >> >> >> > >> >> >> So here is what I suggest you : change your strategy : > >> >> >> instead of generating all point in bounding box and then keeping > >> >> >> only > >> >> >> those > >> >> >> in road_buffer, > >> >> >> generate a line every 20 meters going North south and an line > every > >> >> >> 20 > >> >> >> meters going East-West , then use the function ST_Intersection to > >> >> >> keep > >> >> >> only > >> >> >> part of this lines being inside the road_polygon, then you have > the > >> >> >> points > >> >> >> inside road_polygons as the intersections of these EW lines with > the > >> >> >> SN > >> >> >> lines. > >> >> >> > >> >> >> It will be very efficient because you can create a table with all > >> >> >> the > >> >> >> lines > >> >> >> going East-West and South-North for great britain (about 25k + 50k > >> >> >> lines), > >> >> >> and build index on it (index on geom and on the column saying if > it > >> >> >> is > >> >> >> SN or > >> >> >> EW). > >> >> >> > >> >> >> The trick is the number of lines is around 500km * 50 line/km + > >> >> >> 1000km > >> >> >> * 50 > >> >> >> line/km , where the number of points is 500km * 50 line/km * > 1000km > >> >> >> * > >> >> >> 50 > >> >> >> line/km > >> >> >> > >> >> >> Hope it helps, > >> >> >> > >> >> >> Cheers, > >> >> >> Rémi-C > >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> 2013/11/13 James David Smith <[email protected]> > >> >> >>> > >> >> >>> Hey Remi, > >> >> >>> > >> >> >>> Thanks for your reply. So in your mind you think we should have a > >> >> >>> database of say 300 polygons, and then we run a command like this > >> >> >>> right? > >> >> >>> > >> >> >>> SELECT > >> >> >>> ST_Collect(st_setsrid(ST_POINT(x,y),27700)) > >> >> >>> FROM > >> >> >>> generate_series(53320::int, 667380::int,20) as x, > >> >> >>> generate_series(7780::int, 1226580::int,20) as y, > >> >> >>> road_polygons_table > >> >> >>> WHERE > >> >> >>> st_intersects(road_polygons_table.the_geom, > >> >> >>> st_setsrid(ST_POINT(x,y),27700)) > >> >> >>> > >> >> >>> What do you think? > >> >> >>> > >> >> >>> Thanks > >> >> >>> > >> >> >>> James > >> >> >>> > >> >> >>> > >> >> >>> > >> >> >>> > >> >> >>> On 11 November 2013 14:51, Rémi Cura <[email protected]> > wrote: > >> >> >>> > Hey, > >> >> >>> > the whole point on using a sgbds like postgis is using index. > >> >> >>> > > >> >> >>> > If you have one line you don't use indexes... > >> >> >>> > > >> >> >>> > So in short, don't make one polygon with a buffer of all the > >> >> >>> > road, > >> >> >>> > but a > >> >> >>> > table with a line for the buffer for every road, then do you > >> >> >>> > computation > >> >> >>> > to > >> >> >>> > create grid of points inside of polygons, then union the result > >> >> >>> > of > >> >> >>> > points! > >> >> >>> > > >> >> >>> > And it s always a bad idea to run a function on big data when > you > >> >> >>> > have > >> >> >>> > not > >> >> >>> > tested it fully (including scaling behavior) on small data. > >> >> >>> > > >> >> >>> > > >> >> >>> > Cheers > >> >> >>> > Rémi-C > >> >> >>> > > >> >> >>> > > >> >> >>> > 2013/11/11 James David Smith <[email protected]> > >> >> >>> >> > >> >> >>> >> Hi all, > >> >> >>> >> > >> >> >>> >> Would appreciate some advice on the best way to accomplish > this > >> >> >>> >> please. > >> >> >>> >> > >> >> >>> >> Our situation is that we have a single polygon which has been > >> >> >>> >> created > >> >> >>> >> by buffering all of the major roads in the UK. Projection is > >> >> >>> >> OSGB36 > >> >> >>> >> (27700). Obviously it's quite a big polygon. > >> >> >>> >> > >> >> >>> >> --> SELECT st_area(geom) FROM roadbufferunion; > >> >> >>> >> st_area > >> >> >>> >> ------------------ > >> >> >>> >> 77228753220.8271 > >> >> >>> >> > >> >> >>> >> What we now want to do is create a regular grid of 20 metre x > 20 > >> >> >>> >> metre > >> >> >>> >> points instead the polygon area. So we wrote this function > >> >> >>> >> (based > >> >> >>> >> on > >> >> >>> >> some googling, apologies for not being able to recall the > exact > >> >> >>> >> person > >> >> >>> >> who originally wrote it): > >> >> >>> >> > >> >> >>> >> CREATE OR REPLACE FUNCTION makegrid(geometry, integer, > integer) > >> >> >>> >> RETURNS geometry AS > >> >> >>> >> 'SELECT ST_Collect(st_setsrid(ST_POINT(x,y),$3)) FROM > >> >> >>> >> generate_series(53320::int, 667380::int,$2) as x > >> >> >>> >> ,generate_series(7780::int, 1226580::int,$2) as y > >> >> >>> >> where st_intersects($1,st_setsrid(ST_POINT(x,y),$3))' > >> >> >>> >> LANGUAGE sql > >> >> >>> >> > >> >> >>> >> and we then run this by doing the following: > >> >> >>> >> > >> >> >>> >> SELECT st_x((ST_Dump(makegrid(geom, 20, 27700))).geom) as x, > >> >> >>> >> st_y((ST_Dump(makegrid(geom, 20, 27700))).geom) as y INTO > >> >> >>> >> grid_points > >> >> >>> >> from roadbufferunion; > >> >> >>> >> > >> >> >>> >> However after over 2 days of the query running on a pretty > >> >> >>> >> powerful > >> >> >>> >> linux cluster, we still have no result. I'm not sure if it is > >> >> >>> >> actually running or not to be honest. > >> >> >>> >> > >> >> >>> >> Does the query look right? > >> >> >>> >> Any ideas how we can make it run quicker? > >> >> >>> >> > >> >> >>> >> Thanks > >> >> >>> >> > >> >> >>> >> James > >> >> >>> >> _______________________________________________ > >> >> >>> >> postgis-users mailing list > >> >> >>> >> [email protected] > >> >> >>> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> >> >>> > > >> >> >>> > > >> >> >>> > > >> >> >>> > _______________________________________________ > >> >> >>> > postgis-users mailing list > >> >> >>> > [email protected] > >> >> >>> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> >> >>> _______________________________________________ > >> >> >>> postgis-users mailing list > >> >> >>> [email protected] > >> >> >>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> >> >> > >> >> >> > >> >> >> > >> >> >> _______________________________________________ > >> >> >> postgis-users mailing list > >> >> >> [email protected] > >> >> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> >> _______________________________________________ > >> >> postgis-users mailing list > >> >> [email protected] > >> >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> > > >> > > >> > > >> > _______________________________________________ > >> > postgis-users mailing list > >> > [email protected] > >> > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > >> _______________________________________________ > >> postgis-users mailing list > >> [email protected] > >> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > > > > > > > > _______________________________________________ > > postgis-users mailing list > > [email protected] > > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users > _______________________________________________ > postgis-users mailing list > [email protected] > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users >
_______________________________________________ postgis-users mailing list [email protected] http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
