Re: [postgis-users] Question on how to optimize linestring to points distance query

Carl Anderson Mon, 09 Jul 2007 18:06:47 -0700

Paul Ramsey wrote:

Perhaps you could try some experiments for us :)
Write a PL/PgSQL function that takes in a linestring and returns atupleset of two (or four or eight) bounding boxes, that taken togethercover the linestring. By recursively dividing the master box intosmaller boxes that still cover the line, you should be able to getbigger index selectivity.

Paul are you suggesting that a geometry can have more than one GISTnode? As I read them GIST compress and decompress as tied directly to asingle tuple having a geometry.

Maybe there could be a concept of a multi-bbox upon which some someindex selectivity could be built?

A geometry's bounding box is broken into a grid and each element totested if it truly contains a part of the original geometry.

Optimizations may exist.

That would be way more expensive to index but might massively improvequery speeds for complex objects. It would increase index storage sizea good deal as a downside.It would be a super elegant way to speed index queries for complexentities and have little impact on simple objects.It would increase index storage size a good deal as a small downside.(The entire index works best when it all fits into memory, that is aserver tuning issue.)

Can a single geometry datatype have alternate GIST index forms? Myguess is no. If so no half way rollout testing.

C.

If your tests work OK, this could be something of more generalapplicability.

I get 1 30x improvement in speed breaking Polygons having more than 500vertices into 1/100th chunks.Of course SQL environments are designed to wade through tons offeatures. So a 1 - > 100 expansion is no real problem.

I have attached the plpgsql function to do the splitting. I is tunedaround some GEOS issues I run into.

P

Stephen Woodbridge wrote:
Hi all,
I have a linestring and a large number of points in a table. I wantto find all the points with distance X of the linestring. This is apretty straight forward query like:
$LINE = "GeomFromText('LINESTRING(...)',4326)"
$X    = <number in meters>

select distance_sphere($LINE, the_geom) as distance, *
  from mypoints
 where
   expand($LINE, metersToDegrees($X)) && the_geom and
   distance_sphere($LINE, the_geom) < $X
 order by distance;

Discussion:
Since the && operator only compares bbox it does not really matter ifyou expand one or the other arguments, and you should expandwhichever has the fewest items for speed.
In the case of a linestring that might be diagonal across yourextents of the_geom, you are likely to select all the points. Itseems like there should be an optimization or another operator, thatwould allow you to expand the points and test their bbox against thelinestring. This could be done very quickly with the first stage ofan Cohen-Sutherland line clipping algorithm or the even fasterLiang-Barsky Line Clipping algorithm or the Nicholl-Lee-Nicholl (NLN)line clipping algorithm. These have a quick rejection algorithmbetween a bbox and a line segment that might be crossing it.
Doing something like this might be more expensive then the &&operator but might be much faster overall.
Questions:

Is anything like this being done?
Is there a way to compare a linestring to a large number of bboxquickly?
Is there a better way to do this than the sql above?

Thanks,
  -Steve
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

-- Function: public.chopintogrid(text, text, text, text, numeric, numeric)

-- DROP FUNCTION public.chopintogrid(text, text, text, text, numeric, numeric);

CREATE OR REPLACE FUNCTION public.chopintogrid(text, text, text, text, numeric, numeric)
  RETURNS text AS
$BODY$DECLARE
 src_tab	ALIAS for $1;
 dest_tab	ALIAS for $2;
 shp_fld	ALIAS for $3;
 tag_fld        ALIAS for $4;
 offset		ALIAS for $5;
 size		ALIAS for $6;
 gx		NUMERIC;
 gy		NUMERIC;
 sml_x          NUMERIC;
 sml_y          NUMERIC;
 lrg_x          NUMERIC;
 lrg_y          NUMERIC;
 step           INTEGER;
 gbox		GEOMETRY;
 tag_item 	text;
 rec            record;
 shp 		GEOMETRY;
 t_shp          GEOMETRY;
 wrk_shp	GEOMETRY;
 ssrid          INTEGER;

BEGIN

-- create dest tab

-- walk existing records
FOR rec in EXECUTE 'select '||shp_fld||' as r1,'||tag_fld||' as r2 
         from '||src_tab || '  '
          LOOP
  tag_item = rec.r2;
  ssrid = srid(rec.r1);
--  shp = buffer(rec.r1,0);
  shp = rec.r1;
  sml_x = xmin(shp);
  sml_y = ymin(shp);
  lrg_x = xmax(shp);
  lrg_y = ymax(shp);

  gx = (round((sml_x-offset)/size)*size)+offset - (size/2);

WHILE ( gx <= (lrg_x + size) ) LOOP

  gy = (round((sml_y-offset)/size)*size)+offset - (size/2);

  WHILE ( gy <= (lrg_y + size) ) LOOP
 -- six steps at one time
  t_shp = (make_polygon(sml_x, gy - size, lrg_x, gy + ( 6 * size ) + (size/1.9999999), srid(shp))); 
  IF ( t_shp && shp AND intersects(t_shp,shp) ) THEN
    wrk_shp = (filter(intersection(t_shp, shp),'MULTIPOLYGON'));
  ELSE
    wrk_shp = NULL;
  END IF;

  step = 0;
  WHILE ( wrk_shp is not null and step < 6 ) LOOP

    gbox = (expand(makepoint(gx,gy,srid(shp)),size/2));

    IF ( gbox && wrk_shp AND intersects(gbox, wrk_shp)) THEN
      gbox = intersection(gbox,wrk_shp);
-- RAISE WARNING ' wrkshp %', asewkt(wrk_shp);
-- RAISE WARNING ' isect %', asewkt(gbox);
      gbox = filter(gbox,'MULTIPOLYGON');
    ELSE 
      gbox = NULL;
    END IF;

    IF ( gbox is not NULL ) THEN
--      RAISE INFO '% %, %',gx,gy,tag_item;

      EXECUTE 'insert into '||dest_tab||' ('||shp_fld||','||tag_fld||') values (setsrid(geometry('||
        quote_literal(replace(asewkt(gbox),'SRID=0;',''))||'),'||ssrid||'),'||quote_literal(tag_item)||')';
--    ELSE
--      RAISE INFO '% % ',gx,gy;
    END IF;

    step = step + 1;
    gy = gy + size;
  END LOOP;
-- RAISE INFO '% of % and %', gy, wrk_shp, lrg_y;

  END LOOP;
  gx = gx + size;
END LOOP;

END LOOP;
return 'ok';

END;$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION public.chopintogrid(text, text, text, text, numeric, numeric) OWNER TO candrsn;

-- drop table political.places_grid_big;
truncate political.places_grid_big;
create table political.places_grid_big as select * from political.places limit 0
;
create index places__shape__ind on political.places using gist(shape);

select chopintogrid('political.places','political.places_grid_big','shape','gid'
,0,10000);

_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users

Re: [postgis-users] Question on how to optimize linestring to points distance query

Reply via email to