Re: [postgis-users] GROUP BY geometry dropping points

Martin Davis Fri, 13 Jun 2008 13:18:37 -0700

Since GROUP BY is using the BB, it seems like it would be independent of line orientation. Of course, it will also have the problem that a line BB might include other completely different lines...

One idea we used here was to GROUP BY the WKB text for the geometry. That works well for points, but of course will require same orientation for lines. (But the normalizing idea I suggested would take care of this).
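
A minimal sketch of the WKB grouping idea, assuming the table foo with columns gid and geom from Dan's original message quoted below. ST_AsBinary returns the geometry as bytea, so rows are compared byte-for-byte on the full double-precision coordinates rather than through the geometry equality operator. Keeping the lowest gid per group is an arbitrary choice made for illustration:

```sql
-- Keep one row per distinct geometry, comparing exact WKB bytes.
-- Assumes foo(gid, geom); ST_AsBinary drops the SRID, so this sketch
-- also assumes every row in foo shares the same SRID.
SELECT DISTINCT ON (ST_AsBinary(geom)) gid, geom
FROM foo
ORDER BY ST_AsBinary(geom), gid;
```

As noted above, two lines with the same vertices in opposite order produce different WKB and would not be merged by this query.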

If this data was in BC-Albers then 1.5 mm is probably below the threshold for the floats in the BB. So that is likely the problem.
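
To put a rough number on that threshold: a single-precision float has a 24-bit significand, so for values between 2^20 (1048576) and 2^21 the spacing between representable values is 2^-3 = 0.125 m. The check below is plain PostgreSQL and only illustrates the magnitude; the coordinate is made up (chosen to be in the range of a BC Albers easting), and the actual bounding-box code may round outward rather than to nearest:

```sql
-- Two hypothetical coordinates 1.5 mm apart, at roughly 1.2 million metres.
SELECT 1200000.0000::float4 = 1200000.0015::float4 AS equal_as_float4,  -- true
       1200000.0000::float8 = 1200000.0015::float8 AS equal_as_float8;  -- false
```

At this magnitude the two values become indistinguishable once stored as floats, while doubles still separate them.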


Martin

Dan Erikson wrote:

Thanks Martin, that is good to know. I was assuming that the GROUP BY operation was based on the actual geometry, not the bounding box. If it is using the bounding box, do you know if it considers direction on a line geometry?
Nearest point was about 1.5 mm away. As you are aware with our "clean" process, we're likely to have many points and line strings that are very close. Seems like this was the problem in the polygonize process as posted yesterday.
Dan Erikson BNRSc
Project Manager
-------------------------------------
Timberline Natural Resource Group
(250)-314-0875 ext 240
#201-175 4th Avenue  Kamloops  BC
www.timberline.ca
-------------------------------------



Martin Davis wrote:
One thing I do know is that GROUP BY (and probably DISTINCT) use only an approximate bounding box test when comparing geometries. The reason for this is that the equality operator used only checks the bounding box, and the BB is defined using floats, not doubles. We've had issues in the past where points which are very close together get partly ignored when using GROUP BY.
For the missing point, is there some other point very near it?
IMO this is not good behaviour to have in PostGIS. It's very useful to use GROUP BY and DISTINCT in these kinds of situations, but you need to have exact answers. I suggest that the equality operator be modified so that for small geometries (Points and 2-point linestrings at least - and perhaps boxes) it tests the exact vertex values.
Dan Erikson wrote:
I have a point dataset that I am looking to pull unique geometries from.
I have tried the following:

    * select geom from foo GROUP BY geom;
    * select DISTINCT geom from foo;
Both of these methods result in at least one point dropped that should not have been. The point dropped in error was not a duplicate point. It in fact was already unique in the table.
This method seems to work:

    * create temp table bar as select * from foo;
    * delete from bar a using bar b where st_equals(a.geom, b.geom)
      and a.gid < b.gid;
    * I know this is a round-about way, but it proves that GROUP BY or
      DISTINCT should have worked.  (I think)

Any ideas why group by is seemingly erroneous?
Dan Erikson BNRSc
Project Manager
-------------------------------------
Timberline Natural Resource Group
(250)-314-0875 ext 240
#201-175 4th Avenue  Kamloops  BC
www.timberline.ca
-------------------------------------
------------------------------------------------------------------------
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users


--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
