Hi, Matt Amos wrote: > as marcus has pointed out the performance of 1-d indexes for 2-d data > is often poor. for further reading i recommend "Foundations of > Multidimensional and Metric Data Structures" by H. Samet > (http://www.cs.umd.edu/~hjs/) which is pretty comprehensive.
We had that discussion in breadth and length in late 2006/early 2007, where Nick Hill initially came up with the tiling stuff, http://lists.openstreetmap.org/pipermail/dev/2006-September/002059.html and was then reminded of the quadtile concept, http://lists.openstreetmap.org/pipermail/dev/2006-September/002062.html and so on. Nick seemed to have done a number of tests comparing B-tree and R-tree performance (because at the time we were constantly told how PostGIS would solve our performance issues), and at the time he wrote: > I therefore contend: > > 1) The data types for postgis are uneconomical. > I contend that point data types using 1/4 of the storage can perform > adequately, with +/- 5mm global accuracy. > > 2) R-tree indexes, although theoretically being close to ideal for > the geo problem domain have problems with their implementation. > Lookups on R-tree appear to be much more fragmented than look-ups on > b-tree, resulting in lots of costly disk seeks, or requiring them to > be cached in RAM. (Both above issues are soluble). > > I also contend that > > 3) B-tree lookups on lat/lon are theoretically inefficient. Only one > of either lat/lon are used as an index range lookup. The second is > looked up through a brute force search. However, the look-up on the > first column is extremely efficient. In practice, the records > narrowed from the first column needing brute force search to narrow > the second column, are actually performed quickly, with few > additional disk seeks. I don't have an explanation for it's apparent > speed apart from the maturity of b-tree and widespread efforts to > counteract the shortcomings of b-tree with clever optimisations for > 2-column arrangements. Ideally, we need to dispose of that > requirement to brute force search without introducing unacceptable > overheads, and the brute force search does impose scalability > concerns. > > In summary, the postGIS system appears to have a lot going for it, > and feel there are opportunities being lost with OSM not sharing the > same data format with other free GIS initiatives. At the same time, > my tests using MySQL have shown that many of the theoretical > performance benefits of the postGIS system are just that - > theoretical, and genuinely look forward to being proven wrong on > this. I also think that the theory and practice of PostGIS can be > brought closer together with further development and refinement. If > OSM used postGIS, that could help development of postGIS. On the > other hand, if OSM used postgis, it may delay or prevent better > systems developed through OSM seeing the light of day. (This is from http://lists.openstreetmap.org/pipermail/talk/2007-January/010166.html). Bye Frederik _______________________________________________ dev mailing list [email protected] http://lists.openstreetmap.org/listinfo/dev

