On Fri, Oct 17, 2008 at 03:25:59PM +0200, Freek wrote:
Generally you keep a bit of free space in all nodes to accommodate a number of insertions and deletions, and only split or merge nodes when they overflow or underflow so that you don't need to move around all kinds of elements all the time.But how do you do those splits, i.e. node inserts? Just moving everything behind the insertion point by one block would be way too expensive for large trees, so there's probably a better method.
Yup. I mmap() the whole database. For europe it's less than twice the size of my physical memory (5.3GB vs. 3.5GB), so most of it fits into memory. For whole planet imports it gets much slower (5.6h instead of 2.5h if it were scaling linearly). R-Tree generation scales fine (still fits into memory), 1D index sorting is O(2n) and way processing (including MBR calculation) is O(5n). That's probably due to the nodes being sorted by id in the database so spatial locality is reduced. Quicksort (used for sorting 1D indices) also doesn't use locality for the outer iterations, but isn't a bottleneck yet. Fitting as much as possible into physical RAM to avoid disk accesses was the primary design goal (the last design was severly I/O-constrained). There's still some room for improvement, but the performance gain for the router is already quite impressive (compared to the previous Python+PostgreSQL solution), though not entirely fair yet (only way type and in-builtup-area are used, oneway and maxspeed are ignored; only minimal postprocessing).I haven't done any research on that topic, though, since I don't need it yet (bulkimports are quite fast - ~25 minutes for the 865MB europe.osm.bz2).That's not bad, I suppose your index still fits in memory?
I think that in most cases the top part of the tree should fit in memory and only the leaves (containing pointers/addresses to the actual items) will stay on disk during operation.With the current node size of 8 entries, the overhead (#non-leafs/#leafs) is about 14%, with all R-Tree structures (leaf+non-leaf MBR, leaf pointers) totalling 128+32=160MB (space is allocated exponentially) for europe, so it fits into memory quite nicely (given "modern" desktop memory sizes, that is).
CU Sascha -- http://sascha.silbe.org/ http://www.infra-silbe.de/
signature.asc
Description: Digital signature
_______________________________________________ dev mailing list [email protected] http://lists.openstreetmap.org/listinfo/dev

