On Fri, Oct 17, 2008 at 10:46:39PM +0200, Freek wrote:
Much simpler, as more complex index structures also use more memory (thus risking disk access penalty). Sorted lists, scanned with binary search (=> O(log n)).So what kind of indexes do you have, hash tables and balanced binary search trees (what about external memory)?Node/way/relation internal <-> external id: - int -> ext O(1) - ext -> int O(log n)
Anyway, when do you use such an index? I guess one would want to know the OSM id of objects returned by certain queries, but not specifically a translation from internal to external id (though of course you probably get that for free if you have an index on internal id).Usually the ID mappings are used on in- and output, respectively. The program core will work purely with internal IDs.
Just a bit of a technical thing (not very important for the design at this stage), but with "Peano" I suppose you mean the order that looks like a lot of nested Z's? Historically this is not a curve Peano came up with (though many people call it /a/ or even /the/ Peano curve).Good to know. I used the nomenclature from the already-mentioned paper [1] by Faloutsos and Roseman.
Actually, he proposed a different one that does not have jumps and hence has better locality properties. A disadvantage of that one may be that an implementation would be slightly more complex, but such a discussion can better be deferred.Using a different space-filling curve is definitely something on the unwritten part of the TODO list. I mainly used Peano because it was easy to implement. It's also quite fast, but only the bulk importer would have to invoke the function often, ordinary programs shouldn't be affected significantly.
Yes, you can. If you add an appropriate entry to the rules file prior to running the importer, that is. Excerpt from my current rules file:Node/way/relation types (using tag -> id mapping with wildcard supportduring import): - id -> types ~O(1) - type -> ids ~O(log n)Type? Can I, for example, retrieve all post boxes using this index?
{'tags': {'amenity': 'post_box'}, 'maxScaleDisplay': 10000, 'maxScaleName': 5000, 'icon': 'public/post_box'},
This file is preprocessed with a Python script, producing a file that's easy to parse in C++ (but much less readable). It's also used by my moving map viewer (with basic navigation support), that's why there are rendering styles in it, too.
Node/way/relation "interesting" bitmap (name/ref/is_in set and/or knownImportant design decision, but more for the data format itself: does it store all nodes/ways/relations, also the "uninteresting" ones?type): - id -> interesting O(1)
Yes, it does - otherwise this bitmap wouldn't be necessary.
This is an index for retrieving all objects within a specified query rectangle, right? (So basically what we've been talking about ;-)Way MBR / RTree: - id -> MBR O(1) - MBR -> ids ~O(log n)
Exactly. :)
Will the data format have all these kinds of information (tags, members, etc.) in different places (on disk), or will they all be in the same place?Everything is in a different file. That can reduce performance (due to increased number of seeks), but enables different applications (using different subsets of the data) to (efficiently) share the same database. This also accounts for using different subsets of the data in different stages of the program: ID on in/output, node location etc. in "core" operations.
Node/way/relation name/ref/is_in: - text files sorted by id (intended for namefinder - could be changed into "proper" database entries similar to the way tags are handled)I think name finding is a topic on its own.
At least partially, yes.
Also, a good look at the current online OSM name finder and Gosmore may be a good idea.Already had a good look on the current namefinder - I'm quite impressed by its performance, given that it's based on MySQL+PHP.
Didn't know that Gosmore has a geolocation feature, though. [1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.9043 CU Sascha -- http://sascha.silbe.org/ http://www.infra-silbe.de/
signature.asc
Description: Digital signature
_______________________________________________ dev mailing list [email protected] http://lists.openstreetmap.org/listinfo/dev

