Hi All,

I've finished coding the new bounding box extraction in osmosis. It seems to work correctly, although I admit I haven't tested it thoroughly yet. If anybody wishes to take a look, or better yet improve the PostGIS usage, I'll welcome all feedback.
BACKGROUND

I've had a number of abortive attempts at implementing a bounding box algorithm that fixes the issues with the existing bounding box algorithm, which is why it's taken me so long. I initially wrote a file-based solution, but I reached limits on indexing that appeared beyond my ability to fix in a reasonable amount of time. I then moved on to a Berkeley DB (Java Edition) implementation, but it didn't seem as scalable as I'd hoped either, and again I was running into limitations in how to implement indexing. In the end I gave up trying to provide a simple 100% command-line driven approach and created a PostGIS solution.

POSTGIS SOLUTION

I've created a relatively simple schema that is similar to the production MySQL schema but with a few critical differences:

* It separates node tags into their own table.
* It represents node locations as a geometry POINT type.
* It adds a geometry BBOX column to the way table for indexing purposes and to allow "correct" bounding box extraction, including ways with no nodes inside the bounding box.

The schema script is available here:
http://svn.openstreetmap.org/applications/utils/osmosis/script/pgsql_simple_schema.sql

A population script (associated with the --write-pgsql-simple-dump task) is available here:
http://svn.openstreetmap.org/applications/utils/osmosis/script/pgsql_simple_load.sql
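To give a rough idea of the shape of this schema, here is a simplified sketch. Treat it as illustrative only: the linked pgsql_simple_schema.sql script is the authoritative definition, and the table/column names here are a cut-down approximation rather than the real DDL (SRID -1 is used to match the query example further below).

-- Simplified illustration only; see pgsql_simple_schema.sql for the real schema.
CREATE TABLE nodes (
  id bigint NOT NULL,
  PRIMARY KEY (id)
);
-- Node locations are stored as a PostGIS POINT geometry.
SELECT AddGeometryColumn('nodes', 'geom', -1, 'POINT', 2);

-- Node tags are separated into their own table.
CREATE TABLE node_tags (
  node_id bigint NOT NULL,
  k text NOT NULL,
  v text NOT NULL
);

CREATE TABLE ways (
  id bigint NOT NULL,
  PRIMARY KEY (id)
);
-- Each way carries its own bounding box geometry so that ways with no
-- nodes inside the requested box can still be found via an index.
SELECT AddGeometryColumn('ways', 'bbox', -1, 'GEOMETRY', 2);

CREATE INDEX idx_nodes_geom ON nodes USING gist (geom);
CREATE INDEX idx_ways_bbox ON ways USING gist (bbox);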
There are several osmosis tasks related to this bounding box functionality:

* --truncate-pgsql-simple - This task deletes all existing data from the database.
* --write-pgsql-simple - This task writes data into a database. This is the simplest method for importing a planet or a subset thereof. (Note that a full planet may take a very long time, if it completes at all; I haven't tested it yet.) A task such as --read-xml should be fed into this task.
* --write-pgsql-simple-dump - This task writes data into files that can be loaded using PostgreSQL COPY commands. The script mentioned above must be used to import these files because some additional steps are required. A task such as --read-xml should be fed into this task. This is faster than the --write-pgsql-simple task but less convenient.
* --read-pgsql-simple - This task exposes the database to downstream tasks that accept a new type of input called a "dataset". It doesn't actually read data; it just exposes the database via a special interface.
* --dataset-dump - This task reads the entire contents of a "dataset" created by a "dataset" reading task (ie. --read-pgsql-simple) and feeds it to the next task in the pipeline. It should be linked to something like --write-xml.
* --dataset-bounding-box - This task extracts all data within a bounding box from a "dataset" created by a "dataset" reading task (ie. --read-pgsql-simple) and feeds it to the next task in the pipeline. It should be linked to something like --write-xml.

I can't seem to reach the wiki at the moment to double-check syntax, but some sample command lines are provided below.

To import into a database:
osmosis --rx mydatafile.osm --write-pgsql-simple host=localhost database=osmds user=osm password=xxxx

To extract a bounding box:
osmosis --read-pgsql-simple host=localhost database=osmds user=osm password=xxxx --dataset-bounding-box left=-1 right=1 bottom=-1 top=1 --write-xml mybbox.osm

BOUNDING BOX IMPLEMENTATION

The bounding box implementation is in the com.bretth.osmosis.core.pdb.v0_5.impl.PostgreSqlDatasetReader class, in the iterateBoundingBox method:
http://svn.openstreetmap.org/applications/utils/osmosis/src/com/bretth/osmosis/core/pdb/v0_5/impl/PostgreSqlDatasetReader.java

Although it's Java, the SQL queries should be understandable by anybody with PostGIS knowledge. Note that "?" in the queries is replaced by a bounding box polygon using JDBC bind variables. If running the queries manually, replace the "?" with something like:

GeomFromText('POLYGON((144.93912192855174 -37.82981987499741, 144.93912192855174 -37.79310006709244, 144.98188026000003 -37.79310006709244, 144.98188026000003 -37.82981987499741, 144.93912192855174 -37.82981987499741))', -1)

It uses temporary tables, which may or may not be the best approach.
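To give a feel for how that bounding box polygon gets used (this is not the exact SQL from iterateBoundingBox, which also works through temporary tables, and the geom/bbox column names follow the simplified sketch earlier rather than the real schema), a manual query against the bounding box from the extraction example above (left=-1, right=1, bottom=-1, top=1) would look something like:

-- Nodes whose location falls inside the requested bounding box.
SELECT id
FROM nodes
WHERE geom && GeomFromText('POLYGON((-1 -1, -1 1, 1 1, 1 -1, -1 -1))', -1);

-- Ways whose stored bounding box overlaps the requested box; comparing
-- against the way's own bbox column is what picks up ways that have no
-- nodes inside the box.
SELECT id
FROM ways
WHERE bbox && GeomFromText('POLYGON((-1 -1, -1 1, 1 1, 1 -1, -1 -1))', -1);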
REMAINING STEPS

If this is to become feasible for everyday use, I need to be able to apply changesets to an existing database. This will allow a database to be kept up to date with the latest production data (with a maximum lag of 10 minutes or so if 1-minute diffs are used) without requiring full imports all the time. I'm not sure when I'll get to this; it could be a while. It's on my list, but if somebody else wishes to do it first I'll be very happy.

Anyway, as I mentioned earlier, any feedback would be appreciated.

Cheers,
Brett
