Re: [OSM-dev] Is there a way to use simple schema without hstore

Andreas Kalsch Fri, 19 Nov 2010 00:38:35 -0800

Hi Brett,

thanks for your elaborate answer! Now I am up to date. Some ideas regarding my 
use case ...


Am 18.11.10 23:50, schrieb Brett Henderson:

Hi Andreas,
The change was made mostly for performance reasons. With a full planet imported into thedatabase, bounding box style queries are now approximately 10 times faster. This is due to acouple of reasons:
    * All data (with the exception of relations) is now clustered by 
geographical location.  This
      drastically improves performance where data is being processed for a 
limited area.
    * The nodes and ways tables are the only tables that have a geometry 
column, thus other data
      must be embedded in those tables in order to make use of clustering.

My concept is always to use _1_ table for all geometries and to create extractions when I need them.Because a geom column can store any type, so it is a more unifying concept.

I don't understand your comment regarding NoSQL. The main change is that now you will have todeal with a more complex hstore column type on the nodes/ways tables, but otherwise the same datastill exists and can still be manipulated with SQL statements. The data is less relational thatit was previously, but tag data is not terribly useful without access to parent entities sogrouping them together shouldn't result in loss of functionality.
You can still populate separate tags tables if you wish by running your own separate query to pullthe hstore column apart.

This is what I need to do sooner or later, when I will update. It's important for me to use aseparate table for tags, because I run a script that will correct the tags of relations (from outerways to relations), and I don't want to rewrite this and other scripts that depend on this schemaevery time the version changes.Running an extra script that fetches the hstore tags and puts them into a separate table will addthe time that PBF gave me ;(My main concern is that with the next big schema update I _have_ to patch the schema. On the longrun it is great to be conservative about such changes or *)

If you're applying diffs to the database you can enhance the osmosisUpdate() function (initiallyempty, but can be customised) to keep your separate tags tables up to date during each diffapplication. You will need to run the "pgsql_simple_schema_0.6_action.sql" script against thedatabase so that all actions during a diff are logged and can be used by your osmosisUpdatefunction to know which records need to be re-processed.

Is it possible to truncate the actions table for myself so that a separate script can access thechanges?

This is another important point. In the moment, I manually populate my own "current_features" tablesafter an update that are populated with all features, whose tstamp is >= the time of the lastupdate. A little overhead ... I see that this table exists in 0.36 as well, so I could use it, if Ican truncate it manually?

The older Osmosis 0.36 is still available so you don't have to upgrade. It remains compatiblewith 0.6 XML files. Finally, if there is enough demand for the older schema style the old taskscan be pulled back out of SVN and run alongside the new ones, but I'm not keen to do that withoutgood reason. I did consider trying to support both styles of table in the same tasks bydynamically detecting what tables are installed, but it increases the code complexity considerablyand I didn't think the effort was worthwhile.

*) With that, you would provide a downward compatible solution that I would 
appreciate a lot!

Is it necessary that Osmosis makes the schema checks? What about giving each schema a unique ID andthen let the user point Osmosis to this ID and let it fail, if the user has installed the wrong schema?

Finally, I didn't make the change without careful consideration. I do try to keep schemas stable,and when they do change I provide an upgrade script to allow migration between them. But theperformance gains achieved through use of hstore were too great to ignore. Retrieving heavilypopulated 1x1 degree areas from a database containing a full planet used to take approximately 1hour, but this is now down to well under 10 minutes.

On the long run, this is an argument ;) I am critical, because I still haven't thought through alldependant scripts that do something with tags. But there are many ...


Hope that helps,
Brett

On Thu, Nov 18, 2010 at 8:18 PM, Andreas Kalsch <[email protected]<mailto:[email protected]>> wrote:


    Is there a way to use simple schema in Osmosis without hstore? And why was 
this changed? A
    separate table for tags can more easily be indexed. I think it is not a 
good idea to use
    hstore because then we can drop SQL, use NoSQL for storing data and use 
PostGIS/Postgres for
    Geometry only.

    What do you think?
    Best,

    Andi

    _______________________________________________
    dev mailing list
    [email protected] <mailto:[email protected]>
    http://lists.openstreetmap.org/listinfo/dev

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

Re: [OSM-dev] Is there a way to use simple schema without hstore

Reply via email to