Hi,

Brett Henderson wrote:
I'll help incorporate this into the rest of Osmosis. There's a few things to work through though.

    * Is there a demand for the binary format in its current
      incantation?  I'm not keen to incorporate it if nobody will use it.

I run a nightly job at Geofabrik which currently operates on plain (uncompressed) OSM files and goes roughly like this (every step uses Osmosis):

* apply daily diff to planet file
* split planet file into continents
* split each continent into countries
* split some countries into smaller units
* split some smaller units into even smaller units
* bzip2 the lot
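
As a rough illustration of the steps above, the job can be pictured as a plan of command lines generated from a region tree. This is only a sketch under assumptions, not the actual Geofabrik script: the Region class, the file names and the .poly paths are hypothetical, and running one Osmosis invocation per extract is a simplification. The task names used (--read-xml, --read-xml-change, --apply-change, --bounding-polygon, --write-xml) are standard Osmosis tasks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """One extract in the hierarchy (hypothetical helper type)."""
    name: str                      # e.g. "europe"
    polygon: str                   # path to a boundary .poly file (assumed)
    children: List["Region"] = field(default_factory=list)


def apply_diff_cmd(planet: str, diff: str, out: str) -> List[str]:
    """Command line that applies a daily diff to the planet file."""
    return ["osmosis",
            "--read-xml-change", f"file={diff}",
            "--read-xml", f"file={planet}",
            "--apply-change",
            "--write-xml", f"file={out}"]


def split_cmds(parent_file: str, regions: List[Region]) -> List[List[str]]:
    """Cut each region out of its parent file, then recurse into sub-regions.

    Depth-first order guarantees a parent extract is written before
    any of its children try to read it.
    """
    cmds = []
    for region in regions:
        out = f"{region.name}.osm"
        cmds.append(["osmosis",
                     "--read-xml", f"file={parent_file}",
                     "--bounding-polygon", f"file={region.polygon}",
                     "--write-xml", f"file={out}"])
        cmds.extend(split_cmds(out, region.children))
    return cmds


def nightly_plan(planet: str, diff: str, regions: List[Region]) -> List[List[str]]:
    """Full plan: apply diff, split recursively, then bzip2 every extract."""
    updated = "planet-updated.osm"  # assumed intermediate file name
    plan = [apply_diff_cmd(planet, diff, updated)]
    splits = split_cmds(updated, regions)
    plan.extend(splits)
    # The last argument of each split command is "file=<extract>".
    plan.extend(["bzip2", cmd[-1][len("file="):]] for cmd in splits)
    return plan


if __name__ == "__main__":
    tree = [Region("europe", "europe.poly",
                   [Region("germany", "germany.poly",
                           [Region("bavaria", "bavaria.poly")])])]
    for cmd in nightly_plan("planet.osm", "daily.osc", tree):
        print(" ".join(cmd))
```

Every split in this sketch reads and parses its whole parent file as XML again, which is where the disk and parsing time described below goes.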

The whole job runs from about 22:00 at night to about 09:00 in the morning, even though I'm ignoring the US.

A lot of time is spent just reading from, and writing to, disk and parsing XML. Running the whole thing with .gz files doesn't make a big difference - saves some disk i/o, adds some CPU time, doesn't change XML parsing overhead.

I wanted to test-drive the binary format as a replacement for raw .osm files in this setup, hoping that it would give me the i/o benefits of gzip compressed data but also slash XML parsing time. The numbers that have been posted seemed promising. I might even be able to skip the bzip2 step at the end if the binary format should become widely used, just placing binary files on the server; and use the saved time to re-introduce US extracts.

So here's one user who's definitely in for it. The reason I asked right now is that I was planning to have a go at it in the near future, and wanted to make sure that I'm not using an old version or going down a path that everyone else has already discarded. If there's "proper" integration with Osmosis around the corner, then I'd wait for that.

The way I understood it, Scott was re-using some code he placed inside the Osmosis tree from within his "splitter" code. Also, I could imagine that using this fancy Google library means you'll have some format description files which might be shared across all projects using that library, perhaps even including the C++ reader that jamesmikedupont has built, but I'm not sure.

I prefer SVN over git for the simple reason that I only have to "svn up" and everything is there, but I'm sure it is going to be a matter of minutes before someone from Iceland points out that the same convenience can be had with git if one knows what they're doing ;)

Bye
Frederik

--
Frederik Ramm  ##  eMail [email protected]  ##  N49°00'09" E008°23'33"

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev
