Re: [OSM-dev] New OSM binary fileformat implementation.

Frederik Ramm Sun, 01 Aug 2010 14:26:33 -0700

Hi,

Brett Henderson wrote:

I'll help incorporate this into the rest of Osmosis. There's a few things to work through though.
    * Is there a demand for the binary format in its current
      incantation?  I'm not keen to incorporate it if nobody will use it.

I run a nightly job at Geofabrik which currently operates on plain (uncompressed) OSM files and goes roughly like this (every step uses Osmosis):


* apply daily diff to planet file
* split planet file into continents
* split each continent into countries
* split some countries into smaller units
* split some smaller units into even smaller units
* bzip2 the lot
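
As a rough illustration of the ordering in the list above (not the actual Geofabrik scripts), the job can be thought of as a tree of extracts in which every parent file has to exist before its children are cut from it. The region names, the EXTRACTS dictionary and both helper functions below are hypothetical and only there to make the sketch self-contained; no real Osmosis invocation is shown.

```python
from collections import deque

# Hypothetical extract hierarchy; the region names are illustrative only.
EXTRACTS = {
    "planet": {
        "europe": {
            "germany": {"bayern": {"oberbayern": {}}},
            "france": {},
        },
        "africa": {},
    }
}


def split_steps(tree):
    """Return (source, target) pairs in the order the splits must run.

    The tree is walked breadth-first, so a parent extract is always
    produced before any of its children are cut from it
    (planet -> continent -> country -> smaller units).
    """
    steps = []
    queue = deque((None, name, children) for name, children in tree.items())
    while queue:
        parent, name, children = queue.popleft()
        if parent is not None:
            steps.append((parent, name))
        for child, grandchildren in children.items():
            queue.append((name, child, grandchildren))
    return steps


def files_to_compress(tree):
    """List every extract name, since the last step compresses the lot."""
    names = []
    for name, children in tree.items():
        names.append(name)
        names.extend(files_to_compress(children))
    return names


if __name__ == "__main__":
    for source, target in split_steps(EXTRACTS):
        print(f"split {source} -> {target}")
    print("compress:", ", ".join(files_to_compress(EXTRACTS)))
```

Every edge in that tree means one full read of the parent file and one full write of the child, which is why the cost of reading and parsing the input is paid again at each level.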

The whole job takes from ~ 22h at night to ~ 9h in the morning, even though I'm ignoring the US.

A lot of time is spent just reading from, and writing to, disk and parsing XML. Running the whole thing with .gz files doesn't make a big difference: it saves some disk i/o, adds some CPU time, and doesn't change the XML parsing overhead.

I wanted to test-drive the binary format as a replacement for raw .osm files in this setup, hoping that it would give me the i/o benefits of gzip compressed data but also slash XML parsing time. The numbers that have been posted seemed promising. If the binary format becomes widely used, I might even be able to skip the bzip2 step at the end and just place binary files on the server, using the saved time to re-introduce US extracts.

So here's one user who's definitely in for it - the reason I asked right now was that I was planning to have a go at it in the near future, and wanted to make sure that I'm not using an old version or going down a path that everyone else already discarded. If there's "proper" integration with Osmosis around the corner then I'd wait for that.

The way I understood it, Scott was re-using some code he placed inside the Osmosis tree from within his "splitter" code. Also I could imagine that using this fancy Google library means you'll have some format description files which might be shared across all projects using that library, perhaps even including the C++ reader that jamesmikedupont has built, but I'm not sure.

I prefer SVN over git for the simple reason that I only have to "svn up" and everything is there but I'm sure it is going to be a matter of minutes before someone from Iceland points out that the same convenience can be had with git if one knows what they're doing ;)


Bye
Frederik

--
Frederik Ramm  ##  eMail [email protected]  ##  N49°00'09" E008°23'33"

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev
