On Sun, 1 Aug 2010, Frederik Ramm wrote:

A lot of time is spent just reading from, and writing to, disk and parsing XML. Running the whole thing with .gz files doesn't make a big difference - saves some disk i/o, adds some CPU time, doesn't change XML parsing overhead.

I'm sorry, but the parsing overhead of Java or libXML is basically a known slowness factor. MSXML, pre/post plane parsing, or even custom readers are not slow; they are limited only by the disk.

So the binary format, per se, is only faster because:
 - smaller file size = less I/O
 - encoding: no XML rewriting

Anything else is already available today using, for example, osmsucker.c, which obviously does not use an XML parser because all input is structured.
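
To illustrate what reading structured input without an XML parser can look like, here is a rough sketch in Python. It is not taken from osmsucker.c; it assumes every node sits on one line with double-quoted attributes, and the helper names attr and parse_node are made up for this example. It only handles numeric attributes, so entity unescaping is ignored.

```python
def attr(line, name):
    """Return the text of name="..." in line, or None if it is absent."""
    # The leading space keeps 'id' from matching inside 'uid'.
    key = ' ' + name + '="'
    start = line.find(key)
    if start < 0:
        return None
    start += len(key)
    end = line.find('"', start)
    return line[start:end]


def parse_node(line):
    """Parse one <node .../> line into (id, lat, lon, version).

    Assumption: the input is machine-generated, one element per line.
    Returns None for lines that are not node elements.
    """
    line = line.strip()
    if not line.startswith('<node '):
        return None
    return (
        int(attr(line, 'id')),
        float(attr(line, 'lat')),
        float(attr(line, 'lon')),
        int(attr(line, 'version')),
    )


if __name__ == '__main__':
    sample = '  <node id="1" uid="7" lat="52.5" lon="13.4" version="3"/>'
    print(parse_node(sample))
```

This only works because the input is predictable; arbitrary XML would still need a real parser.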


If the binary format can pack our doubles (lat/lon) and integers (version/ids) and make strings available in UTF-8, that saves CPU and I/O overhead, but it makes the data no longer human-readable. I can totally live with that, and I hope the API protocol also gets protocol buffers.
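
As a rough sketch of the packing idea, using Python's struct module rather than protocol buffers: the record layout below (64-bit id, 32-bit version, two doubles, then a length-prefixed UTF-8 string) is purely an assumption for illustration, not the proposed binary format, and the "user" field is a hypothetical example of a string.

```python
import struct

# Assumed layout, little-endian, no padding:
# int64 id, int32 version, double lat, double lon (28 bytes in total).
HEADER = struct.Struct('<qidd')
STRLEN = struct.Struct('<H')  # 16-bit length prefix for the UTF-8 string


def pack_node(node_id, version, lat, lon, user):
    """Pack one node into bytes; the string is stored as raw UTF-8."""
    name = user.encode('utf-8')
    return HEADER.pack(node_id, version, lat, lon) + STRLEN.pack(len(name)) + name


def unpack_node(buf):
    """Inverse of pack_node: no text parsing, just fixed-offset reads."""
    node_id, version, lat, lon = HEADER.unpack_from(buf, 0)
    (n,) = STRLEN.unpack_from(buf, HEADER.size)
    start = HEADER.size + STRLEN.size
    user = buf[start:start + n].decode('utf-8')
    return node_id, version, lat, lon, user


if __name__ == '__main__':
    record = pack_node(1, 3, 52.5, 13.4, 'Stefan')
    xml = '<node id="1" version="3" lat="52.5" lon="13.4" user="Stefan"/>'
    print(len(record), 'bytes packed vs', len(xml.encode('utf-8')), 'bytes as XML')
    print(unpack_node(record))
```

The doubles come back bit-for-bit identical and nothing has to be converted from text, which is where the CPU saving would come from; the price is that the bytes mean nothing in a text editor.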


Stefan

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev