On Sat, May 07, 2011 at 08:25:49PM +0200, Christian Vetter wrote:
> On Sat, May 7, 2011 at 5:46 AM, Scott Crosby <[email protected]> wrote:
> > On Fri, May 6, 2011 at 9:24 PM, Christian Vetter <[email protected]>
> > wrote:
> >> With regard to LZMA: I have some C++ code lying around to compress /
> >> decompress LZMA... I can test how much it would affect file size /
> >> decoding speed.
> >
> > Cool. You don't need a full-fledged PBF reader&writer to test it. Just
> > enough to parse out blobs and write blobs.
>
> I quickly hacked it into MoNav's importer and tested it on the extract
> of Germany. I used maximum compression ( dictionary size == blob size ):
>
> size of zlib blobs: 849MB
> size of lzma blobs: 762MB
> time spent decoding zlib blobs: 6.986 s
> time spent decoding lzma blobs: 49.078 s
>
> We can reduce the size a bit by using lzma ( ~10% ) and adding it
> isn't much work ( about 10 lines of code for encoding / decoding ).
> However, it doesn't seem worth it, considering that it makes parsing
> slower. Increasing the block size would most likely not increase the
> compression: I tested compressing all uncompressed blobs at once using
> a 64MB dictionary and the size only decreased to 728MB.
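[For anyone wanting to reproduce the shape of this comparison without touching
MoNav's importer: here is a minimal sketch using Python's stdlib zlib and lzma
modules on a synthetic redundant payload. The payload and preset choices are
illustrative assumptions, not the actual PBF blobs or settings from the test
above.]

```python
# Sketch only (not MoNav's code): round-trip a blob through zlib and LZMA
# and compare compressed sizes and decode times.
import lzma
import time
import zlib

# Synthetic, highly redundant payload standing in for an uncompressed PBF blob.
blob = b"node lat=48.1 lon=11.5 tag=highway\n" * 20000

# zlib at maximum compression level, as PBF writers typically use.
z = zlib.compress(blob, 9)

# LZMA at maximum preset; at this preset the dictionary is far larger than
# the blob, loosely analogous to "dictionary size == blob size" above.
x = lzma.compress(blob, preset=9 | lzma.PRESET_EXTREME)

print(f"raw:  {len(blob)} bytes")
print(f"zlib: {len(z)} bytes")
print(f"lzma: {len(x)} bytes")

t0 = time.perf_counter()
assert zlib.decompress(z) == blob
t1 = time.perf_counter()
assert lzma.decompress(x) == blob
t2 = time.perf_counter()
print(f"zlib decode: {t1 - t0:.4f} s")
print(f"lzma decode: {t2 - t1:.4f} s")
```

[On real OSM data the numbers quoted above suggest the trade-off: roughly 10%
smaller blobs for roughly 7x slower decoding.]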
For me that makes the decision easy then: 10% space savings is not worth
the extra time needed for decompression. The 10% space savings are a) not
a big issue at current disk sizes and b) will be eaten up by a few months
of OSM growth. Time savings, on the other hand, are my biggest issue in
most applications with OSM, because I want to work with data that's as
current as possible.

That being said, I would not object to adding an lzma option if others
have different priorities.

Jochen
--
Jochen Topf  [email protected]  http://www.remote.org/jochen/  +49-721-388298

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

