Hi!

Today I was pleasantly surprised to see that Osmosis finished processing today's planet.osm.gz in only 2 hours. Normally it takes approx. 6 hours on the same machine, with the same bbox and the same Osmosis version (0.24), but reading the slightly smaller bz2 version of the whole planet.
http://osm.baebler.net/data/log.txt

Previous benchmark of compression algorithms on planet files:

On Jan 1, 2008 3:47 PM, Bruce Cowan wrote:
> Testing on the latest UK planet (I'm not going to download the real
> planet), I got the following results (all the default settings):
>
> Original:
> 650625406 bytes
>
> gzip:
> 74858343 bytes, 37.198s to compress, 33.005s to decompress.
> 88.5% saving.
>
> bzip2:
> 57581322 bytes, 7m25.169s to compress, 45.983s to decompress.
> 91.1% saving, 23.0% over gzip.
>
> lzma:
> 38175614 bytes, 21m39.595s to compress, 36.187s to decompress.
> 94.1% saving, 49.0% over gzip, 33.7% over bzip2.

Bruce's test: UK planet decompression time with native Debian tools:
bz2 / gz ratio = 46s / 33s = 1.39 ... 39% "overtime"

My test: whole planet decompression (+ extracting the bbox and writing it to bz2!) time with Osmosis:
bz2 / gz ratio = 6h / 2h = 3 ... 200% "overtime"

At least. If we say that processing the decompressed osm data and writing out the extracted bbox (compressed to bz2 in both cases) took roughly 1h in both runs, and subtract that from the total times to get the decompression time alone, the ratio gets even worse:
bz2 / gz ratio = 5h / 1h = 5 ... 400% "overtime"

It seems that Apache's bz2 implementation, which is used in Osmosis, is very slow compared to the gz implementation. Could it simply be due to Java, or are other bz2 implementations in Java better?

Stefan
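
P.S. For anyone who wants to re-check the ratios above, here is a minimal Python sketch of the arithmetic. The helper name `overtime` and the constant `SHARED_HOURS` are made up for this illustration, and the 1h of shared processing time is only the rough guess from the text, not a measurement.

```python
def overtime(slow_seconds, fast_seconds):
    """Return (ratio, percent overtime) of the slower run relative to the faster one."""
    ratio = slow_seconds / fast_seconds
    return ratio, (ratio - 1.0) * 100.0


HOUR = 3600.0
# Assumption from the post: roughly 1h of each Osmosis run is spent on
# processing and writing the bbox extract, not on decompression.
SHARED_HOURS = 1.0

# (bz2 time, gz time) in seconds for each comparison
cases = {
    "native tools, UK planet": (45.983, 33.005),
    "Osmosis, whole run": (6 * HOUR, 2 * HOUR),
    "Osmosis, decompression only": (
        (6 - SHARED_HOURS) * HOUR,
        (2 - SHARED_HOURS) * HOUR,
    ),
}

for label, (bz2_s, gz_s) in cases.items():
    ratio, pct = overtime(bz2_s, gz_s)
    print(f"{label}: bz2/gz = {ratio:.2f} ({pct:.0f}% overtime)")
```

Changing `SHARED_HOURS` shows how sensitive the last ratio is to that guess: the more time is attributed to the shared processing, the worse the bz2 decompression looks.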

