Slightly off topic for this conversation, but I thought it worth mentioning: LZO is splittable (once the .lzo files have been indexed), which makes it a good fit for Hadoop-style workloads. Just something to remember when you do get some final results on performance.
Cheers,
James

On 2011-03-02, at 8:12 PM, Brian Bockelman wrote:

>
> I think some profiling is in order: claiming LZO decompresses at 1.0MB/s and
> is more than 3x faster at compression than decompression (especially when
> it's a well known asymmetric algorithm in favor of decompression speed) is
> somewhat unbelievable.
>
> I see that you use small files. Maybe whatever you do for LZO and
> Gzip/Hadoop has a large startup overhead?
>
> Again, sounds like you'll be spending an hour or so with a profiler.
>
> Brian
>
> On Mar 2, 2011, at 2:16 PM, Niels Basjes wrote:
>
>> Question: Are you 100% sure that nothing else was running on that
>> system during the tests?
>> No cron jobs, no "makewhatis" or "updatedb"?
>>
>> P.S. There is a permission issue with downloading one of the files.
>>
>> 2011/3/2 José Vinícius Pimenta Coletto <[email protected]>:
>>> Hi,
>>>
>>> I'm making a comparison between the following compression methods: gzip
>>> and lzo provided by Hadoop and gzip from package java.util.zip.
>>> The test consists of compression and decompression of approximately 92,000
>>> files with an average size of 2 KB. However, the decompression time of lzo
>>> is twice the decompression time of gzip provided by Hadoop, which does not
>>> seem right.
>>> The results obtained in the test are:
>>>
>>> Method               | Bytes     | Compression                                       | Decompression
>>>                      |           | Total time (with I/O)  Time     Speed             | Total time (with I/O)  Time      Speed
>>> Gzip (Hadoop)        | 200876304 | 121.454s               43.167s  4,653,424.079 B/s | 332.305s               111.806s  1,796,635.326 B/s
>>> Lzo                  | 200876304 | 120.564s               54.072s  3,714,914.621 B/s | 509.371s               184.906s  1,086,368.904 B/s
>>> Gzip (java.util.zip) | 200876304 | 148.014s               63.414s  3,167,647.371 B/s | 483.148s               4.528s    44,360,682.244 B/s
>>>
>>> You can see the code I'm using for the test here:
>>> http://www.linux.ime.usp.br/~jvcoletto/compression/
>>>
>>> Can anyone explain why I am getting these results?
>>> Thanks.
>>>
>>
>> --
>> Kind regards,
>>
>> Niels Basjes
>
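
For anyone who wants to sanity-check the figures quoted above before reaching for a profiler, here is a minimal Python sketch. It is not part of the original test code. It only redoes the arithmetic on the numbers from the table: throughput in MB/s, the compression-to-decompression speed ratio, and the average codec time per file. The assumptions are that the file count is exactly 92,000 (the post says "approximately") and that 1 MB means 10^6 bytes; the function and variable names are made up for illustration.

```python
# Figures copied from the results table quoted in the thread.
TOTAL_BYTES = 200_876_304
FILE_COUNT = 92_000  # assumption: the post only says "approximately 92,000"

# method -> (compression time, decompression time) in seconds, I/O excluded
TIMES = {
    "Gzip (Hadoop)": (43.167, 111.806),
    "Lzo": (54.072, 184.906),
    "Gzip (java.util.zip)": (63.414, 4.528),
}


def throughput_mb_s(total_bytes, seconds):
    """Throughput in MB/s, taking 1 MB as 10**6 bytes."""
    return total_bytes / seconds / 1e6


def per_file_ms(seconds, file_count):
    """Average time spent per file, in milliseconds."""
    return seconds * 1000.0 / file_count


def summarize(times=TIMES, total_bytes=TOTAL_BYTES, file_count=FILE_COUNT):
    """Return one dict per method with the derived figures."""
    rows = []
    for method, (comp_s, decomp_s) in times.items():
        rows.append({
            "method": method,
            "comp_mb_s": throughput_mb_s(total_bytes, comp_s),
            "decomp_mb_s": throughput_mb_s(total_bytes, decomp_s),
            # > 1 means compression ran faster than decompression
            "comp_vs_decomp": decomp_s / comp_s,
            "decomp_ms_per_file": per_file_ms(decomp_s, file_count),
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(
            f"{row['method']:<22}"
            f" comp {row['comp_mb_s']:6.2f} MB/s"
            f" decomp {row['decomp_mb_s']:6.2f} MB/s"
            f" ratio {row['comp_vs_decomp']:5.2f}"
            f" decomp/file {row['decomp_ms_per_file']:6.3f} ms"
        )
```

With these inputs, LZO decompression works out to roughly 2 ms per 2 KB file, against roughly 0.05 ms per file for java.util.zip. That is consistent with (though it does not prove) the start-up overhead theory above: a fixed per-file cost would dominate with files this small.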
