That's very interesting. For us poor souls using streaming, would we be able to use it?
(Right now I'm looking at a 100+ GB gzipped file ...)

Miles

2009/3/3 Johan Oskarsson <[email protected]>:
> Hi,
>
> thought I'd pass on this blog post I just wrote about how we compress our
> raw log data in Hadoop using Lzo at Last.fm.
>
> The essence of the post is that we're able to make them splittable by
> indexing where each compressed chunk starts in the file, similar to the gzip
> input format being worked on.
> This actually gives us a performance boost in certain jobs that read a lot
> of data while saving us disk space at the same time.
>
> http://blog.oskarsson.nu/2009/03/hadoop-feat-lzo-save-disk-space-and.html
>
> /Johan
>

--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.
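
For anyone following the thread, the indexing idea in the quoted message (record where each compressed chunk starts, then let each split read only the chunks that begin inside its byte range) can be sketched roughly as below. This is only an illustration under stated assumptions, not Last.fm's code and not the LZO file format: zlib stands in for the codec, the length-prefixed chunk layout is made up for the example, and the names write_chunked and read_split are hypothetical.

```python
import bisect
import struct
import zlib


def write_chunked(lines, chunk_lines=3):
    """Compress lines in independent chunks.

    Returns (blob, index) where index lists the byte offset at which
    each chunk starts. The 4-byte length prefix is an assumed layout
    for this sketch, not a real container format.
    """
    blob = bytearray()
    index = []
    for i in range(0, len(lines), chunk_lines):
        payload = zlib.compress("".join(lines[i:i + chunk_lines]).encode("utf-8"))
        index.append(len(blob))  # remember where this chunk begins
        blob += struct.pack(">I", len(payload)) + payload
    return bytes(blob), index


def read_split(blob, index, start, end):
    """Decode every chunk whose start offset lies in [start, end)."""
    out = []
    first = bisect.bisect_left(index, start)  # first chunk at or after start
    for offset in index[first:]:
        if offset >= end:
            break
        (size,) = struct.unpack_from(">I", blob, offset)
        data = zlib.decompress(blob[offset + 4:offset + 4 + size])
        out.extend(data.decode("utf-8").splitlines(keepends=True))
    return out


# Two "splits" cut at an arbitrary byte position still cover every record
# exactly once, because each chunk belongs to the split containing its start.
lines = [f"record {n}\n" for n in range(10)]
blob, index = write_chunked(lines)
mid = len(blob) // 2
first_half = read_split(blob, index, 0, mid)
second_half = read_split(blob, index, mid, len(blob))
```

The point of the index is that a reader handed an arbitrary byte range does not have to decompress from the beginning of the file; a plain gzip stream has no such restart points, which is why a single large gzipped file normally ends up in one split.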
