Oops. Replied to the wrong email. Well, I should add something useful to the conversation now.
I think LZO has all the right features. However, it does not have great support in Pig, if that is what you are using.

It is good to have something splittable. LZO - check

Compress intermediate files...this is a no brainer.

Stick with it...it is complicated (a bit) to install

Cheers
J

On 2010-06-24, at 8:45 PM, James Seigel wrote:

> Cool. Maybe we should start a page.
>
> J
> On 2010-06-24, at 8:16 PM, Harsh J wrote:
>
>> On Fri, Jun 25, 2010 at 2:42 AM, Raymond Jennings III
>> <raymondj...@yahoo.com> wrote:
>>> Oh, maybe that's what I meant :-) I recall reading something on this mail
>>> group that "the compression" is not included with the hadoop binary and
>>> that you have to get and install it separately due to license
>>> incompatibilities. Looking at the config xml files it's not clear what I
>>> need to do. Thanks.
>>>
>> LZO Compression is the one you probably read about. Otherwise
>> available CompressionCodecs are BZip2 and GZip, and you should be able
>> to use those files just fine.
>>
>> Something like FileOutputFormat.setCompressOutput(conf, true);
>>
>> (Also look at mapred.output.compress configuration var for
>> map-output-compression)
>>>
>>>
>>> ----- Original Message ----
>>> From: Eric Sammer <esam...@cloudera.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Thu, June 24, 2010 5:09:33 PM
>>> Subject: Re: Newbie to HDFS compression
>>>
>>> There is no file system level compression in HDFS. You can store
>>> compressed files in HDFS, however.
>>>
>>> On Thu, Jun 24, 2010 at 11:26 AM, Raymond Jennings III
>>> <raymondj...@yahoo.com> wrote:
>>>> Are there instructions on how to enable (which type?) of compression on
>>>> hdfs? Does this have to be done during installation or can it be added to
>>>> a running cluster?
>>>>
>>>> Thanks,
>>>> Ray
>>>>
>>>
>>> --
>>> Eric Sammer
>>> twitter: esammer
>>> data: www.cloudera.com
>>>
>>
>> --
>> Harsh J
>> www.harshj.com
>
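
P.S. Since the config question came up in the thread: here is a rough sketch of the settings I had in mind. As far as I know, in the 0.20-era "mapred" API mapred.output.compress covers the final job output, while the intermediate (map output) side is mapred.compress.map.output. The class name CompressionFlags and its two helpers are made up for illustration, and it uses a plain Map so it runs without Hadoop on the classpath. I picked GzipCodec because it ships with Hadoop. Check the property names against your own version before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompressionFlags {
    // Property names as I remember them from the 0.20-era "mapred" API (assumption:
    // your cluster runs a version that still uses these names).
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Compress the intermediate map output.
        conf.put("mapred.compress.map.output", "true");
        conf.put("mapred.map.output.compression.codec",
                 "org.apache.hadoop.io.compress.GzipCodec");
        // Compress the final job output.
        conf.put("mapred.output.compress", "true");
        conf.put("mapred.output.compression.codec",
                 "org.apache.hadoop.io.compress.GzipCodec");
        return conf;
    }

    // Render the settings as "-D property=value" arguments, the form accepted by
    // jobs that parse their command line with GenericOptionsParser / ToolRunner.
    static String asDashD(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("-D ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asDashD(settings()));
    }
}
```

In a real job you would set the same keys on the JobConf (or in the site xml) instead of printing them. None of this needs to happen at install time, since these are per-job settings.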