This isn't really a Hadoop issue: gunzip refuses to decompress files whose names don't end in a suffix it recognizes, such as .gz. Rename the file so that it has the suffix .gz and try again, or use the -S option to specify an alternate suffix (in your case, .cgz).
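
Something along these lines should show both options. This is only a sketch: the file names a.cgz and b.cgz are made up for illustration, and it assumes GNU gzip/gunzip and mktemp are on the PATH.

```shell
# Work in a scratch directory so nothing existing is touched.
cd "$(mktemp -d)"

# Create two small gzip files with a non-standard suffix (hypothetical names).
printf 'sss' | gzip -c > a.cgz
cp a.cgz b.cgz

# Option 1: tell gunzip to accept .cgz as a suffix.
# This writes the decompressed data to "a" and removes a.cgz.
gunzip -S .cgz a.cgz

# Option 2: rename to the standard .gz suffix, then decompress as usual.
# This writes the decompressed data to "b" and removes b.gz.
mv b.cgz b.gz
gunzip b.gz
```

Either way the data itself is untouched; only the file name changes what gunzip is willing to do.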

On Tue, Jun 1, 2010 at 10:28 AM, Arv Mistry <a...@kindsight.net> wrote:
> Hi,
>
> I have a java process that writes compressed data to the HDFS. The way I
> am doing that is wrapping the FSDataOutputStream with GZIPOutputStream
> and calling the write() method i.e. something like
>
> FSDataOutputStream out = fs.create(file);
> gzip = new GZIPOutputStream(out);
> gzip.write("sss".getBytes("UTF8"));
>
> The file seems to get written ok.
>
> However, when I get the file out of HDFS and try to unzip it, it
> complains;
>
> gunzip: cs_1_20100601_120000_1275396891183.cgz: unknown suffix -- ignored
>
> When I do 'file' it is recognized as 'gzip compressed data, from FAT
> filesystem (MS-DOS, OS/2, NT)'
>
> Any ideas? Appreciate any help.
>
> Cheers Arv
>

--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com