Hi,

libhdfs does not support storing files compressed. However, you could write a patch for it using the GZIPOutputStream class.
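To illustrate the idea only: this is a rough sketch of wrapping an output stream in java.util.zip.GZIPOutputStream, not actual libhdfs code. The class name GzipStreamSketch and both helper methods are made up for this example, and a ByteArrayOutputStream stands in for the HDFS output stream that a real patch would have to wrap on the Java side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; not part of Hadoop or libhdfs.
public class GzipStreamSketch {

    // Write `data` to `sink` in gzip format by wrapping the sink.
    // Closing the GZIPOutputStream writes the gzip trailer and closes the sink.
    public static void writeGzipped(OutputStream sink, byte[] data) throws IOException {
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(data);
        }
    }

    // Read and decompress everything from a gzip-formatted source stream.
    public static byte[] readGzipped(InputStream source) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(source);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a repetitive payload so the compression is visible.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("record ").append(i % 10).append('\n');
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Assumption: an in-memory buffer stands in for the HDFS file stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeGzipped(sink, original);
        byte[] compressed = sink.toByteArray();

        byte[] restored = readGzipped(new ByteArrayInputStream(compressed));
        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

Note that this compresses the whole stream as one gzip member. A gzip stream cannot be split at arbitrary offsets, so this is not the same as the per-block compression asked about below.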
weimin zhu

> -----Original Message-----
> From: Leon Mergen [mailto:[email protected]]
> Sent: Monday, July 19, 2010 9:57 PM
> To: [email protected]
> Subject: libhdfs / gzip support
>
> Hello,
>
> We're using Hadoop in a C-oriented architecture ourselves, using libhdfs for
> storing files and Hadoop.Pipes for map/reduce jobs. Since the data we're
> storing benefits a lot from compression, we're currently investigating ways
> to do this.
>
> Ideally we would perform block-level compression: each separate 64MB block
> of data would be compressed. Hadoop.Pipes seems to provide a way to change
> the InputReader and OutputReader to enable the GzipCodec, however, I did not
> find a good way to tell libhdfs to store files compressed.
>
> Anyone has any experience with this, and/or ideas how to best approach this
> problem?
>
> We're using Hadoop 0.20.2
>
> Regards,
>
> Leon Mergen
