Hi,

The Hadoop Definitive Guide states that "if your input files are compressed, they will be automatically decompressed as they are read by MapReduce, using the filename extension to determine the codec to use" (in the section titled "Using Compression in MapReduce"). I'm trying to run a MapReduce job with some gzipped files as input, but the automatic decompression isn't happening. Does support for this have to be built into the input format? I'm using a custom one that extends FileInputFormat (simplified sketch below). Is there an additional configuration option that needs to be set? I'd like to avoid doing the decompression inside my map function.
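In case it helps, here is a simplified sketch of the kind of input format I have. The class names and the whole-file reading are illustrative, not my actual code, but the structure is the same: the record reader opens the file with fs.open() directly, so nothing in it looks at the filename extension or wraps the stream in a codec. I'm wondering if that's the problem.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Reads each input file as a single record (new-API FileInputFormat).
    public class WholeFileInputFormat
            extends FileInputFormat<NullWritable, BytesWritable> {

        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            return false; // one split per file
        }

        @Override
        public RecordReader<NullWritable, BytesWritable> createRecordReader(
                InputSplit split, TaskAttemptContext context) {
            return new WholeFileRecordReader();
        }

        static class WholeFileRecordReader
                extends RecordReader<NullWritable, BytesWritable> {
            private FileSplit split;
            private Configuration conf;
            private final BytesWritable value = new BytesWritable();
            private boolean processed = false;

            @Override
            public void initialize(InputSplit genericSplit,
                    TaskAttemptContext context) {
                this.split = (FileSplit) genericSplit;
                this.conf = context.getConfiguration();
            }

            @Override
            public boolean nextKeyValue() throws IOException {
                if (processed) {
                    return false;
                }
                Path file = split.getPath();
                FileSystem fs = file.getFileSystem(conf);
                byte[] contents = new byte[(int) split.getLength()];
                FSDataInputStream in = null;
                try {
                    // This opens the raw byte stream. Nothing here checks
                    // the filename extension or wraps the stream in a
                    // decompressor, so I suspect the gzip data reaches my
                    // mapper as-is.
                    in = fs.open(file);
                    IOUtils.readFully(in, contents, 0, contents.length);
                    value.set(contents, 0, contents.length);
                } finally {
                    IOUtils.closeStream(in);
                }
                processed = true;
                return true;
            }

            @Override
            public NullWritable getCurrentKey() {
                return NullWritable.get();
            }

            @Override
            public BytesWritable getCurrentValue() {
                return value;
            }

            @Override
            public float getProgress() {
                return processed ? 1.0f : 0.0f;
            }

            @Override
            public void close() {
                // nothing to clean up; the stream is closed in nextKeyValue()
            }
        }
    }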
I'm using the new API and the CDH3b2 distro. Thanks.