Hello Rui Hou,

If you look at the Writer constructor used here, the answer becomes clear. 
It takes a codec (a compression codec, to be specific) as an argument. If the 
codec is not null (it is null when compression is disabled), it is responsible 
for compressing the data by wrapping the actual output stream.

The codec variable itself is initialized earlier, during the MapOutputStream 
construction, depending on whether compression is enabled.

If you'd like to see how a codec works, you can read the class for the chosen 
algorithm in the common code. The DefaultCodec class is one example.
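
To make the wrapping idea concrete without pulling in Hadoop, here is a minimal 
sketch. Everything in it is an assumption for illustration: StreamCodec and 
SketchWriter are hypothetical stand-ins (not the real CompressionCodec or 
IFile.Writer), and java.util.zip's DeflaterOutputStream is used only as a 
convenient JDK compressor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class WrappingWriterSketch {

    /** Hypothetical stand-in for a compression codec: it only wraps a stream. */
    interface StreamCodec {
        OutputStream wrap(OutputStream raw) throws IOException;
    }

    /** Hypothetical stand-in for the record writer; not the real IFile.Writer. */
    static class SketchWriter {
        private final OutputStream out;

        SketchWriter(OutputStream raw, StreamCodec codec) throws IOException {
            // A null codec means compression is disabled: use the raw stream.
            this.out = (codec == null) ? raw : codec.wrap(raw);
        }

        void append(String key, String value) throws IOException {
            // Bytes written here pass through the compressing stream, if any.
            out.write((key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8));
        }

        void close() throws IOException {
            // Closing finishes the compressor and flushes what it still buffers.
            out.close();
        }
    }

    /** Writes n identical records through the writer and returns the sink bytes. */
    static byte[] writeRecords(StreamCodec codec, int n) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SketchWriter writer = new SketchWriter(sink, codec);
        for (int i = 0; i < n; i++) {
            writer.append("key", "value");
        }
        writer.close();
        return sink.toByteArray();
    }

    /** Decompresses deflate data so the round trip can be checked. */
    static byte[] inflate(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int read;
            while ((read = in.read(buf)) != -1) {
                result.write(buf, 0, read);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = writeRecords(null, 1000);
        byte[] packed = writeRecords(DeflaterOutputStream::new, 1000);
        System.out.println("uncompressed bytes: " + plain.length);
        System.out.println("compressed bytes:   " + packed.length);
        System.out.println("round trip ok:      " + Arrays.equals(plain, inflate(packed)));
    }
}
```

In this sketch the data is compressed as it streams through the wrapper during 
append(), and close() only finishes the stream and flushes the remainder. 
Whether the real Writer splits the work the same way is best confirmed by 
reading its source.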

I hope this helps! :)

P.S. Please do not cross-post to multiple lists while seeking an answer. And 
for future MapReduce development questions such as this, please direct them to 
mapreduce-...@hadoop.apache.org

On 05-Jul-2011, at 7:50 PM, 侯锐 wrote:

> Hello guys, 
> We would like to know where compression takes place for MapOutputStream in 
> the Map phase.
> 
> We guess there are two possible places in sortAndSpill() at MapTask.java:
> Writer.append() or Writer.close()
> Which one performs the compression? 
> We would appreciate your response very much.
> 
> See lines marked by ****** as below (from sortAndSpill() at MapTask.java).
> 
> for (int i = 0; i < partitions; ++i) {
>          IFile.Writer<K, V> writer = null;
>          try {
>            writer = new Writer<K, V>(job, out, keyClass, valClass, codec,
>                                      spilledRecordsCounter);
>            if (combinerRunner == null) {
>                 …
>                key.reset(kvbuffer, kvindices[kvoff + KEYSTART],
>                          (kvindices[kvoff + VALSTART] - 
>                           kvindices[kvoff + KEYSTART]));
>                /**************************************/
>                writer.append(key, value);   // The 1st possible place
>                ++spindex;
>              }
>            } else {
> …
>              }
>              …
> 
>            // close the writer
>            /**************************************/
>            writer.close();   // The 2nd possible place
> 
> --
> Rui Hou (侯锐)
> Institute of Technology, Chinese Academy of Sciences
> 
> 
> 
> 
