Hello guys,

I have Cloudera's CDH3U2 package installed on a 3-node cluster, and I've added 
the following to mapred-site.xml:
    <property>
        <name>mapred.compress.map.output</name>
        <value>true</value>
    </property>

    <property>
        <name>mapred.map.output.compression.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>

I've also added the following to my Pig job properties:
                <property>
                    <name>io.compression.codec.lzo.class</name>
                    <value>com.hadoop.compression.lzo.LzoCodec</value>
                </property>
                <property>
                    <name>pig.tmpfilecompression</name>
                    <value>true</value>
                </property>
                <property>
                    <name>pig.tmpfilecompression.codec</name>
                    <value>lzo</value>
                </property>
                <property>
                    <name>mapred.output.compress</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.output.compression.codec</name>
                    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
                </property>
                <property>
                    <name>mapred.output.compression.type</name>
                    <value>BLOCK</value>
                </property>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.map.output.compression.codec</name>
                    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
                </property>
                <property>
                    <name>mapreduce.map.output.compress</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapreduce.map.output.compress.codec</name>
                    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
                </property>

So I want Pig to compress its data with LZO and MapReduce to use Snappy. 
However, according to the TaskTracker details (Map Bytes Out), the data is not 
compressed at all, which hurts performance badly (I/O is at 100% most of the 
time). What am I doing wrong, and how do I fix it?


Thanks,
Marek M.
