Hello, I also use Hadoop 0.20.0.
I didn't turn compression on or off, so it is at the default. I use a class that I created myself. Maybe it will be easier to understand if I describe how my whole application works; maybe I am doing something wrong.

I have one job whose mapper simply emits the key and value it receives. Its reducer emits the key it receives, and as value an object of my own class, which has two fields: a String x and an int count. This job writes its results to a file called "interm1".

The second job reads from the "interm1" file. In the second mapper I take the x and count fields out of the value (an object of my own class) and emit x as the key, and as value an object of the same class whose x field is set to the received key and whose count field is set to the count I extracted.

In the second reducer, the key and all the values are encoded as little endian and written to the second job's output file with that encoding. I opened that file in a hex viewer and saw what the problem was: since no encoding is specified, every editor shows spaces or little squares before and after each character of my strings.

If you need to see my code, or if I haven't explained this well, please let me know. Thank you very much for your answers.

Regards,
Adriana Sbircea

On Wed, Apr 11, 2012 at 8:02 PM, Koert Kuipers <ko...@tresata.com> wrote:
> i have a simple map-reduce job that i test with only 2 mappers, 2 reducers
> and very small input (10 lines of text).
>
> it runs fine without compression. but as soon as i turn on compression
> (mapred.compress.map.output=true), the output files (part-00000.snappy,
> etc.) are empty. zero records. using logging i can see that my reducer
> succesfully calls output.collect(key, value) yet they dont show up in the
> file. i tried both snappy and gzip. do i need to do some sort of flushing?
>
> i am on hadoop 0.20.2
>
>
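For reference, a minimal sketch of what a custom value class like the one described in the reply above could look like under the Hadoop 0.20 Writable API. The class name XCountWritable and its constructors/getters are assumptions made for illustration, not the poster's actual code; only the two fields (String x, int count) come from the description.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Hypothetical reconstruction of the value class described above:
    // a Writable holding a String field "x" and an int field "count".
    public class XCountWritable implements Writable {
        private String x;
        private int count;

        // Hadoop requires a no-argument constructor for deserialization.
        public XCountWritable() {}

        public XCountWritable(String x, int count) {
            this.x = x;
            this.count = count;
        }

        public String getX() { return x; }
        public int getCount() { return count; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(x);      // serialize the string field
            out.writeInt(count);  // serialize the count field
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            x = in.readUTF();     // must read fields in the same order they were written
            count = in.readInt();
        }
    }

In the reducers described above, such a class would be emitted as the output value, e.g. output.collect(key, new XCountWritable(someString, someCount)).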