Hello,

I also use Hadoop 0.20.0.

I didn't turn compression on or off, so it is at the default. I use a
class that I created myself. Maybe it will be easier to understand if I
describe how my whole application works; maybe I am doing something
wrong. I have a job whose mapper simply emits the key and the value it
receives. The reducer emits the key it receives and, as the value, an
object of my own class, which has two fields: a String x and an int
count. This job writes its results to a file called "interm1".
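
For reference, here is a simplified sketch of that class (the name
MyPair is made up for this email; the real class has exactly these two
fields, and as a MapReduce value type it implements Hadoop's Writable):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Simplified sketch; the name MyPair is made up for this email.
    public class MyPair implements Writable {
        private String x;
        private int count;

        public void write(DataOutput out) throws IOException {
            out.writeUTF(x);     // modified UTF-8 with a big-endian length prefix
            out.writeInt(count); // DataOutput writes ints big-endian
        }

        public void readFields(DataInput in) throws IOException {
            x = in.readUTF();
            count = in.readInt();
        }

        public String getX() { return x; }
        public int getCount() { return count; }
    }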
"interm1" file and in the second mapper I obtain from the value (which is
an object of my own class) the x fields and the count fields. It sends as
key the x obtained and as value an object of my own class which has as
value for x field the key received and as count the value obtained for
count.  In the second reducer all the values and the key are encoded to
little endian and there are written in the second job output file with
little endian encoding. I opened that file in a Hex Viewer and there I saw
what was the problem, because there is no encoding specified and all the
editors showed spaces or little squares before and after every character of
my strings.
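
To show what I mean about the squares, here is a small standalone
snippet (not my job code), assuming my "little endian" string encoding
is UTF-16LE, which matches what I see in the hex viewer: every ASCII
character is followed by a 0x00 byte, and editors that don't detect the
encoding render those 0x00 bytes as spaces or squares:

    import java.io.UnsupportedEncodingException;

    public class LittleEndianDemo {
        public static void main(String[] args) throws UnsupportedEncodingException {
            // Encode a small string the way I believe my reducer does.
            byte[] bytes = "abc".getBytes("UTF-16LE");
            for (byte b : bytes) {
                System.out.printf("%02x ", b & 0xff);
            }
            // prints: 61 00 62 00 63 00
        }
    }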

If you need to see my code, or if I have not explained something well,
please let me know.

Thank you very much for your answers.

Regards,
Adriana Sbircea



On Wed, Apr 11, 2012 at 8:02 PM, Koert Kuipers <ko...@tresata.com> wrote:

> i have a simple map-reduce job that i test with only 2 mappers, 2 reducers
> and very small input (10 lines of text).
>
> it runs fine without compression. but as soon as i turn on compression
> (mapred.compress.map.output=true), the output files (part-00000.snappy,
> etc.) are empty. zero records. using logging i can see that my reducer
> successfully calls output.collect(key, value) yet they don't show up in the
> file. i tried both snappy and gzip. do i need to do some sort of flushing?
>
> i am on hadoop 0.20.2
