Re: Avro Map Reduce Question: GenericRecord, renaming reduce output

Shirahatti, Nikhil Fri, 08 Jun 2012 13:36:20 -0700

The reason is: when I try to read the file using a GenericDatumReader, I get
the error "not a data file".



Code snippet:
--------------
DatumReader<GenericData.Record> reader =
        new GenericDatumReader<GenericData.Record>(AVRO_SCHEMA);

String MUXDEMUX_FILE = outpath.concat("part-r-00000");
InputStream in = new BufferedInputStream(new FileInputStream(MUXDEMUX_FILE));
DataFileStream<GenericData.Record> records =
        new DataFileStream<GenericData.Record>(in, reader);
for (GenericData.Record r : records)
{
        System.out.println(r.toString());
}
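
One way to narrow this down, as a rough sketch only: an Avro data file is expected to begin with the four bytes 'O', 'b', 'j', 1, so any bytes written ahead of "Obj" (such as the 0<tab> prefix mentioned in the quoted message) would likely cause the reader to reject the file. The check below uses only the JDK; the class and method names are hypothetical and are not part of the Avro API.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class AvroMagicCheck {
    // Avro object container files begin with the bytes 'O', 'b', 'j', 1.
    static final byte[] MAGIC = { 'O', 'b', 'j', 1 };

    // Returns true if the buffer starts with the Avro magic bytes.
    static boolean hasAvroMagic(byte[] header) {
        if (header == null || header.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the first four bytes of the file and compares them to the magic.
    static boolean fileHasAvroMagic(String path) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            byte[] header = new byte[MAGIC.length];
            int off = 0;
            while (off < header.length) {
                int n = in.read(header, off, header.length - off);
                if (n < 0) {
                    return false; // file is shorter than the magic
                }
                off += n;
            }
            return hasAvroMagic(header);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to check, e.g. the part-r-00000 file.
        System.out.println(fileHasAvroMagic(args[0]));
    }
}
```

If this prints false for part-r-00000, the file probably has extra bytes in front of the Avro header, which would point at how the job writes its output rather than at the reading code.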



Nikhil

On 6/8/12 12:17 PM, "Doug Cutting" <[email protected]> wrote:

>On Fri, Jun 8, 2012 at 11:49 AM, snikhil0 <[email protected]> wrote:
>> My expectation is that I can use the same input schema to read the
>> output file. But alas this is not working.
>> In the part-r-00000 I have a 0<tab>Obj<Avroschema>....datums...... Why
>> is this?
>
>That looks approximately like an Avro data file.  How is it not what you
>expect?
>
>> Also how can I rename the reduce output file to something other than
>> part-r-0000*?
>
>That's the standard name for Hadoop mapreduce output files.  You could
>override it in the OutputFormat, but most folks do not.  The name of
>the directory these are in is normally used to identify the result
>set.  The files within the directory are just fragments of that result
>set.
>
>Doug
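
As an alternative to overriding the name in the OutputFormat, a part file can simply be renamed after the job finishes. The sketch below assumes the output directory is on the local filesystem (as the FileInputStream in the snippet above suggests) and uses only the JDK; the class name, method names, and the example name "result.avro" are hypothetical. For output on HDFS, the Hadoop FileSystem API would be the place to look instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class RenamePartFile {
    // Computes the new path: same directory as the part file, new file name.
    static Path renamedTarget(Path partFile, String newName) {
        return partFile.resolveSibling(newName);
    }

    // Moves the part file to the new name, replacing any existing file there.
    static Path rename(Path partFile, String newName) throws IOException {
        return Files.move(partFile, renamedTarget(partFile, newName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a part file; args[1]: new file name, e.g. "result.avro".
        System.out.println(rename(Paths.get(args[0]), args[1]));
    }
}
```

This only makes sense when a job produces a single part file or when each fragment gets its own distinct name; otherwise the directory name remains the more natural identifier for the result set.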
