The reason I ask: when I try to read the file using GenericDatumReader, I get
the error "not a data file".
Code snippet:
--------------
// AVRO_SCHEMA and outpath are defined elsewhere in the job;
// outpath must already end with a path separator.
DatumReader<GenericData.Record> reader =
    new GenericDatumReader<GenericData.Record>(AVRO_SCHEMA);
String MUXDEMUX_FILE = outpath.concat("part-r-00000");
InputStream in = new BufferedInputStream(new FileInputStream(MUXDEMUX_FILE));
DataFileStream<GenericData.Record> records =
    new DataFileStream<GenericData.Record>(in, reader);
for (GenericData.Record r : records) {
    System.out.println(r.toString());
}
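
An Avro data file has to begin with the four magic bytes 'O', 'b', 'j', 1, and DataFileStream rejects anything else. Since my file begins with 0<tab> before the "Obj", that prefix may well be what triggers the error, though I have not confirmed it. Below is a minimal sketch for checking the first bytes of the file without Avro on the classpath. AvroMagicCheck, hasAvroMagic and readHeader are hypothetical helper names, not part of the Avro API.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Avro.
class AvroMagicCheck {
    // Avro object container files start with 'O', 'b', 'j', followed by the byte 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // True when the first four bytes of header equal the Avro magic.
    static boolean hasAvroMagic(byte[] header) {
        return header.length >= MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, MAGIC.length), MAGIC);
    }

    // Reads at most MAGIC.length bytes from the start of the file at path.
    static byte[] readHeader(String path) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            byte[] buf = new byte[MAGIC.length];
            int n = 0;
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break; // file is shorter than the magic
                }
                n += r;
            }
            return Arrays.copyOf(buf, n);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] header = readHeader(args[0]);
        System.out.println(hasAvroMagic(header)
            ? "starts with Avro magic"
            : "no Avro magic, first bytes: " + Arrays.toString(header));
    }
}
```

Running it with the path to part-r-00000 as the only argument reports whether the file itself starts with the magic, which separates a problem in how the file was written from a problem in the reader code.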
Nikhil
On 6/8/12 12:17 PM, "Doug Cutting" <[email protected]> wrote:
>On Fri, Jun 8, 2012 at 11:49 AM, snikhil0 <[email protected]> wrote:
>> My expectation is that I can use the same input schema to read the
>> output file. But alas this is not working.
>> In the part-r-00000 I have a 0<tab>Obj<Avroschema>....datums......
>> Why is this?
>
>That looks approximately like an Avro data file. How is it not what you
>expect?
>
>> Also how can I rename the reduce output file to something other than
>> part-r-0000*?
>
>That's the standard name for Hadoop mapreduce output files. You could
>override it in the OutputFormat, but most folks do not. The name of
>the directory these are in is normally used to identify the result
>set. The files within the directory are just fragments of that result
>set.
>
>Doug