Hi,
(using Hadoop 0.20.2 and Avro 1.4.1)
I have defined a simple Avro object 'AvroObj' (a record of strings),
compiled the schema, and set up a simple MR job whose mapper takes
<Object, Text> as input and emits <Text, IntWritable>, and whose reducer
takes said <Text, IntWritable> and ...
What I would like to achieve is to have the reducer emit
<NullWritable, AvroObj> pairs into an Avro sequence file, so that the next
MR job can open that Avro file and read Avro objects, not text lines, out
of it.
I have looked through the (H ed.2) book and a few online samples but can't
figure out how to do it.
Some online sources mention job config settings like:
job.setOutputFormatClass(AvroOutputFormat.class);
AvroOutputFormat.setCompressOutput(conf, false);
But this doesn't compile: setCompressOutput asks for the deprecated JobConf
object, and "setOutputFormatClass" gives an error about its parameter
(AvroOutputFormat.class is not applicable).
Could someone enlighten me on how to have the reducer write to an Avro
sequence file?
Cheers