Hello Ed,

Per http://avro.apache.org/docs/1.7.4/api/java/org/apache/avro/mapred/AvroReducer.html, AvroReducer has a simple signature of <K,V,OUT>, where OUT can be any record type and not necessarily a Pair<KO,VO> type.
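
To illustrate, here is a rough sketch (not the test case itself) of a reducer that drops the key and emits only the value records. It assumes avro-mapred and Hadoop MR1 are on the classpath; the class names, the "Event" schema, and the "id" field used as the shuffle key are hypothetical placeholders for your own types.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.Pair;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

public class MergeAvroFiles {
  // Hypothetical record schema, for illustration only.
  static final Schema RECORD_SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"string\"},"
      + "{\"name\":\"payload\",\"type\":\"string\"}]}");

  static final Schema KEY_SCHEMA = Schema.create(Schema.Type.STRING);

  // The map output is still a Pair, since the shuffle needs a key.
  public static class MergeMapper
      extends AvroMapper<GenericRecord, Pair<CharSequence, GenericRecord>> {
    @Override
    public void map(GenericRecord datum,
                    AvroCollector<Pair<CharSequence, GenericRecord>> collector,
                    Reporter reporter) throws IOException {
      CharSequence key = datum.get("id").toString();
      collector.collect(new Pair<CharSequence, GenericRecord>(
          key, KEY_SCHEMA, datum, RECORD_SCHEMA));
    }
  }

  // OUT is a plain record, not a Pair, so no "key" field is written.
  public static class MergeReducer
      extends AvroReducer<CharSequence, GenericRecord, GenericRecord> {
    @Override
    public void reduce(CharSequence key, Iterable<GenericRecord> values,
                       AvroCollector<GenericRecord> collector,
                       Reporter reporter) throws IOException {
      for (GenericRecord value : values) {
        collector.collect(value); // the key is simply not emitted
      }
    }
  }

  public static JobConf configure() {
    JobConf conf = new JobConf(MergeAvroFiles.class);
    AvroJob.setInputSchema(conf, RECORD_SCHEMA);
    // Only the intermediate (map) output is declared as a Pair schema.
    AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(KEY_SCHEMA, RECORD_SCHEMA));
    // The final output schema is the bare record schema.
    AvroJob.setOutputSchema(conf, RECORD_SCHEMA);
    AvroJob.setMapperClass(conf, MergeMapper.class);
    AvroJob.setReducerClass(conf, MergeReducer.class);
    return conf;
  }
}
```

Input and output paths and formats are left out; you would set them on the returned JobConf as usual before submitting the job.
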
AvroJob.setOutputSchema(…) should accept non-Pair schemas. I think its Javadoc is incorrect, though. I wrote a test case yesterday at http://issues.apache.org/jira/browse/AVRO-1439, in which I set a non-Pair schema via the same call without any trouble. We could get the Javadoc fixed if it is indeed wrong.

On Thu, Jan 16, 2014 at 2:14 PM, ed <[email protected]> wrote:
> Hello,
>
> I am currently reading in lots of small avro files and then writing them out
> into one large avro file using Map Reduce MR1. I'm trying to do this using
> the AvroMapper and AvroReducer and it's almost working how I want.
>
> The problem right now is that it looks like I have to use
> "org.apache.avro.mapred.Pair" if I use "AvroJob.setOutputSchema". Is there
> a way to output a Pair schema from AvroReducer and have the "key" in that
> schema be ignored (i.e., not included in the output from the reducer)?
> Right now when I check the Reducer output there is an added field in each
> record called "key" which I'd like to not have there.
>
> Essentially I'm looking for something like NullWritable where the key will
> just be ignored in the final output.
>
> Thank you for any assistance or guidance you can provide!
>
> Best Regards,
>
> Ed

--
Harsh J
