Hi, this sounds interesting! What datatype would the input to my mapper have? Or, put differently: how would I distinguish between the different inputs in the mapper?
Thanks,
Markus

On May 11, 2011, at 3:00 PM, Jacob R Rideout wrote:

> We do take the union schema approach, but create the unions
> programmatically in Java:
>
> Something like:
>
> ArrayList<Schema> schemas = new ArrayList<Schema>();
> schemas.add(schema1);
> schemas.add(schema2);
> Schema unionSchema = Schema.createUnion(schemas);
> AvroJob.setInputSchema(job, unionSchema);
>
>
> On Wed, May 11, 2011 at 12:44 PM, Markus Weimer <[email protected]> wrote:
>> Hi,
>>
>> I'd like to write a mapreduce job that uses Avro throughout, but the map
>> phase would need to read files with two different schemas, similar to what
>> the MultipleInputFormat does in stock Hadoop. Is this a supported use case?
>>
>> A work-around would be to create a union schema that has both fields as
>> optional and to convert all data into it, but that seems clumsy.
>>
>> Has anyone done this before?
>>
>> Thanks for any suggestion you can give,
>>
>> Markus
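For reference, here is a rough sketch of how the union-schema approach might look end to end with the old org.apache.avro.mapred API and the generic API. With a union input schema, each datum handed to the mapper is whichever branch the record was written with; for record branches that is a GenericRecord, so the full name of its schema tells the inputs apart. The record names "example.Click" and "example.Impression", the .avsc file names, and the map-only pass-through job are placeholders for illustration, not anything confirmed in this thread.

import java.io.File;
import java.io.IOException;
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

public class UnionInputJob {

  // Mapper over a union of two record schemas: inspect the datum's schema
  // to decide which input it came from.
  public static class UnionInputMapper
      extends AvroMapper<GenericRecord, GenericRecord> {
    @Override
    public void map(GenericRecord datum,
                    AvroCollector<GenericRecord> collector,
                    Reporter reporter) throws IOException {
      String name = datum.getSchema().getFullName();
      if ("example.Click".equals(name)) {             // placeholder record name
        // handle records written with the first schema
      } else if ("example.Impression".equals(name)) { // placeholder record name
        // handle records written with the second schema
      }
      collector.collect(datum); // pass every record through unchanged
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(UnionInputJob.class);
    job.setJobName("avro-union-input-sketch");

    // Placeholder schema files for the two inputs.
    Schema click = new Schema.Parser().parse(new File("click.avsc"));
    Schema impression = new Schema.Parser().parse(new File("impression.avsc"));

    // The union-schema approach from the reply above.
    Schema union = Schema.createUnion(Arrays.asList(click, impression));
    AvroJob.setInputSchema(job, union);
    AvroJob.setOutputSchema(job, union);
    AvroJob.setMapperClass(job, UnionInputMapper.class);
    job.setNumReduceTasks(0); // map-only pass-through for this sketch

    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    JobClient.runJob(job);
  }
}

If the inputs were read with the specific (code-generated) API instead of the generic one, the union branches would arrive as the generated classes, and an instanceof check would serve the same purpose as comparing schema names.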
