I am creating a dynamic union of records, as shown below, but keep receiving 
the exception org.apache.avro.UnresolvedUnionException: Not in union.
Why does Avro consider the same schemas that were used to create the union 
invalid for collection? It throws this for every record it tries to collect. 
An example of this working would be appreciated.

Also, is there such a thing as a null record? The records I am assembling fit 
into a set rather than a map, but I could find no elegant way to express that 
other than defining a record with a single null field.

Inside ToolRunner:
Schema.Parser p = new Schema.Parser();

ArrayList<Schema> keySchemas = new ArrayList<Schema>();
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s1.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s2.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s3.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s4.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s5.avsc")));

Schema keySchema = Schema.createUnion(keySchemas);
Schema valSchema = p.parse(AvroConverter.class.getResourceAsStream("null.avsc"));
AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(keySchema, valSchema));

Inside Mapper Setup:
private static HashMap<String, Schema> keySchemas = new HashMap<String, Schema>();
private static Schema valSchema;
Schema.Parser p = new Schema.Parser();
keySchemas.put("s1", p.parse(Map.class.getResourceAsStream("s1.avsc")));
keySchemas.put("s2", p.parse(Map.class.getResourceAsStream("s2.avsc")));
keySchemas.put("s3", p.parse(Map.class.getResourceAsStream("s3.avsc")));
keySchemas.put("s4", p.parse(Map.class.getResourceAsStream("s4.avsc")));
keySchemas.put("s5", p.parse(Map.class.getResourceAsStream("s5.avsc")));
valSchema = p.parse(Map.class.getResourceAsStream("null.avsc"));

Inside Map function:
GenericData.Record r = null;
// Pick the key schema that matches the input type.
if ("s1".equals(in.type)) {
    r = new GenericData.Record(keySchemas.get("s1"));
} else if ("s2".equals(in.type)) {
    r = new GenericData.Record(keySchemas.get("s2"));
}
oc.collect(new AvroKey<GenericRecord>(r),
        new AvroValue<GenericRecord>(new GenericData.Record(valSchema)));
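
To make the question concrete, here is a minimal sketch of name-based union lookup, under the assumption that a union branch is chosen by the full name of the record's schema. It is a stand-in built on plain Java collections, not Avro's API: the class UnionLookupSketch and the names example.S1 and example.S2 are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a union of named record schemas. NOT Avro's API.
class UnionLookupSketch {
    // Maps each branch's full name to its position in the union.
    private final Map<String, Integer> indexByName = new HashMap<>();

    UnionLookupSketch(List<String> branchFullNames) {
        for (int i = 0; i < branchFullNames.size(); i++) {
            String name = branchFullNames.get(i);
            // Assumption: a union may hold at most one branch per full name.
            if (indexByName.put(name, i) != null) {
                throw new IllegalArgumentException("Duplicate branch in union: " + name);
            }
        }
    }

    // Returns the branch index for a record's full name,
    // or throws if no branch carries that name.
    int resolve(String recordFullName) {
        Integer index = indexByName.get(recordFullName);
        if (index == null) {
            throw new IllegalStateException("Not in union: " + recordFullName);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("example.S1");
        names.add("example.S2");
        UnionLookupSketch union = new UnionLookupSketch(names);

        System.out.println(union.resolve("example.S2")); // prints 1

        try {
            union.resolve("example.S3"); // no such branch
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints Not in union: example.S3
        }
    }
}
```

If that assumption holds, a record would only be rejected when the name of its schema matches none of the branches passed to the union, which is what I would like to confirm or rule out.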
