I am creating a dynamic union of records, as shown below, but keep receiving
the exception org.apache.avro.UnresolvedUnionException: Not in union.
Is there any reason why Avro deems the same schemas that created the union
invalid for collection? It throws this for every record it tries to collect.
An example of this working would be appreciated.
Also, is there such a thing as a null record? The records I am assembling fit
into a Set rather than a Map, but I could find no elegant way to express that
other than defining a record with a single field of type null.
Inside ToolRunner:
Schema.Parser p = new Schema.Parser();
ArrayList<Schema> keySchemas = new ArrayList<Schema>();
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s1.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s2.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s3.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s4.avsc")));
keySchemas.add(p.parse(AvroConverter.class.getResourceAsStream("s5.avsc")));
Schema keySchema = Schema.createUnion(keySchemas);
Schema valSchema = p.parse(AvroConverter.class.getResourceAsStream("null.avsc"));
AvroJob.setMapOutputSchema(conf, Pair.getPairSchema(keySchema, valSchema));
Inside Mapper Setup:
private static HashMap<String, Schema> keySchemas = new HashMap<String, Schema>();
private static Schema valSchema;
Schema.Parser p = new Schema.Parser();
keySchemas.put("s1", p.parse(Map.class.getResourceAsStream("s1.avsc")));
keySchemas.put("s2", p.parse(Map.class.getResourceAsStream("s2.avsc")));
keySchemas.put("s3", p.parse(Map.class.getResourceAsStream("s3.avsc")));
keySchemas.put("s4", p.parse(Map.class.getResourceAsStream("s4.avsc")));
keySchemas.put("s5", p.parse(Map.class.getResourceAsStream("s5.avsc")));
valSchema = p.parse(Map.class.getResourceAsStream("null.avsc"));
Inside Map function:
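GenericData.Record r;
// Compare string contents with equals(); == only compares references.
if ("s1".equals(in.type)) {
    r = new GenericData.Record(keySchemas.get("s1"));
} else if ("s2".equals(in.type)) {
    r = new GenericData.Record(keySchemas.get("s2"));
} else {
    // Without this branch r is not definitely assigned.
    throw new IllegalStateException("Unexpected type: " + in.type);
}
oc.collect(new AvroKey<GenericRecord>(r),
    new AvroValue<GenericRecord>(new GenericData.Record(valSchema)));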
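When dispatching on a type tag such as in.type, note that Java's == on strings compares object references rather than contents, so a content-based lookup is the safer route. Below is a minimal sketch of that lookup in plain Java. It assumes in.type is a String, and it uses resource names as hypothetical stand-ins for the parsed Schema objects; the SchemaLookup class and schemaFor method are illustrative names, not part of Avro. This only rules out one possible source of trouble and is not presented as the cause of the UnresolvedUnionException.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative helper (hypothetical name): maps a type tag to its schema.
// In the real job the map values would be org.apache.avro.Schema objects.
final class SchemaLookup {
    private final Map<String, String> schemas = new HashMap<>();

    SchemaLookup() {
        // Mirrors the s1..s5 registrations done in the mapper setup.
        for (int i = 1; i <= 5; i++) {
            schemas.put("s" + i, "s" + i + ".avsc");
        }
    }

    // HashMap looks keys up via equals()/hashCode(), so contents are compared.
    // Unknown tags fail loudly instead of leaving a record unassigned.
    String schemaFor(String type) {
        String schema = schemas.get(type);
        if (schema == null) {
            throw new IllegalArgumentException("No schema registered for type: " + type);
        }
        return schema;
    }

    public static void main(String[] args) {
        SchemaLookup lookup = new SchemaLookup();
        // new String(...) creates a distinct object, much like a value read from input.
        String tag = new String("s2");
        System.out.println("reference match: " + (tag == "s2"));
        System.out.println("lookup result: " + lookup.schemaFor(tag));
    }
}
```

With a lookup like this, the if/else chain in the map function reduces to a single call, and an unexpected tag surfaces as its own error rather than being mistaken for a union problem.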