[ https://issues.apache.org/jira/browse/AVRO-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexandre Normand resolved AVRO-1145.
-------------------------------------

    Resolution: Not A Problem

It was me misusing the API.

> Can't read union of null and primitive from value written with schema as primitive
> ----------------------------------------------------------------------------------
>
>                 Key: AVRO-1145
>                 URL: https://issues.apache.org/jira/browse/AVRO-1145
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.0
>            Reporter: Alexandre Normand
>         Attachments: TestPrimitiveToUnionResolution.java
>
>
> I'm using Avro's Java generic representation API and have a problem dealing
> with our current case of schema evolution. The scenario we're dealing with
> here is making a primitive-type field optional by changing the field to be a
> {{union}} of {{null}} and that primitive type.
> I'm going to use a simple example. Basically, our schemas are:
> Initial: A record with one field of type {{int}}
> Second version: Same record, same field name, but the type is now a union of
> {{null}} and {{int}}
> According to the [schema
> resolution|http://avro.apache.org/docs/current/spec.html#Schema+Resolution]
> chapter of Avro's spec, the resolution for such a case should be:
> {code}
> if reader's is a union, but writer's is not
>     The first schema in the reader's union that matches
>     the writer's schema is recursively resolved against
>     it. If none match, an error is signalled.
> {code}
> My interpretation is that data serialized with the initial schema should
> resolve properly, since {{int}} is part of the union in the reader's schema.
> However, when running a test that reads back a record serialized with
> version 1 using version 2, I get
> *{{org.apache.avro.AvroTypeException: Attempt to process a int when a union
> was expected.}}*
> Here's a test that shows exactly this:
> {code}
> @Test
> public void testReadingUnionFromValueWrittenAsPrimitive() throws Exception {
>     Schema writerSchema = new Schema.Parser().parse("{\n" +
>         " \"type\":\"record\",\n" +
>         " \"name\":\"NeighborComparisons\",\n" +
>         " \"fields\": [\n" +
>         " {\"name\": \"test\",\n" +
>         " \"type\": \"int\" }]} ");
>     Schema readersSchema = new Schema.Parser().parse(" {\n" +
>         " \"type\":\"record\",\n" +
>         " \"name\":\"NeighborComparisons\",\n" +
>         " \"fields\": [ {\n" +
>         " \"name\": \"test\",\n" +
>         " \"type\": [\"null\", \"int\"],\n" +
>         " \"default\": null } ] }");
>
>     // Writing a record using the initial schema with the
>     // test field defined as an int
>     GenericData.Record record = new GenericData.Record(writerSchema);
>     record.put("test", Integer.valueOf(10));
>     ByteArrayOutputStream output = new ByteArrayOutputStream();
>     JsonEncoder jsonEncoder = EncoderFactory.get().
>         jsonEncoder(writerSchema, output);
>     GenericDatumWriter<GenericData.Record> writer = new
>         GenericDatumWriter<GenericData.Record>(writerSchema);
>     writer.write(record, jsonEncoder);
>     jsonEncoder.flush();
>     output.flush();
>     System.out.println(output.toString());
>
>     // We try reading it back using the second schema
>     // version where the test field is defined as a union of null and int
>     JsonDecoder jsonDecoder = DecoderFactory.get().
>         jsonDecoder(readersSchema, output.toString());
>     GenericDatumReader<GenericData.Record> reader =
>         new GenericDatumReader<GenericData.Record>(writerSchema,
>             readersSchema);
>     GenericData.Record read = reader.read(null, jsonDecoder);
>
>     // We should be able to assert that the value is 10 but it
>     // fails on reading the record before getting here
>     assertEquals(10, read.get("test"));
> }
> {code}
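The resolution rule quoted in the issue ("the first schema in the reader's union that matches the writer's schema") can be illustrated with a small standalone sketch. This is not Avro's implementation: the class and method names below are hypothetical, schemas are reduced to plain type-name strings, and "matches" is simplified to exact name equality, whereas the spec's matching rules cover more cases (for example, numeric promotions).

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; this is not part of the Avro API.
final class UnionResolutionSketch {

    // Returns the first branch of the reader's union whose type name equals
    // the writer's type name, or an empty Optional when no branch matches
    // (the case where the spec says an error is signalled).
    // Assumption: matching is simplified to exact name equality.
    static Optional<String> firstMatchingBranch(String writerType, List<String> readerUnion) {
        for (String branch : readerUnion) {
            if (branch.equals(writerType)) {
                return Optional.of(branch);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Reader's schema from the issue: a union of null and int.
        List<String> readerUnion = List.of("null", "int");

        // Writer's schema is int: the "int" branch is selected.
        System.out.println(firstMatchingBranch("int", readerUnion));    // Optional[int]

        // A writer type absent from the union has no match.
        System.out.println(firstMatchingBranch("string", readerUnion)); // Optional.empty
    }
}
```

Under this simplified reading, an {{int}} written with the initial schema does select the {{int}} branch of the reader's union, which is consistent with the reporter's interpretation of the spec and with the issue being closed as Not A Problem.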