As Doug mentioned in the ticket, the problem is likely:

new SpecificDatumReader<Record>()

This should be

new SpecificDatumReader<Record>(Record.class)

which sets the reader to resolve to the schema found in Record.class.

On 9/20/11 3:44 AM, "Alex Holmes" <[email protected]> wrote:

>Created the following ticket:
>
>https://issues.apache.org/jira/browse/AVRO-891
>
>Thanks,
>Alex
>
>On Tue, Sep 20, 2011 at 6:26 AM, Alex Holmes <[email protected]> wrote:
>> Thanks, I'll add a bug.
>>
>> As a FYI, even without the alias (retaining the original field name),
>> just removing the "id" field yields the exception.
>>
>> On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey <[email protected]> wrote:
>>> That looks like a bug. What happens if there is no aliasing/renaming
>>> involved? Aliasing is a newer feature than field addition, removal,
>>> and promotion.
>>>
>>> This should be easy to reproduce, can you file a JIRA ticket? We
>>> should discuss this further there.
>>>
>>> Thanks!
>>>
>>>
>>> On 9/19/11 6:14 PM, "Alex Holmes" <[email protected]> wrote:
>>>
>>>>OK, I was able to reproduce the exception.
>>>>
>>>>v1:
>>>>{"name": "Record", "type": "record",
>>>>  "fields": [
>>>>    {"name": "name", "type": "string"},
>>>>    {"name": "id", "type": "int"}
>>>>  ]
>>>>}
>>>>
>>>>v2:
>>>>{"name": "Record", "type": "record",
>>>>  "fields": [
>>>>    {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>>>  ]
>>>>}
>>>>
>>>>Step 1. Write Avro file using v1 generated class
>>>>Step 2. Read Avro file using v2 generated class
>>>>
>>>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>>>    at Record.put(Unknown Source)
>>>>    at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>>>    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>>    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>>    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>>    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>>>    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>>>    at Read.readFromAvro(Unknown Source)
>>>>    at Read.main(Unknown Source)
>>>>
>>>>The code to write/read the avro file didn't change from below.
>>>>
>>>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes <[email protected]> wrote:
>>>>> I'm trying to put together a simple test case to reproduce the
>>>>> exception. While I was creating the test case, I hit this behavior
>>>>> which doesn't seem right, but maybe it's my misunderstanding on how
>>>>> forward/backward compatibility should work:
>>>>>
>>>>> Schema v1:
>>>>>
>>>>> {"name": "Record", "type": "record",
>>>>>   "fields": [
>>>>>     {"name": "name", "type": "string"},
>>>>>     {"name": "id", "type": "int"}
>>>>>   ]
>>>>> }
>>>>>
>>>>> Schema v2:
>>>>>
>>>>> {"name": "Record", "type": "record",
>>>>>   "fields": [
>>>>>     {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>>>     {"name": "new_field", "type": "int", "default":"0"}
>>>>>   ]
>>>>> }
>>>>>
>>>>> In the 2nd version I:
>>>>>
>>>>> - removed field "id"
>>>>> - renamed field "name" to "name_rename"
>>>>> - added field "new_field"
>>>>>
>>>>> I write the v1 data file:
>>>>>
>>>>> public static Record createRecord(String name, int id) {
>>>>>   Record record = new Record();
>>>>>   record.name = name;
>>>>>   record.id = id;
>>>>>   return record;
>>>>> }
>>>>>
>>>>> public static void writeToAvro(OutputStream outputStream)
>>>>>     throws IOException {
>>>>>   DataFileWriter<Record> writer =
>>>>>       new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>>>   writer.create(Record.SCHEMA$, outputStream);
>>>>>
>>>>>   writer.append(createRecord("r1", 1));
>>>>>   writer.append(createRecord("r2", 2));
>>>>>
>>>>>   writer.close();
>>>>>   outputStream.close();
>>>>> }
>>>>>
>>>>> I wrote a version-agnostic Read class:
>>>>>
>>>>> public static void readFromAvro(InputStream is) throws IOException {
>>>>>   DataFileStream<Record> reader = new DataFileStream<Record>(
>>>>>       is, new SpecificDatumReader<Record>());
>>>>>   for (Record a : reader) {
>>>>>     System.out.println(ToStringBuilder.reflectionToString(a));
>>>>>   }
>>>>>   IOUtils.cleanup(null, is);
>>>>>   IOUtils.cleanup(null, reader);
>>>>> }
>>>>>
>>>>> Running the Read code against the v1 data file, and including the v1
>>>>> code-generated classes in the classpath produced:
>>>>>
>>>>> Record@6a8c436b[name=r1,id=1]
>>>>> Record@6baa9f99[name=r2,id=2]
>>>>>
>>>>> If I run the same code, but use just the v2 generated classes in the
>>>>> classpath I get:
>>>>>
>>>>> Record@39dd3812[name_rename=r1,new_field=1]
>>>>> Record@27b15692[name_rename=r2,new_field=2]
>>>>>
>>>>> The name_rename field seems to be good, but why would "new_field"
>>>>> inherit the values of the deleted field "id"?
>>>>>
>>>>> Cheers,
>>>>> Alex
>>>>>
>>>>>
>>>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <[email protected]> wrote:
>>>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>>>> I then modified my original schema by adding, deleting and renaming
>>>>>>> some fields, creating version 2 of the schema. After re-creating the
>>>>>>> Java classes I attempted to read the version 1 file using the
>>>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>>>> exception.
>>>>>>
>>>>>> This should work. Can you provide more detail? What is the exception?
>>>>>> A reproducible test case would be great to have.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Doug
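
P.S. For anyone reading this thread later, here is a toy model of why the no-argument reader seems to misbehave. It is plain Java with no Avro dependency, and every name in it (SchemaResolutionSketch, Field, readPositional, readResolved) is made up for illustration; it is not Avro's implementation. The assumption it encodes is that a reader built without an expected schema ends up filling the generated class by the writer's field positions, whereas a reader given Record.class matches fields by name or alias, applies defaults for new fields, and skips fields the reader no longer has.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model only: none of these types are Avro classes. */
final class SchemaResolutionSketch {

    /** One field of a hypothetical reader schema: name, default value, aliases. */
    static final class Field {
        final String name;
        final Object defaultValue;
        final List<String> aliases;

        Field(String name, Object defaultValue, String... aliases) {
            this.name = name;
            this.defaultValue = defaultValue;
            this.aliases = Arrays.asList(aliases);
        }
    }

    /**
     * Positional decoding (assumed behaviour without a reader schema):
     * the i-th value written goes into the i-th slot of the reader's record.
     */
    static Object[] readPositional(Object[] writtenValues, int readerFieldCount) {
        Object[] record = new Object[readerFieldCount];
        for (int i = 0; i < writtenValues.length; i++) {
            if (i >= readerFieldCount) {
                // The reader's class has no slot for this position.
                throw new IllegalStateException("Bad index");
            }
            record[i] = writtenValues[i];
        }
        return record;
    }

    /**
     * Resolved decoding (assumed behaviour with a reader schema):
     * each reader field is looked up in the writer's fields by name, then by
     * alias; a field the writer never wrote gets its default, and writer
     * fields the reader does not declare are simply dropped.
     */
    static Object[] readResolved(List<String> writerFields, Object[] writtenValues,
                                 List<Field> readerFields) {
        Object[] record = new Object[readerFields.size()];
        for (int r = 0; r < readerFields.size(); r++) {
            Field field = readerFields.get(r);
            int w = writerFields.indexOf(field.name);
            for (String alias : field.aliases) {
                if (w < 0) {
                    w = writerFields.indexOf(alias);
                }
            }
            record[r] = (w >= 0) ? writtenValues[w] : field.defaultValue;
        }
        return record;
    }

    public static void main(String[] args) {
        List<String> v1 = Arrays.asList("name", "id");
        Object[] row = {"r1", 1};
        List<Field> v2 = Arrays.asList(
                new Field("name_rename", null, "name"),
                new Field("new_field", 0));

        System.out.println(Arrays.toString(readPositional(row, v2.size())));
        System.out.println(Arrays.toString(readResolved(v1, row, v2)));
    }
}
```

With the v1 row ("r1", 1) and the two-field v2 schema, the positional path prints [r1, 1], so new_field picks up the old id, and the resolved path prints [r1, 0]. With the one-field v2 schema the positional path throws, which mirrors the "Bad index" in the stack trace.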
