Created the following ticket: https://issues.apache.org/jira/browse/AVRO-891
Thanks,
Alex

On Tue, Sep 20, 2011 at 6:26 AM, Alex Holmes <[email protected]> wrote:
> Thanks, I'll add a bug.
>
> As a FYI, even without the alias (retaining the original field name),
> just removing the "id" field yields the exception.
>
> On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey <[email protected]> wrote:
>> That looks like a bug. What happens if there is no aliasing/renaming
>> involved? Aliasing is a newer feature than field addition, removal, and
>> promotion.
>>
>> This should be easy to reproduce, can you file a JIRA ticket? We should
>> discuss this further there.
>>
>> Thanks!
>>
>>
>> On 9/19/11 6:14 PM, "Alex Holmes" <[email protected]> wrote:
>>
>>>OK, I was able to reproduce the exception.
>>>
>>>v1:
>>>{"name": "Record", "type": "record",
>>>  "fields": [
>>>    {"name": "name", "type": "string"},
>>>    {"name": "id", "type": "int"}
>>>  ]
>>>}
>>>
>>>v2:
>>>{"name": "Record", "type": "record",
>>>  "fields": [
>>>    {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>>  ]
>>>}
>>>
>>>Step 1. Write Avro file using v1 generated class
>>>Step 2. Read Avro file using v2 generated class
>>>
>>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>>    at Record.put(Unknown Source)
>>>    at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>>    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>>    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>>    at Read.readFromAvro(Unknown Source)
>>>    at Read.main(Unknown Source)
>>>
>>>The code to write/read the avro file didn't change from below.
>>>
>>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes <[email protected]> wrote:
>>>> I'm trying to put together a simple test case to reproduce the
>>>> exception. While I was creating the test case, I hit this behavior
>>>> which doesn't seem right, but maybe it's my misunderstanding on how
>>>> forward/backward compatibility should work:
>>>>
>>>> Schema v1:
>>>>
>>>> {"name": "Record", "type": "record",
>>>>   "fields": [
>>>>     {"name": "name", "type": "string"},
>>>>     {"name": "id", "type": "int"}
>>>>   ]
>>>> }
>>>>
>>>> Schema v2:
>>>>
>>>> {"name": "Record", "type": "record",
>>>>   "fields": [
>>>>     {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>>     {"name": "new_field", "type": "int", "default":"0"}
>>>>   ]
>>>> }
>>>>
>>>> In the 2nd version I:
>>>>
>>>> - removed field "id"
>>>> - renamed field "name" to "name_rename"
>>>> - added field "new_field"
>>>>
>>>> I write the v1 data file:
>>>>
>>>>   public static Record createRecord(String name, int id) {
>>>>     Record record = new Record();
>>>>     record.name = name;
>>>>     record.id = id;
>>>>     return record;
>>>>   }
>>>>
>>>>   public static void writeToAvro(OutputStream outputStream)
>>>>       throws IOException {
>>>>     DataFileWriter<Record> writer =
>>>>         new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>>     writer.create(Record.SCHEMA$, outputStream);
>>>>
>>>>     writer.append(createRecord("r1", 1));
>>>>     writer.append(createRecord("r2", 2));
>>>>
>>>>     writer.close();
>>>>     outputStream.close();
>>>>   }
>>>>
>>>> I wrote a version-agnostic Read class:
>>>>
>>>>   public static void readFromAvro(InputStream is) throws IOException {
>>>>     DataFileStream<Record> reader = new DataFileStream<Record>(
>>>>         is, new SpecificDatumReader<Record>());
>>>>     for (Record a : reader) {
>>>>       System.out.println(ToStringBuilder.reflectionToString(a));
>>>>     }
>>>>     IOUtils.cleanup(null, is);
>>>>     IOUtils.cleanup(null, reader);
>>>>   }
>>>>
>>>> Running the Read code against the v1 data file, and including the v1
>>>> code-generated classes in the classpath produced:
>>>>
>>>> Record@6a8c436b[name=r1,id=1]
>>>> Record@6baa9f99[name=r2,id=2]
>>>>
>>>> If I run the same code, but use just the v2 generated classes in the
>>>> classpath I get:
>>>>
>>>> Record@39dd3812[name_rename=r1,new_field=1]
>>>> Record@27b15692[name_rename=r2,new_field=2]
>>>>
>>>> The name_rename field seems to be good, but why would "new_field"
>>>> inherit the values of the deleted field "id"?
>>>>
>>>> Cheers,
>>>> Alex
>>>>
>>>>
>>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <[email protected]> wrote:
>>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>>> I then modified my original schema by adding, deleting and renaming
>>>>>> some fields, creating version 2 of the schema. After re-creating the
>>>>>> Java classes I attempted to read the version 1 file using the
>>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>>> exception.
>>>>>
>>>>> This should work. Can you provide more detail? What is the exception?
>>>>> A reproducible test case would be great to have.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Doug
>>>>>
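
Both symptoms reported in this thread ("new_field" picking up the values of "id", and "Bad index" when v2 has only one field) are consistent with the decoded values being copied into the v2 class by position instead of being resolved by field name and alias. One plausible cause, which is an assumption and is not confirmed anywhere in this thread, is that a reader built with the no-argument `new SpecificDatumReader<Record>()` has no reader schema of its own and so falls back to the writer schema stored in the file. The sketch below is a toy model in plain Java, not Avro's implementation; `ResolutionSketch`, `ReaderField`, `byPosition` and `byName` are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResolutionSketch {

    /** Hypothetical stand-in for one field of the reader (v2) schema. */
    static final class ReaderField {
        final String name;
        final Object defaultValue;
        final List<String> aliases;

        ReaderField(String name, Object defaultValue, String... aliases) {
            this.name = name;
            this.defaultValue = defaultValue;
            this.aliases = Arrays.asList(aliases);
        }
    }

    /** Positional copy: writer value i lands in reader field i, whatever its name. */
    static Map<String, Object> byPosition(List<Object> values, List<ReaderField> reader) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            if (i >= reader.size()) {
                // Mirrors the "Bad index" failure when v2 has fewer fields than v1.
                throw new IndexOutOfBoundsException("Bad index: " + i);
            }
            out.put(reader.get(i).name, values.get(i));
        }
        return out;
    }

    /** Name/alias lookup: unmatched reader fields get their default, unmatched writer fields are dropped. */
    static Map<String, Object> byName(List<String> writerNames, List<Object> values,
                                      List<ReaderField> reader) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (ReaderField f : reader) {
            int idx = writerNames.indexOf(f.name);
            for (String alias : f.aliases) {
                if (idx < 0) {
                    idx = writerNames.indexOf(alias);
                }
            }
            out.put(f.name, idx >= 0 ? values.get(idx) : f.defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> v1Names = Arrays.asList("name", "id");
        List<Object> row = Arrays.<Object>asList("r1", 1);
        List<ReaderField> v2 = Arrays.asList(
                new ReaderField("name_rename", null, "name"),
                new ReaderField("new_field", 0));

        System.out.println(byPosition(row, v2));       // {name_rename=r1, new_field=1}
        System.out.println(byName(v1Names, row, v2));  // {name_rename=r1, new_field=0}
    }
}
```

The positional result matches the `new_field=1` output quoted above, while the name-based result is what schema evolution would be expected to give. If the assumption about the cause holds, constructing the reader with the generated class, as in `new SpecificDatumReader<Record>(Record.class)`, may be worth trying so that the v2 schema is used as the reader schema; the JIRA ticket is the place to confirm this.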
