Thanks, I'll file a bug. As an FYI, even without the alias (retaining the original field name), just removing the "id" field yields the exception.

On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey <[email protected]> wrote:
> That looks like a bug. What happens if there is no aliasing/renaming
> involved? Aliasing is a newer feature than field addition, removal, and
> promotion.
>
> This should be easy to reproduce, can you file a JIRA ticket? We should
> discuss this further there.
>
> Thanks!
>
> On 9/19/11 6:14 PM, "Alex Holmes" <[email protected]> wrote:
>
>>OK, I was able to reproduce the exception.
>>
>>v1:
>>{"name": "Record", "type": "record",
>>  "fields": [
>>    {"name": "name", "type": "string"},
>>    {"name": "id", "type": "int"}
>>  ]
>>}
>>
>>v2:
>>{"name": "Record", "type": "record",
>>  "fields": [
>>    {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>  ]
>>}
>>
>>Step 1. Write Avro file using v1 generated class
>>Step 2. Read Avro file using v2 generated class
>>
>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>  at Record.put(Unknown Source)
>>  at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>  at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>  at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>  at Read.readFromAvro(Unknown Source)
>>  at Read.main(Unknown Source)
>>
>>The code to write/read the avro file didn't change from below.
>>
>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes <[email protected]> wrote:
>>> I'm trying to put together a simple test case to reproduce the
>>> exception. While I was creating the test case, I hit this behavior
>>> which doesn't seem right, but maybe it's my misunderstanding on how
>>> forward/backward compatibility should work:
>>>
>>> Schema v1:
>>>
>>> {"name": "Record", "type": "record",
>>>   "fields": [
>>>     {"name": "name", "type": "string"},
>>>     {"name": "id", "type": "int"}
>>>   ]
>>> }
>>>
>>> Schema v2:
>>>
>>> {"name": "Record", "type": "record",
>>>   "fields": [
>>>     {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>     {"name": "new_field", "type": "int", "default":"0"}
>>>   ]
>>> }
>>>
>>> In the 2nd version I:
>>>
>>> - removed field "id"
>>> - renamed field "name" to "name_rename"
>>> - added field "new_field"
>>>
>>> I write the v1 data file:
>>>
>>> public static Record createRecord(String name, int id) {
>>>   Record record = new Record();
>>>   record.name = name;
>>>   record.id = id;
>>>   return record;
>>> }
>>>
>>> public static void writeToAvro(OutputStream outputStream)
>>>     throws IOException {
>>>   DataFileWriter<Record> writer =
>>>       new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>   writer.create(Record.SCHEMA$, outputStream);
>>>
>>>   writer.append(createRecord("r1", 1));
>>>   writer.append(createRecord("r2", 2));
>>>
>>>   writer.close();
>>>   outputStream.close();
>>> }
>>>
>>> I wrote a version-agnostic Read class:
>>>
>>> public static void readFromAvro(InputStream is) throws IOException {
>>>   DataFileStream<Record> reader = new DataFileStream<Record>(
>>>       is, new SpecificDatumReader<Record>());
>>>   for (Record a : reader) {
>>>     System.out.println(ToStringBuilder.reflectionToString(a));
>>>   }
>>>   IOUtils.cleanup(null, is);
>>>   IOUtils.cleanup(null, reader);
>>> }
>>>
>>> Running the Read code against the v1 data file, and including the v1
>>> code-generated classes in the classpath produced:
>>>
>>> Record@6a8c436b[name=r1,id=1]
>>> Record@6baa9f99[name=r2,id=2]
>>>
>>> If I run the same code, but use just the v2 generated classes in the
>>> classpath I get:
>>>
>>> Record@39dd3812[name_rename=r1,new_field=1]
>>> Record@27b15692[name_rename=r2,new_field=2]
>>>
>>> The name_rename field seems to be good, but why would "new_field"
>>> inherit the values of the deleted field "id"?
>>>
>>> Cheers,
>>> Alex
>>>
>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <[email protected]> wrote:
>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>> I then modified my original schema by adding, deleting and renaming
>>>>> some fields, creating version 2 of the schema. After re-creating the
>>>>> Java classes I attempted to read the version 1 file using the
>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>> exception.
>>>>
>>>> This should work. Can you provide more detail? What is the exception?
>>>> A reproducible test case would be great to have.
>>>>
>>>> Thanks,
>>>>
>>>> Doug
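
For reference, Avro's schema resolution rules match record fields by name (honoring reader-side aliases), ignore writer fields the reader lacks, and fill reader-only fields from their defaults. The plain-Java sketch below uses no Avro classes: `ReaderField`, `v2Fields`, `resolveByName`, and `copyByPosition` are hypothetical names introduced only to contrast that rule with a positional copy, which would produce the `new_field=1` output quoted above. It is an illustration of the expected versus observed behavior, not a claim about where the problem sits in Avro's code. It also assumes the integer 0 is what the quoted `"default":"0"` intended.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaResolutionSketch {

    /** Hypothetical stand-in for one field of the reader (v2) schema; not an Avro class. */
    public static final class ReaderField {
        final String name;
        final List<String> aliases;
        final Object defaultValue;

        public ReaderField(String name, List<String> aliases, Object defaultValue) {
            this.name = name;
            this.aliases = aliases;
            this.defaultValue = defaultValue;
        }

        /** A writer field matches if it has the reader field's name or one of its aliases. */
        boolean matches(String writerName) {
            return name.equals(writerName) || aliases.contains(writerName);
        }
    }

    /** The v2 reader schema from the thread, with the default assumed to be the integer 0. */
    public static List<ReaderField> v2Fields() {
        return Arrays.asList(
            new ReaderField("name_rename", Arrays.asList("name"), null),
            new ReaderField("new_field", Collections.<String>emptyList(), Integer.valueOf(0)));
    }

    /** Name-based resolution: unmatched writer fields are dropped, unmatched reader fields get defaults. */
    public static Map<String, Object> resolveByName(
            List<String> writerNames, List<Object> writerValues, List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (ReaderField rf : readerFields) {
            Object value = rf.defaultValue;
            for (int i = 0; i < writerNames.size(); i++) {
                if (rf.matches(writerNames.get(i))) {
                    value = writerValues.get(i);
                    break;
                }
            }
            result.put(rf.name, value);
        }
        return result;
    }

    /** Positional copy: the i-th written value lands in the i-th reader field, names ignored. */
    public static Map<String, Object> copyByPosition(
            List<Object> writerValues, List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<String, Object>();
        for (int i = 0; i < readerFields.size(); i++) {
            ReaderField rf = readerFields.get(i);
            result.put(rf.name, i < writerValues.size() ? writerValues.get(i) : rf.defaultValue);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> v1Names = Arrays.asList("name", "id");        // writer (v1) field order
        List<Object> v1Values = Arrays.<Object>asList("r1", 1);    // first record written in the thread

        System.out.println(resolveByName(v1Names, v1Values, v2Fields()));
        System.out.println(copyByPosition(v1Values, v2Fields()));
    }
}
```

Run as written, `main` prints `{name_rename=r1, new_field=0}` for the name-based path and `{name_rename=r1, new_field=1}` for the positional one; the second line mirrors the output reported in the thread.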
