[ 
https://issues.apache.org/jira/browse/AVRO-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115825#comment-14115825
 ] 

Doug Cutting commented on AVRO-1562:
------------------------------------

I am not convinced it's worthwhile to get Avro reflect to correctly model every 
Java data structure.  Avro schemas do not support inheritance.  This permits 
simple interoperability with programming languages without inheritance, like C.  
Languages that do support inheritance do so in different ways, so it would be 
difficult to map a single inheritance-based schema language to every feature 
of all of them.  Avro data structures are meant to be a common subset, not the 
superset of all features.

Continued efforts to force inheritance into Avro's existing schema language 
result in odder and odder schemas, schemas that only make sense to Avro 
reflect and are not a good basis for data interoperability.

What do others think?

> Add support for types extending Maps/Collections
> ------------------------------------------------
>
>                 Key: AVRO-1562
>                 URL: https://issues.apache.org/jira/browse/AVRO-1562
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>         Attachments: custom_map_and_collections1.patch
>
>
> Consider the following code:
> {code}
> import java.io.ByteArrayOutputStream;
> import java.util.*;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
> public class AvroDerivingMaps
> {
>     public static void main (String [] args) throws Exception
>     {
>         MapDerivedContainer orig = new MapDerivedContainer();
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(MapDerivedContainer.class);
>         System.out.println(schema);
>         
>         ReflectDatumWriter<MapDerivedContainer> datumWriter =
>             new ReflectDatumWriter<MapDerivedContainer>(MapDerivedContainer.class, rdata);
>         DataFileWriter<MapDerivedContainer> fileWriter =
>             new DataFileWriter<MapDerivedContainer>(datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(orig);
>         fileWriter.close();
>     }
> }
> class MapDerived extends HashMap<String, Integer>
> {
>     Integer a = 1;
>     String b = "b";
> }
> class MapDerivedContainer
> {
>     MapDerived2 map = new MapDerived2();
> }
> class MapDerived2 extends MapDerived
> {
>     String c = "c";
> }
> {code}
> \\
> \\
> It prints the following schema and then throws the exception shown after it:
> {code:javascript}
> {"type":"record","name":"MapDerivedContainer","namespace":"avro","fields":[{"name":"map","type":["null",{"type":"record","name":"MapDerived2","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}],"default":null}]}
> {code}
> {color:brown}
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: 
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"MapDerived2","namespace":"avro","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}]: {}
>       at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:600)
>       at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
>       at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:203)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>       at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290)
>       ... 1 more
> {color}
> \\
> \\
> It appears that ReflectData#createSchema() only handles a map when 
> "type instanceof ParameterizedType" holds, so it skips map handling for a 
> class that merely extends a Map.
> GenericData#isMap() has no such restriction, so at write time 
> GenericData#resolveUnion() treats the value as a map, finds no map branch 
> in the union, and fails.
> The same may be true for classes extending ArrayList, Collection, Set etc.
> Also, note the schema for the class extending Map:
> {code:javascript}
> {  
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       {  
>          "name":"a",
>          "type":[  
>             "null",
>             "int"
>          ],
>          "default":null
>       },
>       {  
>          "name":"b",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
> This schema ignores the Map completely.
> Probably, for such a class, the schema should look like:
> {code:javascript}
> {
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       .... // Other fields in the class extending the Map
>       {
>          "name":"BASE_MAP",
>          "type":[
>             "null",
>             "map" ... // Normal map which the class extends (implements?)
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
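
The mismatch described in the quoted report can be illustrated without Avro, 
using only java.lang.reflect.  The sketch below is not Avro's code and not the 
attached patch.  The class name MapSupertypeDemo and the helpers 
isParameterized and findMapSupertype are hypothetical names introduced here 
for illustration.  It shows that the declared type MapDerived2.class is a 
plain Class (so a "type instanceof ParameterizedType" test is false), that an 
instance is still a java.util.Map, and that the key and value types can be 
recovered by walking the superclass chain.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

public class MapSupertypeDemo {
    // Same shape as the classes in the report.
    public static class MapDerived extends HashMap<String, Integer> {
        Integer a = 1;
        String b = "b";
    }

    public static class MapDerived2 extends MapDerived {
        String c = "c";
    }

    /** True only when the type itself carries type arguments, e.g. Map<String, Integer>. */
    public static boolean isParameterized(Type type) {
        return type instanceof ParameterizedType;
    }

    /**
     * Walks the superclass chain and returns the first parameterized
     * superclass whose raw type is a Map, or null if there is none.
     * Assumption: only superclasses are inspected, not directly
     * implemented interfaces.
     */
    public static ParameterizedType findMapSupertype(Class<?> c) {
        for (Class<?> k = c; k != null && k != Object.class; k = k.getSuperclass()) {
            Type sup = k.getGenericSuperclass();
            if (sup instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) sup;
                Type raw = p.getRawType();
                if (raw instanceof Class && Map.class.isAssignableFrom((Class<?>) raw)) {
                    return p;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Type declared = MapDerived2.class;
        Object value = new MapDerived2();

        // The field's declared type is a plain Class: prints false.
        System.out.println(isParameterized(declared));
        // The runtime value is nevertheless a Map: prints true.
        System.out.println(value instanceof Map);
        // The parameterized Map supertype, found two levels up the chain.
        System.out.println(findMapSupertype(MapDerived2.class));
    }
}
```

The two checks disagree for MapDerived2, which matches the symptom in the 
report: a record schema is generated for the class, while the value is 
treated as a map when it is written.  Whether walking the supertypes like 
this is an appropriate basis for a fix is exactly the design question raised 
in the comment above.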



--
This message was sent by Atlassian JIRA
(v6.2#6252)
