[ 
https://issues.apache.org/jira/browse/AVRO-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357628#comment-14357628
 ] 

ASF GitHub Bot commented on AVRO-1562:
--------------------------------------

GitHub user sachingsachin opened a pull request:

    https://github.com/apache/avro/pull/25

    AVRO-1562: Support classes extending Maps/Collections in Avro

    For classes extending collections and maps, "_avro_implicit_collection_"
    and "_avro_implicit_map_" fields are added in the Avro representation
    while the class itself continues to hold other fields normally like a
    record.
    The implicit field-names are currently hardcoded as above.
    We will need to have an API in ReflectData for making this configurable.
    
    We may also need some name-mangling to have different names for classes
    extending different parameterized collections/maps.
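
The idea in the description above can be sketched with the JDK alone. This is
not the code in the pull request: toRecordView() is a hypothetical helper (not
an Avro API) that copies the declared fields of a map-derived class into a
record-like view and stores the map entries under the hardcoded
"_avro_implicit_map_" name from the description. It assumes the user's classes
live outside the java.* packages.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class ImplicitMapSketch {
    // Field name taken from the pull request description.
    static final String IMPLICIT_MAP = "_avro_implicit_map_";

    // Same shape as the MapDerived class in the issue description.
    static class MapDerived extends HashMap<String, Integer> {
        Integer a = 1;
        String b = "b";
    }

    // Hypothetical helper, not part of Avro: builds a record-like view of a
    // map-derived object. Declared instance fields become ordinary entries
    // and the map contents go under the implicit field name.
    static Map<String, Object> toRecordView(Map<?, ?> datum) {
        Map<String, Object> view = new LinkedHashMap<>();
        // Walk the user-defined part of the hierarchy; stop at JDK classes
        // such as HashMap, whose internals are not record fields.
        for (Class<?> c = datum.getClass();
             c != null && !c.getName().startsWith("java.");
             c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue;
                }
                f.setAccessible(true);
                try {
                    view.put(f.getName(), f.get(datum));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        // Copy the entries so the view does not alias the original object.
        view.put(IMPLICIT_MAP, new HashMap<Object, Object>(datum));
        return view;
    }

    public static void main(String[] args) {
        MapDerived m = new MapDerived();
        m.put("x", 42);
        System.out.println(toRecordView(m));
    }
}
```

A collection-derived class could be handled the same way under
"_avro_implicit_collection_". A fixed name can collide with a user field of the
same name, which is one reason the description asks for a configurable name.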

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachingsachin/avro AVRO-1562

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/avro/pull/25.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #25
    
----
commit f13568f3064f56064edddeb2d122e11888e23831
Author: Sachin Goyal <sgo...@walmart.com>
Date:   2015-03-11T21:12:18Z

    AVRO-1562: Support classes extending Maps/Collections in Avro
    
    For classes extending collections and maps, "_avro_implicit_collection_"
    and "_avro_implicit_map_" fields are added in the Avro representation
    while the class itself continues to hold other fields normally like a
    record.
    The implicit field-names are currently hardcoded as above.
    We will need to have an API in ReflectData for making this configurable.
    
    We may also need some name-mangling to have different names for classes
    extending different parameterized collections/maps.

----


> Add support for types extending Maps/Collections
> ------------------------------------------------
>
>                 Key: AVRO-1562
>                 URL: https://issues.apache.org/jira/browse/AVRO-1562
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>         Attachments: custom_map_and_collections1.patch
>
>
> Consider the following code:
> {code}
> import java.io.ByteArrayOutputStream;
> import java.util.*;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
> public class AvroDerivingMaps
> {
>     public static void main (String [] args) throws Exception
>     {
>         MapDerivedContainer orig = new MapDerivedContainer();
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(MapDerivedContainer.class);
>         System.out.println(schema);
>         
>         ReflectDatumWriter<MapDerivedContainer> datumWriter =
>             new ReflectDatumWriter<MapDerivedContainer>(MapDerivedContainer.class, rdata);
>         DataFileWriter<MapDerivedContainer> fileWriter =
>             new DataFileWriter<MapDerivedContainer>(datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(orig);
>         fileWriter.close();
>     }
> }
> class MapDerived extends HashMap<String, Integer>
> {
>     Integer a = 1;
>     String b = "b";
> }
> class MapDerivedContainer
> {
>     MapDerived2 map = new MapDerived2();
> }
> class MapDerived2 extends MapDerived
> {
>     String c = "c";
> }
> {code}
> \\
> \\
> It prints the following schema and then throws an exception:
> {code:javascript}
> {"type":"record","name":"MapDerivedContainer","namespace":"avro","fields":[{"name":"map","type":["null",{"type":"record","name":"MapDerived2","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}],"default":null}]}
> {code}
> {color:brown}
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: 
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"MapDerived2","namespace":"avro","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}]: {}
>       at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:600)
>       at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
>       at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:203)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>       at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290)
>       ... 1 more
> {color}
> \\
> \\
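> It appears that ReflectData#createSchema() only handles maps when "type 
> instanceof ParameterizedType" holds. A class that extends a map is a plain 
> Class rather than a ParameterizedType, so its map contents are skipped and 
> it is given a record schema.
> GenericData#isMap() has no such restriction, so the object is treated as a 
> map at write time and GenericData#resolveUnion() fails because the union 
> contains no map schema.
> The same may be true for classes extending ArrayList, Collection, Set etc.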
> Also, note the schema for the class extending Map:
> {code:javascript}
> {  
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       {  
>          "name":"a",
>          "type":[  
>             "null",
>             "int"
>          ],
>          "default":null
>       },
>       {  
>          "name":"b",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
> This schema ignores the Map completely.
> Probably, for such a class, the schema should look like:
> {code:javascript}
> {
>    "type":"record",
>    "name":"MapDerived2",
>    "fields":[  
>       {  
>          "name":"c",
>          "type":[  
>             "null",
>             "string"
>          ],
>          "default":null
>       },
>       .... // Other fields in the class extending the Map
>       {
>          "name":"BASE_MAP",
>          "type":[
>             "null",
>             "map" ... // Normal map which the class extends (implements?)
>          ],
>          "default":null
>       }
>    ]
> }
> {code}
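
The ParameterizedType observation in the issue description can be checked with
plain reflection, without Avro on the classpath. The sketch below is only an
illustration of that observation; the class and method names are made up for
the example. It compares a field declared as Map<String, Integer> with a class
that extends HashMap<String, Integer>.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

public class ParameterizedTypeCheck {
    // Same shape as the MapDerived class in the issue description.
    static class MapDerived extends HashMap<String, Integer> {
        Integer a = 1;
        String b = "b";
    }

    // A field whose declared type is a parameterized map.
    static Map<String, Integer> plainMap;

    // The kind of test the issue attributes to ReflectData#createSchema().
    static boolean isParameterized(Type t) {
        return t instanceof ParameterizedType;
    }

    public static void main(String[] args) throws Exception {
        Type declared = ParameterizedTypeCheck.class
                .getDeclaredField("plainMap").getGenericType();

        // Map<String, Integer> as a declared field type is parameterized.
        System.out.println(isParameterized(declared));          // true
        // The subclass itself is a plain Class, so the check fails.
        System.out.println(isParameterized(MapDerived.class));  // false
        // At runtime the instance is still a Map.
        System.out.println(new MapDerived() instanceof Map);    // true
        // The type arguments are recoverable from the generic superclass.
        System.out.println(
                isParameterized(MapDerived.class.getGenericSuperclass())); // true
    }
}
```

The last line suggests one place a fix could look for the key and value types:
the generic superclass (or generic interfaces) of the derived class. Whether the
pull request does this is not stated in the message.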



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
