[ 
https://issues.apache.org/jira/browse/AVRO-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978470#action_12978470
 ] 

Doug Cutting commented on AVRO-695:
-----------------------------------

The current patch is a specification change, since it adds a new schema type, 
"cycle".  It is back-compatible, but not forward-compatible: implementations 
that do not implement cycles would be unable to read data that contains cycle 
schemas.

Instead, a schema for a cycle reference might be defined.  For example, one 
could define an org.apache.avro.cycles.CycleReference record containing a single 
integer field.  CycleReference would only be used in unions with types that it 
may refer to.  The DatumWriter would keep an IdentityHashMap<Object,Integer> of 
records, maps and arrays written, adding an entry the first time an instance is 
seen, and writing a CycleReference for subsequent occurrences in appropriate 
unions.  The DatumReader would then keep an array of all records, maps and 
arrays that have been read and, when it reads a CycleReference in a union, 
return a pointer to the indicated element of that array.
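
A minimal sketch of that bookkeeping follows.  It deliberately avoids the real 
Avro API: Node, CycleReference, write and read are hypothetical stand-ins, and a 
flat token list stands in for the encoded stream.  It only illustrates the 
identity map on the write side and the array of already-read objects on the 
read side, and is not a proposed implementation.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class CycleSketch {
    /** Toy record with one self-typed field, like Message.previous in the issue. */
    static final class Node {
        final String name;
        Node previous;
        Node(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the CycleReference record: one integer field. */
    static final class CycleReference {
        final int index;
        CycleReference(int index) { this.index = index; }
    }

    /** Writer side: number each instance on first sight, emit a reference afterwards. */
    static List<Object> write(Node root) {
        Map<Object, Integer> seen = new IdentityHashMap<>();
        List<Object> out = new ArrayList<>();
        Node cur = root;
        while (cur != null) {
            Integer index = seen.get(cur);
            if (index != null) {
                // Already written: emit the reference branch of the union and stop.
                out.add(new CycleReference(index));
                return out;
            }
            seen.put(cur, seen.size()); // first occurrence gets the next index
            out.add(cur.name);
            cur = cur.previous;
        }
        out.add(null); // null branch of the union: the chain ends here
        return out;
    }

    /** Reader side: remember every object read, resolve references by index. */
    static Node read(List<Object> tokens) {
        List<Node> seen = new ArrayList<>();
        Node root = null;
        Node last = null;
        for (Object token : tokens) {
            if (token == null) {
                break; // end of chain
            }
            boolean isReference = token instanceof CycleReference;
            Node next;
            if (isReference) {
                next = seen.get(((CycleReference) token).index);
            } else {
                next = new Node((String) token);
                seen.add(next);
            }
            if (last == null) {
                root = next;
            } else {
                last.previous = next;
            }
            if (isReference) {
                break; // the rest of the graph has already been read
            }
            last = next;
        }
        return root;
    }
}
```

For the hierarchy in the issue description (message.previous is message2, and 
message2.previous is message2 itself), write produces two names followed by a 
CycleReference with index 1, and read rebuilds the self-reference.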

Adding cycles should not slow applications that do not require this feature.  
They could be implemented in newly defined CycleDatumReader/Writer, or perhaps 
GenericDatumReader and GenericDatumWriter could be modified to optionally 
handle such cycles.

> Cycle Reference Support
> -----------------------
>
>                 Key: AVRO-695
>                 URL: https://issues.apache.org/jira/browse/AVRO-695
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>    Affects Versions: 1.4.1
>            Reporter: Moustapha Cherri
>             Fix For: 1.5.0
>
>         Attachments: avro-1.4.1-cycle.patch.gz, avro-1.4.1-cycle.patch.gz
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This is a proposed implementation to add cycle reference support to Avro. It 
> introduces a new type named Cycle. A Cycle contains a string 
> representing the path to the other reference.
> For example, suppose we have an object of type Message that has a member named 
> previous, also of type Message, and we have this hierarchy:
> message
>   previous : message2
> message2
>   previous : message2
> When serializing, the cycle path for "message2.previous" will be "previous".
> The implementation depends on ANTLR to evaluate those cycles at read time to 
> resolve them. I used ANTLR 3.2. This dependency is not mandatory; I just used 
> ANTLR to speed things up. I kept the ANTLR-generated code in this 
> implementation, though it should instead be generated 
> during the build. I only updated the Java code.
> I did not do full unit testing, but you can find the "avrotest.Main" class, 
> which can be used as a preliminary test.
> Please do not hesitate to contact me for further clarification if this seems 
> interesting.
> Best regards,
> Moustapha Cherri

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
