[jira] [Commented] (AVRO-1533) permit promotions between string and bytes
[ https://issues.apache.org/jira/browse/AVRO-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049741#comment-14049741 ]

Martin Kleppmann commented on AVRO-1533:

Only just saw this, sorry for the delay. I am wondering what this means for validation of schema compatibility, e.g. AVRO-1315. If a bytes field is changed to string, and runtime errors are therefore possible due to invalid UTF-8, should the schemas be considered compatible? If people are going to read "validation succeeded" as "guaranteed no runtime errors", this is potentially an issue.

permit promotions between string and bytes

Key: AVRO-1533
URL: https://issues.apache.org/jira/browse/AVRO-1533
Project: Avro
Issue Type: New Feature
Components: java
Reporter: Doug Cutting
Assignee: Doug Cutting
Fix For: 1.7.7
Attachments: AVRO-1533.patch, AVRO-1533.patch

Avro strings are a subset of bytes, so promoting from string to bytes is lossless and should be possible. Promotion from bytes to string may cause problems, as not all byte strings are valid UTF-8, but it also might be useful.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (AVRO-1536) Remove monkeypatching of Enumerable
Martin Kleppmann created AVRO-1536: -- Summary: Remove monkeypatching of Enumerable Key: AVRO-1536 URL: https://issues.apache.org/jira/browse/AVRO-1536 Project: Avro Issue Type: Improvement Components: ruby Affects Versions: 1.7.6 Reporter: Martin Kleppmann Fix For: 1.7.7 The Avro Ruby gem adds a method {{collect_hash}} to the core module {{Enumerable}}. It's bad form for a library to extend core modules like this, and it's also unnecessary (stdlib methods can do the job perfectly well). -- This message was sent by Atlassian JIRA (v6.2#6252)
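As a rough illustration of the claim that the standard library already covers this, the sketch below builds a hash from an enumerable without touching {{Enumerable}}. It assumes {{collect_hash}} maps each element to a [key, value] pair and collects the pairs into a Hash. That reading, and the sample fields data, are assumptions rather than something quoted from the gem.

```ruby
# Illustrative data only: pairs of field name and Avro type name.
fields = [["guid", "string"], ["created_at", "long"]]

# Hash[] accepts an array of [key, value] pairs.
by_name = Hash[fields.map { |name, type| [name, type.upcase] }]

# inject threads an accumulator hash through the iteration.
by_name_inject = fields.inject({}) do |acc, (name, type)|
  acc[name] = type.upcase
  acc
end
```

Both forms rely only on core methods, so no extension of a core module is needed.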
[jira] [Updated] (AVRO-1536) Remove monkeypatching of Enumerable
[ https://issues.apache.org/jira/browse/AVRO-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martin Kleppmann updated AVRO-1536: --- Attachment: AVRO-1536.patch Attached a patch, extracted from [~wvanbergen]'s patch for AVRO-1499. Willem, could you please check whether this looks right? Remove monkeypatching of Enumerable --- Key: AVRO-1536 URL: https://issues.apache.org/jira/browse/AVRO-1536 Project: Avro Issue Type: Improvement Components: ruby Affects Versions: 1.7.6 Reporter: Martin Kleppmann Fix For: 1.7.7 Attachments: AVRO-1536.patch The Avro Ruby gem adds a method {{collect_hash}} to the core module {{Enumerable}}. It's bad form for a library to extend core modules like this, and it's also unnecessary (stdlib methods can do the job perfectly well). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (AVRO-1499) Ruby 2+ Writes Invalid avro files using the avro gem
[ https://issues.apache.org/jira/browse/AVRO-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Kleppmann updated AVRO-1499:
Attachment: AVRO-1499-3.patch

Thanks for your patch, [~wvanbergen]. I've attached a new patch which merges our changes together:
- Removing the monkeypatching of Enumerable is a good idea, but it's a separate issue, so I've split it out into AVRO-1536.
- Given that we've just removed a monkeypatch on a core module, I'm not so keen to add a new one for String#bytesize. I've changed it to check for the presence of String#bytesize at the point where it's needed.
- I've retained my change to set the encoding of the buffer to BINARY. That by itself is actually sufficient, because String#size returns the number of bytes if the encoding is set to binary. However, I've also kept it using #bytesize to make clear what's going on.

I've tested it in a broad range of Ruby versions. I'll commit this soon unless there are objections.

Ruby 2+ Writes Invalid avro files using the avro gem

Key: AVRO-1499
URL: https://issues.apache.org/jira/browse/AVRO-1499
Project: Avro
Issue Type: Bug
Components: ruby
Affects Versions: 1.7.5
Reporter: Michael Ries
Assignee: Martin Kleppmann
Labels: ruby
Fix For: 1.7.7
Attachments: AVRO-1499-2.patch, AVRO-1499-3.patch, AVRO-1499.patch

The rubygem writes corrupted avro files under ruby 2.0.0 and ruby 2.1.1. It appears to work correctly under jruby-1.7.10 and ruby 1.9.3.
Here is a reproducible example:

```ruby
require 'avro'

# Sample records; Avro's Ruby API expects string keys.
data = [
  {"guid"=>"144045de-eb44-dd1b-d9af-6c8b5d41a96e", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617818, "updated_at"=>1398180288, "deleted_at"=>nil},
  {"guid"=>"51e06057-14d2-7527-81fa-b07dba0a263b", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"Student Loans R' Us", "created_at"=>1386178342, "updated_at"=>1398180286, "deleted_at"=>nil},
  {"guid"=>"b4d1d99f-4351-d0e7-221c-a3fae08716bc", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617026, "updated_at"=>1398180288, "deleted_at"=>nil},
  {"guid"=>"084638fa-a78d-bbdd-e075-7c9c957a9b46", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617138, "updated_at"=>1398180288, "deleted_at"=>nil},
  {"guid"=>"79287c76-4e8f-0a21-7569-a2bcdc2b2f4d", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617135, "updated_at"=>1398180288, "deleted_at"=>nil},
  {"guid"=>"3bcc26b2-7d3b-6c4d-cb27-4eb1574b3c20", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"Cayman Islands Bank", "created_at"=>1386902345, "updated_at"=>1398180288, "deleted_at"=>nil},
  {"guid"=>"75e1e56c-7611-4030-d002-afa2af70e5a1", "user_guid"=>"0cd41235-5c14-eae9-00ed-c6eb11dd9119", "name"=>"My Awesome Bank", "created_at"=>1390617427, "updated_at"=>1398180288, "deleted_at"=>nil},
]

# Writer schema as a JSON heredoc.
member_schema = <<-SCHEMA
{"namespace": "md.data_logs",
 "type": "record",
 "name": "Member",
 "fields": [
   {"name": "guid", "type": "string"},
   {"name": "user_guid", "type": "string"},
   {"name": "name", "type": ["string", "null"]},
   {"name": "created_at", "type": "long"},
   {"name": "updated_at", "type": "long"},
   {"name": "deleted_at", "type": ["long", "null"]}
 ]
}
SCHEMA

filepath = "./members.avro"
File.unlink(filepath) if File.exist?(filepath)

# Write every record to the data file.
Avro::DataFile.open(filepath, "w", member_schema) do |dw|
  data.each do |entry|
    dw << entry
  end
end

# Read the records back.
entries = []
Avro::DataFile.open(filepath, "r") do |reader|
  reader.each do |entry|
    entries << entry
  end
end

puts "Here is the data I wrote into the file:"
data.each{|e| p e }
print "\n\n\n\n"
puts "Here is the data I read from the file:"
entries.each{|e| p e }
```

Under ruby 2+ it fails with the message "undefined method 'unpack' for nil:NilClass (NoMethodError)". I have also tested that the rubygem can correctly read avro files written by the java client, but the java client fails to read files written by the ruby client, so the issue is definitely in how the rubygem is trying to write the binary file.
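The difference between String#size and String#bytesize that the comment on this issue relies on can be seen in a few lines. This is a minimal sketch, not the code from the patch: the helper name byte_length and the sample string are made up for illustration.

```ruby
# Hypothetical helper, not the gem's code: prefer bytesize when it exists,
# checking at the point of use as the comment describes.
def byte_length(str)
  str.respond_to?(:bytesize) ? str.bytesize : str.size
end

text = "h\u00e9llo"                          # 5 characters, 6 bytes in UTF-8
binary = text.dup.force_encoding("BINARY")   # same bytes, binary encoding

char_count   = text.size          # counts characters for a UTF-8 string
byte_count   = byte_length(text)  # counts bytes
binary_count = binary.size        # counts bytes because the encoding is binary
```

A length prefix computed from the character count would be too short for any non-ASCII string, which is one way a writer could produce a file that readers cannot parse.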
Re: Ruby gem fork - contribute back?
On Wed, Jul 2, 2014 at 4:36 AM, Martin Kleppmann mar...@kleppmann.com wrote:

> FWIW, Ruby isn't the only language with a tricky setup. I spent ages trying to get the Avro tests for PHP to work, for example. As discussed on another thread [1], I think a Docker container might be a good way of building a baseline configuration on which everyone can easily test changes and make release candidates. Any help with this would be most welcome.

Martin,

Is there already a ticket to track this part of the effort?

-Sean
[jira] [Commented] (AVRO-1516) Unit test failure in Ruby 2.0 and above
[ https://issues.apache.org/jira/browse/AVRO-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050045#comment-14050045 ]

Willem van Bergen commented on AVRO-1516:

The problem is `union[double, int]` types. Integers get encoded as doubles, because that's the first matching type. When the value is read back, it is read as a double instead of an int. In more recent versions of Ruby, the equality operator returns false when comparing the int with the double.

The fix requires changing the `schema#validate` method, something like this: https://github.com/wvanbergen/tros/commit/e5941173c37553b417663ae0ef4e6b4d9265c65b

I can work on a patch when I get back from vacation, at the end of this month.

Unit test failure in Ruby 2.0 and above

Key: AVRO-1516
URL: https://issues.apache.org/jira/browse/AVRO-1516
Project: Avro
Issue Type: Test
Components: ruby
Affects Versions: 1.7.6
Reporter: Martin Kleppmann

The following unit test fails when run with Ruby 2.0 and above:

{noformat}
$ bundle exec rake test
/Users/mkleppma/.rubies/ruby-2.0.0-p195/bin/ruby -Ilib:ext:bin:test -I/Users/mkleppma/.gem/ruby/2.0.0/gems/rake-10.3.1/lib /Users/mkleppma/.gem/ruby/2.0.0/gems/rake-10.3.1/lib/rake/rake_test_loader.rb test/test_datafile.rb test/test_help.rb test/test_io.rb test/test_protocol.rb test/test_schema.rb test/test_socket_transport.rb
Run options:

# Running tests:

[30/41] TestIO#test_union = 0.00 s
  1) Failure:
test_union(TestIO) [/Users/mkleppma/Applications/avro/lang/ruby/test/test_io.rb:339]:
-3372032630846393039 expected but was
-3.372032630846393e+18.

Finished tests in 0.346139s, 118.4495 tests/s, 2207.2058 assertions/s.

41 tests, 764 assertions, 1 failures, 0 errors, 0 skips

ruby -v: ruby 2.0.0p195 (2013-05-14 revision 40734) [x86_64-darwin12.3.0]
rake aborted!
Command failed with status (1): [ruby -Ilib:ext:bin:test -I/Users/mkleppma/.gem/ruby/2.0.0/gems/rake-10.3.1/lib /Users/mkleppma/.gem/ruby/2.0.0/gems/rake-10.3.1/lib/rake/rake_test_loader.rb test/test_datafile.rb test/test_help.rb test/test_io.rb test/test_protocol.rb test/test_schema.rb test/test_socket_transport.rb ]
/Users/mkleppma/.gem/ruby/2.0.0/gems/echoe-4.6.5/lib/echoe.rb:749:in `block in define_tasks'
Tasks: TOP => test_inner
(See full trace by running task with --trace)
{noformat}

Brief investigation suggests that this isn't a bug in Avro, but just a badly written test. The test is comparing -3372032630846393039 and -3372032630846393000.0, which Ruby 1.9 and below consider to be equal, but Ruby 2.0 and above consider to be non-equal. Our tests shouldn't be relying on such edge cases of type coercion.
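The "first matching type" behaviour described in the comment on this issue can be sketched without the Avro gem. Everything named below (the validator tables and first_matching_branch) is hypothetical and only mimics the idea; it is not the gem's `schema#validate` implementation or the linked fix.

```ruby
# Hypothetical validators, not the Avro gem's API.
# The lenient table accepts any number as a double, integers included.
def lenient_validators
  {
    "double" => lambda { |v| v.is_a?(Numeric) },
    "int"    => lambda { |v| v.is_a?(Integer) }
  }
end

# The strict table only accepts Float values as doubles.
def strict_validators
  lenient_validators.merge("double" => lambda { |v| v.is_a?(Float) })
end

# Pick the first union branch whose validator accepts the datum.
def first_matching_branch(branches, validators, datum)
  branches.find { |type| validators[type].call(datum) }
end

branches = ["double", "int"]
lenient_choice = first_matching_branch(branches, lenient_validators, 42)
strict_choice  = first_matching_branch(branches, strict_validators, 42)

# The comparison from the failing test: a large integer against the
# double it turns into after a round trip through the double branch.
big = -3372032630846393039
values_equal = (big == big.to_f)
```

With the lenient table the integer lands in the double branch, so it comes back as a Float. On Ruby 2.0 and above the final comparison is false, as the test output shows.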
[jira] [Commented] (AVRO-1499) Ruby 2+ Writes Invalid avro files using the avro gem
[ https://issues.apache.org/jira/browse/AVRO-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050073#comment-14050073 ]

Willem van Bergen commented on AVRO-1499:

W.r.t. monkey patching String: I agree with you in general, but in this case I think it is preferable to doing a `respond_to` in the write method.
- It basically backports a method in a way that is completely compatible with Ruby 1.9+, and it only does so if it's not available.
- This makes it a lot easier to drop 1.8 support later. Just one file of backports to delete, instead of having to go through the entire source code to find occurrences.
- Performance: only one respond_to check when the library is loaded, instead of a check on every write.

I have no real strong feelings about it, so feel free to ignore this :)

Ruby 2+ Writes Invalid avro files using the avro gem

Key: AVRO-1499
URL: https://issues.apache.org/jira/browse/AVRO-1499
Project: Avro
Issue Type: Bug
Components: ruby
Affects Versions: 1.7.5
Reporter: Michael Ries
Assignee: Martin Kleppmann
Labels: ruby
Fix For: 1.7.7
Attachments: AVRO-1499-2.patch, AVRO-1499-3.patch, AVRO-1499.patch

The rubygem writes corrupted avro files under ruby 2.0.0 and ruby 2.1.1. It appears to work correctly under jruby-1.7.10 and ruby 1.9.3.
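The load-time backport argued for in the comment on this issue might look roughly like the following. It is a sketch of the general pattern under the assumption that Ruby 1.8 strings are plain byte sequences; it is not taken from any of the attached patches.

```ruby
# Hypothetical backports file: the method is only defined when it is missing,
# so on Ruby 1.9+ this block does nothing.
unless String.method_defined?(:bytesize)
  String.class_eval do
    # On Ruby 1.8 a String is a byte sequence, so size is the byte count.
    def bytesize
      size
    end
  end
end

# Callers can then use bytesize unconditionally.
length = "h\u00e9llo".bytesize
```

The check runs once when the file is loaded, which is the performance point made above. The trade-off is that a core class is still being extended, even if only on old interpreters.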
[jira] [Commented] (AVRO-1536) Remove monkeypatching of Enumerable
[ https://issues.apache.org/jira/browse/AVRO-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050222#comment-14050222 ] Willem van Bergen commented on AVRO-1536: - Looks good! Remove monkeypatching of Enumerable --- Key: AVRO-1536 URL: https://issues.apache.org/jira/browse/AVRO-1536 Project: Avro Issue Type: Improvement Components: ruby Affects Versions: 1.7.6 Reporter: Martin Kleppmann Assignee: Martin Kleppmann Fix For: 1.7.7 Attachments: AVRO-1536.patch The Avro Ruby gem adds a method {{collect_hash}} to the core module {{Enumerable}}. It's bad form for a library to extend core modules like this, and it's also unnecessary (stdlib methods can do the job perfectly well). -- This message was sent by Atlassian JIRA (v6.2#6252)
Circular references and non-string map-keys patch
Hi Avro Developers, I have submitted a patch for Circular references and non-string map-keys support in Avro at https://issues.apache.org/jira/browse/AVRO-695 Can someone guide me how I can get it reviewed and commit this patch? Thanks for helping me out, Sachin
[jira] [Commented] (AVRO-1533) permit promotions between string and bytes
[ https://issues.apache.org/jira/browse/AVRO-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050739#comment-14050739 ]

Doug Cutting commented on AVRO-1533:

It won't generate runtime errors for invalid UTF-8, but instead replaces erroneous sequences with the replacement character � (U+FFFD): http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#String(byte[],%20java.nio.charset.Charset)

I think this can be considered a compatible change, since it won't break existing applications. Today, attempts to switch a field from bytes to string would fail. I suppose an application could currently rely on such failures, but I consider that unlikely enough that I'm willing to ignore it. Do others disagree?

We could:
# revert this change entirely, declaring it incompatible
# revert just the change to the specification, so that Avro Java is more lenient in what conversions it permits than the specification (following Postel's law)
# file issues to update the AVRO-1315 schema validation to permit such conversions, and also file issues for C, C++ and C# to update their schema resolution to support these conversions

Thoughts?
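The asymmetry between the two promotions can be shown with plain strings. The sketch below uses Ruby only as an analogy for the Java behaviour described in this comment; it is not the Avro patch. String#scrub requires Ruby 2.1 or later, and the sample bytes are made up.

```ruby
# String -> bytes: reinterpret the UTF-8 bytes; nothing is lost.
text = "caf\u00e9"
as_bytes = text.dup.force_encoding("BINARY")
restored = as_bytes.dup.force_encoding("UTF-8")

# Bytes -> string: the byte 0xFF can never appear in valid UTF-8.
raw = [0xFF, 0x61, 0x62, 0x63].pack("C*")
candidate = raw.dup.force_encoding("UTF-8")
was_valid = candidate.valid_encoding?

# String#scrub substitutes U+FFFD for invalid bytes by default,
# similar in spirit to the replacement behaviour described above.
repaired = candidate.scrub
```

The round trip from string to bytes and back is exact. In the other direction the invalid byte does not raise an error but is silently replaced, so the original bytes cannot be recovered from the string.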
[jira] [Commented] (AVRO-1533) permit promotions between string and bytes
[ https://issues.apache.org/jira/browse/AVRO-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050759#comment-14050759 ]

graham sanderson commented on AVRO-1533:

I had a quick look at the patch, and there are a few uses of s.getBytes() and new String(byte[]). Since the intention seems to be to assume that bytes are interchangeable with UTF-8 encoded strings, the charset should probably be explicit, as it is in Utf8, no?
[jira] [Commented] (AVRO-695) Cycle Reference Support
[ https://issues.apache.org/jira/browse/AVRO-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050789#comment-14050789 ]

Doug Cutting commented on AVRO-695:

For some reason I cannot apply the patch file you've generated, so it's hard for me to analyze it in detail. What tool are you using to generate this patch?

I'd prefer we use an explicit type rather than overload string for this purpose: a union with a record like:

{code}
{"type":"record","name":"org.apache.avro.CircularRef", "fields":[{"name":"ref", "type":"int"}]}
{code}

We don't ever have to create or define such a record. Rather, a CustomEncoding can be used to directly resolve such references at read time. (If circular references are not enabled then a GenericRecord would be read.) No string prefixing, etc. would then be required.

Rather than modifying ReflectData to support this, might we instead create a subclass of ReflectData that supports circular references?

Non-string map key support in reflection should be addressed in a separate issue.

Cycle Reference Support

Key: AVRO-695
URL: https://issues.apache.org/jira/browse/AVRO-695
Project: Avro
Issue Type: New Feature
Components: spec
Affects Versions: 1.7.6
Reporter: Moustapha Cherri
Attachments: avro-1.4.1-cycle.patch.gz, avro-1.4.1-cycle.patch.gz, avro_circular_references.zip, avro_circular_refs_2014_06_14.zip, circular_refs_and_nonstring_map_keys_2014_06_25.zip
Original Estimate: 672h
Remaining Estimate: 672h

This is a proposed implementation to add cycle reference support to Avro. It basically introduces a new type named Cycle. A Cycle contains a string representing the path to the other reference. For example, suppose we have an object of type Message that has a member named previous, also of type Message, and we have this hierarchy:

message
  previous : message2
message2
  previous : message2

When serializing, the cycle path for message2.previous will be previous.

The implementation depends on ANTLR to evaluate those cycles at read time to resolve them. I used ANTLR 3.2. This dependency is not mandated; I just used ANTLR to speed things up. I kept the code generated by ANTLR in this implementation, though this should not be the case, as it should be generated during the build. I only updated the Java code. I have not written full unit tests, but the avrotest.Main class can be used as a preliminary test.

Please do not hesitate to contact me for further clarification if this seems interesting.

Best regards,
Moustapha Cherri
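The general idea of writing an integer reference instead of a repeated object, and resolving it at read time, can be sketched in a few lines. This is only an analogy using plain hashes. The {"ref" => id} marker stands in for the CircularRef record suggested in the comment, and both function names are hypothetical; nothing here uses Avro's CustomEncoding or ReflectData.

```ruby
# Hypothetical encoder: the first visit writes the fields, later visits
# write a {"ref" => id} marker instead of recursing forever.
def encode_with_refs(obj, ids = {}.compare_by_identity)
  return { "ref" => ids[obj] } if ids.key?(obj)
  ids[obj] = ids.size
  encoded = {}
  obj.each do |key, value|
    encoded[key] = value.is_a?(Hash) ? encode_with_refs(value, ids) : value
  end
  encoded
end

# Hypothetical decoder: rebuilds objects in the same order and swaps each
# marker for the object that was registered under that id.
def decode_with_refs(encoded, seen = [])
  return seen[encoded["ref"]] if encoded.keys == ["ref"]
  obj = {}
  seen << obj
  encoded.each do |key, value|
    obj[key] = value.is_a?(Hash) ? decode_with_refs(value, seen) : value
  end
  obj
end

parent = { "name" => "parent" }
child = { "name" => "child", "parent" => parent }
parent["child"] = child   # parent -> child -> parent forms a cycle

encoded = encode_with_refs(parent)
decoded = decode_with_refs(encoded)
```

Ids are assigned in visiting order on both sides, so no path strings or prefixes are needed. One limitation of this toy version is that a genuine record whose only field is named ref would be mistaken for a marker, which is the kind of ambiguity a dedicated record type in a union avoids.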
[jira] [Updated] (AVRO-680) Allow for non-string keys
[ https://issues.apache.org/jira/browse/AVRO-680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sachin Goyal updated AVRO-680:
Affects Version/s: 1.7.7, 1.7.6
Status: Patch Available (was: Open)

For a map with non-string keys in Java such as:
{code}
Map<EmployeeId, EmployeeInfo>
{code}
the map is converted to an array using the following call in *GenericDatumWriter.java*:
{code}
map.entrySet()
{code}
The corresponding schema change is done in *ReflectData.java*.

Diff created using _diff -ru_
Patch can be applied using _patch -i non_string_map_keys.patch_
Unit tests included.

Allow for non-string keys

Key: AVRO-680
URL: https://issues.apache.org/jira/browse/AVRO-680
Project: Avro
Issue Type: Improvement
Affects Versions: 1.7.6, 1.7.7
Reporter: Jeremy Hanna
Attachments: non_string_map_keys.zip

Based on an email thread back in April, Doug Cutting proposed a possible solution for having non-string keys:

Stu Hood wrote:
I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.

A map of Foo has the same binary format as an array of records, each with a string field and a Foo field. So an application can use an array schema similar to this to represent map-like structures with, e.g., non-string keys.

Perhaps we could establish standard properties that indicate that a given array of records should be represented in a map-like way if possible? E.g.:

{"type": "array", "isMap": true, "items": {"type": "record", ...}}

Doug
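The map-as-array-of-records idea in this issue can be illustrated with ordinary collections. This is a minimal sketch of the representation only, not of the Java patch: the field names key and value and the sample data are assumptions.

```ruby
# Illustrative map whose keys are integers, which an Avro map cannot hold.
employees = { 101 => "Ada", 102 => "Grace" }

# Map -> array of records; "key" and "value" are assumed field names.
as_records = employees.map { |id, name| { "key" => id, "value" => name } }

# Array of records -> map again on the reading side.
rebuilt = Hash[as_records.map { |rec| [rec["key"], rec["value"]] }]
```

Because the array form carries the key as an ordinary record field, the key can have any schema, and a reader that understands the convention can turn the array back into a map.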
[jira] [Updated] (AVRO-680) Allow for non-string keys
[ https://issues.apache.org/jira/browse/AVRO-680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sachin Goyal updated AVRO-680:
Attachment: non_string_map_keys.zip
[jira] [Commented] (AVRO-695) Cycle Reference Support
[ https://issues.apache.org/jira/browse/AVRO-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051012#comment-14051012 ]

Sachin Goyal commented on AVRO-695:

Thanks Doug. I created the patch using *diff -r* and it can be applied using *patch -i patch-file*.

For non-string map-keys, I have submitted a separate patch at https://issues.apache.org/jira/browse/AVRO-680

Could you please explain your comment regarding circular references in more detail? I will change my patch accordingly. For example, here is a circular reference schema generated using Avro:

{code:javascript}
{
  "type" : "record",
  "name" : "SimpleParent",
  "namespace" : "org.apache.avro.generic",
  "fields" : [
    { "name" : "parentName", "type" : [ "null", "string" ], "default" : null },
    { "name" : "child",
      "type" : [ "null", {
        "type" : "record",
        "name" : "SimpleChild",
        "fields" : [
          { "name" : "childName", "type" : [ "null", "string" ], "default" : null },
          { "name" : "parent", "type" : [ "null", "SimpleParent" ], "default" : null }
        ]
      }, "string" ],
      "default" : null }
  ]
}
{code}

The current code converts it to the following:

{code:javascript}
{
  "type" : "record",
  "name" : "SimpleParent",
  "namespace" : "org.apache.avro.generic",
  "fields" : [
    { "name" : "__crefId", "type" : "string" },
    { "name" : "parentName", "type" : [ "null", "string" ], "default" : null },
    { "name" : "child",
      "type" : [ "null", {
        "type" : "record",
        "name" : "SimpleChild",
        "fields" : [
          { "name" : "__crefId", "type" : "string" },
          { "name" : "childName", "type" : [ "null", "string" ], "default" : null },
          { "name" : "parent", "type" : [ "null", "SimpleParent", "string" ], "default" : null }
        ],
        "circularRefIdPrefix" : "__crefId"
      }, "string" ],
      "default" : null }
  ],
  "circularRefIdPrefix" : "__crefId"
}
{code}

Can you please apply your comments above to this example? It will help me understand how it would differ from the above solution.

As per my understanding, circular references can occur in any record-type element. So the CustomEncoder approach would need to write the element sometimes as a record and sometimes as a string/int. Can a CustomEncoder pass control to the regular Avro Encoder to write a record normally?