[ 
https://issues.apache.org/jira/browse/AVRO-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027733#comment-15027733
 ] 

Thiruvalluvan M. G. commented on AVRO-1636:
-------------------------------------------

Avro C++ streams JSON objects in. That is, it reads the next token only when the 
client is ready to consume it. The consequence is that the order of fields in 
the JSON object must exactly match that in the schema. An alternative 
approach would have been to read the entire JSON object into memory and then 
interpret it. The downside of that approach is that a large object will take 
up a lot of memory.
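
To make the two approaches concrete, here is a minimal, self-contained sketch. It is not the Avro implementation: FlatJsonReader, decodeStreaming and decodeBuffered are hypothetical names invented for illustration, SimpleRecord is a hand-written stand-in for the generated struct, and the toy reader only handles flat objects with integer values. It should still show why the streaming style is sensitive to field order while the buffering style is not.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the struct that avrogencpp would generate.
struct SimpleRecord { int A; int B; };

// Toy cursor over a flat JSON object with integer values (illustration only).
class FlatJsonReader {
public:
    explicit FlatJsonReader(const std::string& text) : s_(text), pos_(0) {}

    // Consumes the given punctuation character or throws.
    void expect(char c) {
        skipSpace();
        if (pos_ >= s_.size() || s_[pos_] != c)
            throw std::runtime_error(std::string("Expected '") + c + "'");
        ++pos_;
    }

    // Consumes the character if it is next; reports whether it did.
    bool tryConsume(char c) {
        skipSpace();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // Reads the next  "name" : int  pair from the input.
    std::pair<std::string, int> nextField() {
        expect('"');
        std::size_t end = s_.find('"', pos_);
        if (end == std::string::npos)
            throw std::runtime_error("Unterminated field name");
        std::string name = s_.substr(pos_, end - pos_);
        pos_ = end + 1;
        expect(':');
        std::size_t used = 0;
        int value = std::stoi(s_.substr(pos_), &used);  // skips leading spaces
        pos_ += used;
        assert(pos_ <= s_.size());
        return std::make_pair(name, value);
    }

private:
    void skipSpace() {
        while (pos_ < s_.size() &&
               std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    std::string s_;
    std::size_t pos_;
};

// Streaming style: each field is read only when requested, so the next
// name in the input must be the one the schema expects.
int readExpected(FlatJsonReader& r, const std::string& expected) {
    std::pair<std::string, int> f = r.nextField();
    if (f.first != expected) throw std::runtime_error("Incorrect field");
    return f.second;
}

SimpleRecord decodeStreaming(const std::string& json) {
    FlatJsonReader r(json);
    SimpleRecord v;
    r.expect('{');
    v.A = readExpected(r, "A");
    r.expect(',');
    v.B = readExpected(r, "B");
    r.expect('}');
    return v;
}

// Buffering style: the whole object is held in memory first, then fields
// are looked up by name, so input order does not matter.
SimpleRecord decodeBuffered(const std::string& json) {
    FlatJsonReader r(json);
    std::map<std::string, int> fields;
    r.expect('{');
    if (!r.tryConsume('}')) {
        do {
            fields.insert(r.nextField());
        } while (r.tryConsume(','));
        r.expect('}');
    }
    SimpleRecord v;
    v.A = fields.at("A");  // throws std::out_of_range if the field is missing
    v.B = fields.at("B");
    return v;
}
```

In this sketch decodeStreaming needs only a cursor position, whereas decodeBuffered holds every field of the object in the map before producing the record. That is the memory cost described above, and it grows with the size of the object.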

We made a trade-off in favor of conserving memory. The Avro decoder can read 
JSON streams written by the Avro encoder, but if the order of fields is 
altered afterwards, decoding may fail. 

This limitation is common to at least the C++ and Java implementations. As 
mentioned in the [Avro Specification|https://avro.apache.org/docs/1.7.7/spec.html], the 
JSON encoding exists as an aid to developers to help debug issues and is not 
meant to be used in production.

> C++ JsonDecoder expects json object to be ordered
> -------------------------------------------------
>
>                 Key: AVRO-1636
>                 URL: https://issues.apache.org/jira/browse/AVRO-1636
>             Project: Avro
>          Issue Type: Bug
>          Components: c++
>    Affects Versions: 1.7.7
>            Reporter: Mann Du
>
> I am using Shafquat Rahman's original post about this problem, reported on the 
> Avro user mailing list last May, for the description. (Thiru provided a fix 
> for the exact problem for Java in Oct. 2011 with AVRO-895.)
> I have been experimenting with avro in C++ (version 1.7.5) and ran into an 
> issue with the json decoder which expects ordered json objects. The problem I 
> am seeing appears similar to this post I found for an older avro java library:
> http://search-hadoop.com/m/7WG37aVaBd/v=plain
> I have a simple record:
> {
>     "name" : "SimpleRecord",
>     "type" : "record",
>     "fields" :[ 
>         { "name" : "A", "type" : "int"},
>         { "name" : "B", "type" : "int"}
>     ]
> }
> I generate the C++ header using avrogencpp. The generated code has a 
> codec_traits specialization for SimpleRecord that fixes the field order for 
> the JsonEncoder and JsonDecoder.
> ...snip...
> namespace avro {
> template<> struct codec_traits<SimpleRecord> {
>     static void encode(Encoder& e, const SimpleRecord& v) {
>         avro::encode(e, v.A);
>         avro::encode(e, v.B);
>     }
>     static void decode(Decoder& d, SimpleRecord& v) {
>         avro::decode(d, v.A);
>         avro::decode(d, v.B);
>     }
> };
> ...snip...
> The JsonDecoder successfully decodes json objects of the form {"A" : 1, "B" : 
> 2} into SimpleRecord. But if I try to decode {"B" : 2, "A" : 1} it throws 
> 'avro::Exception' with "Incorrect field" from impl/parsing/JsonCodec.cc:182 
> in the following method:
> JsonDecoderHandler(JsonParser& p) : in_(p) { }
>     size_t handle(const Symbol& s) {
>         switch (s.kind()) {
>         case Symbol::sRecordStart:
>             expectToken(in_, JsonParser::tkObjectStart);
>             break;
>         case Symbol::sRecordEnd:
>             expectToken(in_, JsonParser::tkObjectEnd);
>             break;
>         case Symbol::sField:
>             expectToken(in_, JsonParser::tkString);
>             if (s.extra<string>() != in_.stringValue()) {
>                 throw Exception("Incorrect field");
>             }
>             break;
>         default:
>             break;
>         }
>         return 0;
>     }
> The stack shows that avro::decode(d, v.A) is the call that eventually causes 
> the exception.
> According to the json spec the fields in a json object are unordered. ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
