Yes, thanks for this. I ran into the same problem with the C library, so I'm 
glad to see a recommendation and some links.



From: Scott Reynolds <[email protected]>
Date: Friday, May 21, 2021 at 9:32 AM
To: [email protected] <[email protected]>
Subject: Re:
This often confuses people using Avro's JSON decoding. JSON carries almost no 
type information, so when decoding JSON data for a union schema type, Avro's 
JSON decoder *requires* a JSON object naming the chosen type. I have updated 
your code to demonstrate this.

Here is an explanation from Doug: 
http://mail-archives.apache.org/mod_mbox/avro-user/201412.mbox/%3CCALEq1Z-sKNT-fBpMhAa%3DGTjLq5wuKf5mAuvLYos4Ba17hUi%2Bfw%40mail.gmail.com%3E

and here is this information in the spec: 
http://avro.apache.org/docs/current/spec.html#json_encoding

#+BEGIN_SRC c++
      std::string schema;
      schema += "{";
      schema += "   \"name\" : \"simple\", ";
      schema += "   \"type\" : \"record\", ";
      schema += "   \"fields\" : [ { \"name\" : \"last\", \"type\" : [ \"null\", \"string\"] } ] ";
      schema += "}";

      std::string value;
      value += "{";
      value += "   \"last\" : {\"string\": \"dog\" }";
      value += "}";

      std::istringstream schemass(schema);
      std::istringstream valuess(value);

      avro::ValidSchema cpxSchema;
      avro::compileJsonSchema(schemass, cpxSchema);

      std::unique_ptr<avro::InputStream> json_is = avro::istreamInputStream(valuess);

      /* JSON decoder */
      avro::DecoderPtr json_decoder = avro::jsonDecoder(cpxSchema);
      avro::GenericDatum *datum = new avro::GenericDatum(cpxSchema);

      try
      {
         /* Decode JSON to Avro datum */
         json_decoder->init(*json_is);
         avro::decode(*json_decoder, *datum);
      }
      catch(const avro::Exception &_e)
      {
          // Not expected with the wrapped value above. The unwrapped form
          // ("last" : "dog") throws: Incorrect token in the stream.
          // Expected: Object start, found String
      }
#+END_SRC
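
To make the wrapping rule concrete, here is a minimal sketch of how a value of 
the union ["null", "string"] is written in Avro's JSON encoding: the null 
branch is a bare null, and the string branch is a one-entry object keyed by the 
type name. The helper `encodeNullableString` is hypothetical and not part of 
the Avro C++ API; it assumes C++17 (for std::optional) and only escapes quotes 
and backslashes, not control characters.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not part of Avro): renders a value of the union
// ["null", "string"] in the JSON form that Avro's JSON decoder expects.
std::string encodeNullableString(const std::optional<std::string>& value) {
    if (!value) {
        // The null branch is written as a bare JSON null, with no wrapper.
        return "null";
    }
    // Any non-null branch is a JSON object with a single entry whose
    // key is the branch's type name, here "string".
    std::string out = "{\"string\": \"";
    for (char c : *value) {
        // Minimal escaping only: quotes and backslashes.
        if (c == '"' || c == '\\') {
            out += '\\';
        }
        out += c;
    }
    out += "\"}";
    return out;
}
```

With this sketch, `encodeNullableString(std::string("dog"))` gives the text 
used for the "last" field in the code above, and 
`encodeNullableString(std::nullopt)` gives a plain null.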

On Fri, May 21, 2021 at 12:36 AM svend frolund <[email protected]> wrote:
Hello,

I cannot seem to get Avro union types to work properly in the C++ codebase that 
I pulled from your GitHub repo a couple of weeks ago. I want to specify that an 
object attribute can be either null or a string, in order to capture some notion 
of optional attributes in my JSON data. However, when decoding data that 
actually has a string value for the "optional" attribute in question, I get the 
following exception: "Incorrect token in the stream. Expected: Object start, 
found String". Here is a small program that replicates the issue:

      std::string schema;
      schema += "{";
      schema += "   \"name\" : \"simple\", ";
      schema += "   \"type\" : \"record\", ";
      schema += "   \"fields\" : [ { \"name\" : \"last\", \"type\" : [ \"null\", \"string\"] } ] ";
      schema += "}";

      std::string value;
      value += "{";
      value += "   \"last\" : \"dog\" ";
      value += "}";

      std::istringstream schemass(schema);
      std::istringstream valuess(value);

      avro::ValidSchema cpxSchema;
      avro::compileJsonSchema(schemass, cpxSchema);

      std::unique_ptr<avro::InputStream> json_is = avro::istreamInputStream(valuess);

      /* JSON decoder */
      avro::DecoderPtr json_decoder = avro::jsonDecoder(cpxSchema);
      avro::GenericDatum *datum = new avro::GenericDatum(cpxSchema);

      try
      {
         /* Decode JSON to Avro datum */
         json_decoder->init(*json_is);
         avro::decode(*json_decoder, *datum);
      }
      catch(const avro::Exception &_e)
      {
          // throws Incorrect token in the stream. Expected: Object start,
          // found String
      }

Do I need to configure the system in a particular way for this to work, or does 
the current implementation simply not support these types of unions?

I sincerely hope someone can help!

All the best,

   Svend
