This is a common point of confusion with Avro and JSON decoding. JSON carries almost no type information, so when decoding a value for a union schema type, Avro's JSON decoder *requires* a JSON object naming the chosen branch, for example {"string": "dog"} rather than a bare "dog" (a null value is the exception and is written as a plain JSON null). I have updated your code to demonstrate this.
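Separately from the updated program (which follows the links below), here is a minimal sketch of the wrapping rule itself for the union ["null", "string"]. It uses only the standard library; `jsonQuote`, `encodeNullableString` and `recordWithLast` are hypothetical helper names for illustration and are not part of the Avro C++ API.

```cpp
#include <string>

// Hypothetical helper: quote a string as a JSON string literal.
// Only '"' and '\\' are escaped here; control characters are not handled.
std::string jsonQuote(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') {
            out += '\\';
        }
        out += c;
    }
    out += '"';
    return out;
}

// Hypothetical helper: render a value of the union ["null", "string"]
// in Avro's JSON encoding. A null pointer selects the "null" branch,
// which is written as a bare JSON null. Any other value is wrapped in
// a one-member object whose key names the branch type.
std::string encodeNullableString(const std::string* value) {
    if (value == nullptr) {
        return "null";
    }
    return "{\"string\": " + jsonQuote(*value) + "}";
}

// Hypothetical helper: build the JSON text for the "simple" record,
// whose only field "last" has type ["null", "string"].
std::string recordWithLast(const std::string* last) {
    return "{\"last\": " + encodeNullableString(last) + "}";
}
```

Passing a pointer to the string dog to `recordWithLast` yields {"last": {"string": "dog"}}, which is the shape the JSON decoder expects, while passing `nullptr` yields {"last": null}.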
Here is an explanation from Doug:
http://mail-archives.apache.org/mod_mbox/avro-user/201412.mbox/%3CCALEq1Z-sKNT-fBpMhAa%3DGTjLq5wuKf5mAuvLYos4Ba17hUi%2Bfw%40mail.gmail.com%3E

and here is this information in the spec:
http://avro.apache.org/docs/current/spec.html#json_encoding

#+BEGIN_SRC c++
std::string schema;
schema += "{";
schema += "  \"name\" : \"simple\", ";
schema += "  \"type\" : \"record\", ";
schema += "  \"fields\" : [ { \"name\" : \"last\", \"type\" : [ \"null\", \"string\"] } ] ";
schema += "}";

std::string value;
value += "{";
// The union value is wrapped in an object that names the chosen branch.
value += "  \"last\" : {\"string\": \"dog\" }";
value += "}";

std::istringstream schemass(schema);
std::istringstream valuess(value);

avro::ValidSchema cpxSchema;
avro::compileJsonSchema(schemass, cpxSchema);

std::unique_ptr<avro::InputStream> json_is = avro::istreamInputStream(valuess);

/* JSON decoder */
avro::DecoderPtr json_decoder = avro::jsonDecoder(cpxSchema);
// Stack-allocated datum, so nothing is leaked.
avro::GenericDatum datum(cpxSchema);

try
{
    /* Decode JSON to Avro datum */
    json_decoder->init(*json_is);
    avro::decode(*json_decoder, datum);
}
catch(const avro::Exception &_e)
{
    // The unwrapped value "dog" threw: Incorrect token in the stream.
    // Expected: Object start, found String.
    // With the wrapped value above, decoding should succeed.
}
#+END_SRC

On Fri, May 21, 2021 at 12:36 AM svend frolund <svendf2...@gmail.com> wrote:

> Hello,
>
> I cannot seem to get avro union types to work properly in the c++ codebase
> that I pulled from your github repo a couple of weeks ago. I want to
> specify that an object attribute can be either null or a string in order to
> capture some notion of optional attributes in my json data. However, when
> decoding data that actually has a string value for the "optional" attribute
> in question, I get the following exception: "Incorrect token in the stream.
> Expected: Object start, found String".
> Here is a small program that replicates the issue:
>
> std::string schema;
> schema += "{";
> schema += "  \"name\" : \"simple\", ";
> schema += "  \"type\" : \"record\", ";
> schema += "  \"fields\" : [ { \"name\" : \"last\", \"type\" : [ \"null\", \"string\"] } ] ";
> schema += "}";
>
> std::string value;
> value += "{";
> value += "  \"last\" : \"dog\" ";
> value += "}";
>
> std::istringstream schemass(schema);
> std::istringstream valuess(value);
>
> avro::ValidSchema cpxSchema;
> avro::compileJsonSchema(schemass, cpxSchema);
>
> std::unique_ptr<avro::InputStream> json_is = avro::istreamInputStream(valuess);
>
> /* JSON decoder */
> avro::DecoderPtr json_decoder = avro::jsonDecoder(cpxSchema);
> avro::GenericDatum *datum = new avro::GenericDatum(cpxSchema);
>
> try
> {
>     /* Decode JSON to Avro datum */
>     json_decoder->init(*json_is);
>     avro::decode(*json_decoder, *datum);
> }
> catch(const avro::Exception &_e)
> {
>     // throws Incorrect token in the stream. Expected: Object start, found String
> }
>
> Do I need to configure the system in a particular way for this to work, or
> does the current implementation simply not support these types of unions.
>
> I sincerely hope someone can help!
>
> All the best,
>
> Svend