Thorsten Hake created AVRO-2702:
-----------------------------------

             Summary: Avro ResolvingGrammarGenerator does not honor "avro.java.string" property in inner record schemas
                 Key: AVRO-2702
                 URL: https://issues.apache.org/jira/browse/AVRO-2702
             Project: Apache Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.9.1
            Reporter: Thorsten Hake
         Attachments: Bar.kt

The type property "avro.java.string" specifies which CharSequence 
implementation represents a string type in Java. The Java code generated by the 
Avro Maven plugin sets this property if the <stringType> option is set to 
"String".
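
For illustration, a minimal sketch of such a plugin configuration. The version matches the affected release; the source directory is an assumption and is not taken from the build that produced this report.
{code:xml}
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.9.1</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <!-- Assumed location of the .avsc schema files -->
        <sourceDirectory>${project.basedir}/src/main/avro</sourceDirectory>
        <!-- Generate java.lang.String fields and tag string types
             with "avro.java.string": "String" -->
        <stringType>String</stringType>
      </configuration>
    </execution>
  </executions>
</plugin>
{code}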

However, the ResolvingGrammarGenerator, which helps match the writer schema to 
the reader schema, does not honor this property for inner records within 
unions. The strings of the inner record are deserialized to 
org.apache.avro.util.Utf8 instead of java.lang.String, while string fields 
belonging to the outer record are correctly deserialized to java.lang.String.

If you deserialize an Avro record whose schema has an inner record within a 
union type, using the Java code generated by the Maven plugin (with 
<stringType> set to "String"), you get a ClassCastException:
{noformat}
Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String
{noformat}
This is because the generated Java code expects the strings to be deserialized 
according to the "avro.java.string" property, which does not happen for the 
inner record.
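
The failure mode can be sketched without Avro on the classpath. The classes below are hypothetical stand-ins, not Avro code: Utf8Like plays the role of org.apache.avro.util.Utf8 (a CharSequence that is not a String), and Bar mimics a generated class under the assumption, suggested by the exception above, that it casts the decoded value to java.lang.String.
{code:java}
final class CastSketch {
    // Hypothetical stand-in for org.apache.avro.util.Utf8:
    // a CharSequence implementation that is not a java.lang.String.
    static final class Utf8Like implements CharSequence {
        private final String text;
        Utf8Like(String text) { this.text = text; }
        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return text.subSequence(start, end);
        }
        @Override public String toString() { return text; }
    }

    // Hypothetical stand-in for the generated class of the inner record "bar".
    static final class Bar {
        private String str;

        // Assumption: the generated code casts the decoded value to String.
        void put(int field, Object value) {
            if (field != 0) {
                throw new IndexOutOfBoundsException("Invalid index: " + field);
            }
            str = (String) value;
        }

        String getStr() { return str; }
    }

    // Returns true if storing the value raises a ClassCastException.
    static boolean putFails(Object value) {
        try {
            new Bar().put(0, value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A java.lang.String is stored without error.
        System.out.println(putFails("abc"));
        // A non-String CharSequence fails the cast, as in the reported stack trace.
        System.out.println(putFails(new Utf8Like("abc")));
    }
}
{code}
The sketch only shows why a Utf8 value breaks code that expects java.lang.String. It does not reproduce the resolver behavior itself; for that, see the attached Bar.kt.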

I would expect the deserializer to treat the strings in the inner record the 
same as the strings in the outer record.

Example:

writer schema:
{code:json}
{
  "type": "record",
  "name": "foo",
  "fields": [
    {
      "name": "k",
      "type": "string"
    },
    {
      "name": "value",
      "type": [
        "null",
        {
          "type": "record",
          "name": "bar",
          "fields": [
            {
              "name": "str",
              "type": "string"
            }
          ]
        }
      ]
    }
  ]
}
{code}
reader schema:
{code:json}
{
  "type": "record",
  "name": "foo",
  "fields": [
    {
      "name": "k",
      "type": {
        "type": "string",
        "avro.java.string": "String"
      }
    },
    {
      "name": "value",
      "type": [
        "null",
        {
          "type": "record",
          "name": "bar",
          "fields": [
            {
              "name": "str",
              "type": {
                "type": "string",
                "avro.java.string": "String"
              }
            }
          ]
        }
      ]
    }
  ]
}
{code}
You'll find example Kotlin code demonstrating the problem in the attached 
Bar.kt.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
