[ https://issues.apache.org/jira/browse/AVRO-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian J. updated AVRO-2438:
-------------------------------
Description:
A schema fragment like this:
{code:java}
{
  "name": "ownerId",
  "type": [
    "null",
    {
      "type": "string",
      "java-class": "java.net.URI"
    }
  ],
  "default": null
}{code}
can be deserialized without problems into a generated POJO with
{code:java}
@org.apache.avro.specific.AvroGenerated
public class MyAvroDataObject extends
org.apache.avro.specific.SpecificRecordBase implements
org.apache.avro.specific.SpecificRecord {
...
@Deprecated public java.net.URI ownerId;{code}
because
{{GenericDatumReader.readString(Object, Schema, Decoder)}} looks up the schema in the
{{stringClassCache}}, which contains
{code:java}
{"type":"string","java-class":"java.net.URI"}=class java.net.URI{code}
and then uses the {{URI}} class itself to rehydrate the value via {{newInstanceFromString}}.
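The read path described above can be sketched without Avro on the classpath. This is a simplified stand-in, not Avro's actual code: {{StringClassSketch}}, {{STRING_CLASS_CACHE}}, {{resolve}} and {{rehydrate}} are hypothetical names, the cache is keyed by class name rather than by schema, and the reflective constructor call only approximates what {{newInstanceFromString}} presumably does.

```java
import java.lang.reflect.Constructor;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class StringClassSketch {
    // Hypothetical stand-in for the cache: "java-class" name -> resolved class.
    private static final Map<String, Class<?>> STRING_CLASS_CACHE = new ConcurrentHashMap<>();

    // Resolve the class named by the "java-class" property, caching the result.
    static Class<?> resolve(String javaClassName) {
        return STRING_CLASS_CACHE.computeIfAbsent(javaClassName, name -> {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("Unknown java-class: " + name, e);
            }
        });
    }

    // Rebuild the value from its text form via a public single-String constructor.
    static Object rehydrate(String javaClassName, String text) {
        try {
            Constructor<?> ctor = resolve(javaClassName).getConstructor(String.class);
            return ctor.newInstance(text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + javaClassName + " from a String", e);
        }
    }

    public static void main(String[] args) {
        URI ownerId = (URI) rehydrate("java.net.URI", "urn:example:owner:42");
        System.out.println(ownerId.getClass().getName() + " -> " + ownerId);
    }
}
```

The point of the sketch is that the value class is chosen from the schema property at read time, so the field ends up holding a real {{URI}} and not a string type.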
{{deepCopy}}, on the other hand, only considers the schema type of the field:
in {{org.apache.avro.generic.GenericData.deepCopy(Schema, T)}}, the {{STRING}} case
turns the {{URI}} value into an {{org.apache.avro.util.Utf8}},
which then causes a {{ClassCastException}}:
{noformat}
java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.net.URI
    at com.example.MyAvroDataObject.put(MyAvroDataObject.java:104)
    at org.apache.avro.generic.GenericData.setField(GenericData.java:660)
    at org.apache.avro.generic.GenericData.setField(GenericData.java:677)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1082)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1102)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1080){noformat}
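The failure mechanism can be illustrated with plain JDK classes. This is only a model of the behaviour described above, not a reproduction against Avro: {{DeepCopyFailureSketch}}, its nested {{Record}} and {{copyString}} are hypothetical, and {{StringBuilder}} merely stands in for {{Utf8}} as "some other text type".

```java
import java.net.URI;

final class DeepCopyFailureSketch {
    // Stand-in for the generated record: put() casts to the declared field type.
    static final class Record {
        URI ownerId;

        void put(Object value) {
            ownerId = (URI) value;
        }
    }

    // Stand-in for a STRING case that ignores "java-class": anything that is
    // not a String is copied through its text form into a different type.
    static Object copyString(Object value) {
        if (value instanceof String) {
            return value;
        }
        return new StringBuilder(value.toString());
    }

    // Returns true if copying the URI into a fresh record fails with a ClassCastException.
    static boolean copyFails(URI original) {
        Record copy = new Record();
        try {
            copy.put(copyString(original));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("copy fails: " + copyFails(URI.create("urn:example:owner:42")));
    }
}
```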
The following dirty hack seems to avoid the issue, but it does not consult the
{{stringClassCache}}, which it should:
{code:java}
case STRING:
  // Strings are immutable
  if (value instanceof String) {
    return (T)value;
  }
  // Dirty Harry 9 3/4 start
  // URIs are immutable and are probably modeled as a URI itself
  // TODO: Check with stringClassCache & the schema
  else if ((value instanceof URI)
      && URI.class.getName().equals(schema.getProp("java-class"))) {
    return (T)value;
  }
  // Dirty Harry 9 3/4 end
  // Some CharSequence subclasses are mutable, so we still need to make
  // a copy
  else if (value instanceof Utf8) {
    // Utf8 copy constructor is more efficient than converting
    // to string and then back to Utf8
    return (T)new Utf8((Utf8)value);
  }
  return (T)new Utf8(value.toString());
{code}
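A less URI-specific variant of the hack might compare the value's class with whatever the schema names in {{java-class}}. The sketch below runs without Avro and is not a proposed patch: {{StringCopySketch}} and {{copyString}} are hypothetical, a plain {{Map}} stands in for the schema properties, {{StringBuilder}} stands in for {{Utf8}}, and it assumes that classes named via {{java-class}} are immutable value types (true for {{URI}}, not guaranteed in general).

```java
import java.net.URI;
import java.util.Collections;
import java.util.Map;

final class StringCopySketch {
    static final String JAVA_CLASS_PROP = "java-class";

    // schemaProps is a hypothetical stand-in for the string schema's properties.
    static Object copyString(Map<String, String> schemaProps, Object value) {
        // Strings are immutable, so the same instance can be reused.
        if (value instanceof String) {
            return value;
        }
        // Assumption: a value whose class matches "java-class" is immutable
        // and may be shared between the original and the copy.
        String javaClass = schemaProps.get(JAVA_CLASS_PROP);
        if (javaClass != null && javaClass.equals(value.getClass().getName())) {
            return value;
        }
        // Fallback: copy through the text form (Avro would create a Utf8 here).
        return new StringBuilder(value.toString());
    }

    public static void main(String[] args) {
        Map<String, String> props = Collections.singletonMap(JAVA_CLASS_PROP, "java.net.URI");
        URI ownerId = URI.create("urn:example:owner:42");
        Object copy = copyString(props, ownerId);
        System.out.println("copy keeps the URI type: " + (copy instanceof URI));
    }
}
```

If the target class were mutable, rebuilding it from {{toString()}} through its {{String}} constructor would be the safer choice than sharing the instance.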
I also tried Avro {{1.10-SNAPSHOT}} of 2019-06-20 (commit
2d3b1fe7efd865639663ba785877182e7e038c45) because of
[https://github.com/apache/avro/pull/329], but the issue remains.
> SpecificData.deepCopy() cannot be used with URI fields
> ------------------------------------------------------
>
> Key: AVRO-2438
> URL: https://issues.apache.org/jira/browse/AVRO-2438
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.9.0, 1.8.2
> Reporter: Sebastian J.
> Priority: Major
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)