[
https://issues.apache.org/jira/browse/AVRO-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinge Dai updated AVRO-2831:
----------------------------
Description:
When the Avro Java library deserializes data and resolves fields in a union, it
uses the writer schema if the reader field type is the same as the writer's, but
it ignores the properties and logical type.
{code:java}
private static boolean unionEquiv(Schema write, Schema read, Map<SeenPair, Boolean> seen) {
  final Schema.Type wt = write.getType();
  // Only the schema type is compared here; logical type and properties are not.
  if (wt != read.getType()) {
    return false;
  }
  // ...
{code}
However, I have hit two issues:
1. (https://issues.apache.org/jira/browse/AVRO-2702) If the producer is not
Java, the string fields of the inner record are deserialized as
org.apache.avro.util.Utf8 instead of java.lang.String, which then fails with:
{code:java}
Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String
{code}
2. I am using an in-house C# library on the producer side, and it builds the
Date field schema as an INT type without a logicalType. This is a bug on my
side. However, Avro 1.8.2 handles it by using the reader schema, whereas 1.9.2
fails with:
{code:java}
Caused by: java.lang.IllegalArgumentException: Parameters cannot be null! Parameter values:[-129, "int", null, org.apache.avro.data.TimeConversions$DateConversion@d3449b1]
{code}
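The first issue can be reproduced in miniature without Avro. The sketch below is
only an illustration: StringBuilder stands in for org.apache.avro.util.Utf8
(both implement CharSequence but are not java.lang.String), and the class and
method names are hypothetical, not part of the Avro API.

```java
// Stand-alone sketch: StringBuilder plays the role of org.apache.avro.util.Utf8.
// Both are CharSequence implementations that are not java.lang.String.
class CharSequenceCastSketch {

    // Unsafe: mirrors code that casts a decoded field straight to String.
    static String unsafeRead(Object decoded) {
        return (String) decoded; // throws ClassCastException for a non-String value
    }

    // Safer: accept any CharSequence and convert it explicitly.
    static String safeRead(Object decoded) {
        if (decoded == null) {
            return null;
        }
        if (decoded instanceof CharSequence) {
            return decoded.toString();
        }
        throw new IllegalArgumentException("Not a character sequence: " + decoded.getClass());
    }

    public static void main(String[] args) {
        Object decoded = new StringBuilder("hello"); // stand-in for a Utf8 value
        System.out.println(safeRead(decoded));
        try {
            unsafeRead(decoded);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

Converting through CharSequence is a consumer-side workaround only; it does not
change which schema the resolver picks.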
The reason is that, although the consumer side generates the class properly and
marks the field for conversion to a logical type, convertToLogicalType fails
because the writer schema is used and its logical type is null:
{code:java}
public static Object convertToLogicalType(Object datum, Schema schema, LogicalType type, Conversion<?> conversion) {
  if (datum == null) {
    return null;
  }
  // A null logical type (as on the writer schema here) triggers the exception.
  if (schema == null || type == null || conversion == null) {
    throw new IllegalArgumentException("Parameters cannot be null! Parameter values:"
        + Arrays.deepToString(new Object[] { datum, schema, type, conversion }));
  }
  // ...
{code}
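To make the effect of the guard above concrete, here is a minimal stand-alone
sketch, assuming the "date" logical type means days since the Unix epoch as in
the Avro spec. The class, the method names, and the "prefer the reader's
logical type" rule are hypothetical illustrations, not the library's code.

```java
import java.time.LocalDate;

class LogicalTypeFallbackSketch {

    // Mirrors the guard quoted above: a missing logical type is rejected.
    static LocalDate convertDate(Integer datum, String logicalType) {
        if (datum == null) {
            return null;
        }
        if (logicalType == null) {
            throw new IllegalArgumentException("Parameters cannot be null! datum=" + datum);
        }
        if (!"date".equals(logicalType)) {
            throw new IllegalArgumentException("Unsupported logical type: " + logicalType);
        }
        return LocalDate.ofEpochDay(datum.longValue());
    }

    // Hypothetical rule: use the reader's logical type when it declares one.
    static String effectiveLogicalType(String writerLogicalType, String readerLogicalType) {
        return readerLogicalType != null ? readerLogicalType : writerLogicalType;
    }

    public static void main(String[] args) {
        String writer = null;   // writer schema: plain int, no logical type
        String reader = "date"; // reader schema: int annotated as date
        System.out.println(convertDate(-129, effectiveLogicalType(writer, reader))); // 1969-08-25
        try {
            convertDate(-129, writer); // what happens when the writer schema is used
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```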
I am wondering whether we should also check the logical type and properties in
"unionEquiv" so that the reader schema is used when they differ. This would
comply with the Avro spec.
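As a rough sketch of what such a stricter check might look like, the example
below compares type, logical type, and properties on a simplified stand-in for
a schema. MiniSchema, typeOnlyEquiv, and strictEquiv are hypothetical names
used for illustration; this is not the actual Avro Resolver code or a proposed
patch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;

class StrictUnionEquivSketch {
    enum Type { INT, STRING }

    // Hypothetical, simplified stand-in for org.apache.avro.Schema.
    static final class MiniSchema {
        final Type type;
        final String logicalType;        // null when no logical type is declared
        final Map<String, String> props; // custom schema properties

        MiniSchema(Type type, String logicalType, Map<String, String> props) {
            this.type = type;
            this.logicalType = logicalType;
            this.props = props;
        }
    }

    // Behaviour described in this issue: only the type is compared.
    static boolean typeOnlyEquiv(MiniSchema write, MiniSchema read) {
        return write.type == read.type;
    }

    // Suggested direction: also require matching logical type and properties.
    static boolean strictEquiv(MiniSchema write, MiniSchema read) {
        return write.type == read.type
            && Objects.equals(write.logicalType, read.logicalType)
            && write.props.equals(read.props);
    }

    public static void main(String[] args) {
        MiniSchema writerInt = new MiniSchema(Type.INT, null, Collections.<String, String>emptyMap());
        MiniSchema readerDate = new MiniSchema(Type.INT, "date", Collections.<String, String>emptyMap());
        System.out.println(typeOnlyEquiv(writerInt, readerDate)); // true
        System.out.println(strictEquiv(writerInt, readerDate));   // false
    }
}
```

With the strict variant, the int/date pair from the second issue is no longer
treated as equivalent, so a resolver built on it would fall back to the reader
schema. The same applies to a string field whose reader schema carries a
property such as avro.java.string that the writer's lacks.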
> Resolver cannot find reader/writer schema difference of fields in Union
> -----------------------------------------------------------------------
>
> Key: AVRO-2831
> URL: https://issues.apache.org/jira/browse/AVRO-2831
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.9.0, 1.9.1, 1.9.2
> Reporter: Jinge Dai
> Priority: Blocker
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)