[ 
https://issues.apache.org/jira/browse/AVRO-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinge Dai updated AVRO-2831:
----------------------------
    Description: 
When the Avro Java library deserializes data and resolves fields in a union, it 
uses the writer schema whenever the reader field type is the same as the 
writer's, but it ignores the properties and the logical type:
{code:java}
private static boolean unionEquiv(Schema write, Schema read, Map<SeenPair, Boolean> seen) {
  final Schema.Type wt = write.getType();
  if (wt != read.getType()) {
    return false;
  }
{code}
However, I have hit two issues:
 # https://issues.apache.org/jira/browse/AVRO-2702 If the producer is not Java, 
the string fields of the inner record are deserialized to 
org.apache.avro.util.Utf8, which then fails with:

{code:java}
Caused by: java.lang.ClassCastException: class org.apache.avro.util.Utf8 cannot 
be cast to class java.lang.String{code}
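A minimal, dependency-free sketch of why the cast fails. Note that Utf8Like below is a hypothetical stand-in for org.apache.avro.util.Utf8, which implements CharSequence but is not a java.lang.String:
{code:java}
public class CastSketch {
  // Hypothetical stand-in for org.apache.avro.util.Utf8: a CharSequence that is
  // not a java.lang.String, so a (String) cast on it fails at runtime.
  static final class Utf8Like implements CharSequence {
    private final String value;
    Utf8Like(String value) { this.value = value; }
    @Override public int length() { return value.length(); }
    @Override public char charAt(int i) { return value.charAt(i); }
    @Override public CharSequence subSequence(int a, int b) { return value.subSequence(a, b); }
    @Override public String toString() { return value; }
  }

  public static void main(String[] args) {
    Object datum = new Utf8Like("hello");  // what a non-Java writer produces
    try {
      String s = (String) datum;           // what the generated getter expects
      System.out.println(s);
    } catch (ClassCastException e) {
      // toString() is safe whether the runtime value is Utf8 or String.
      System.out.println("ClassCastException, use toString() instead: "
          + ((CharSequence) datum).toString());
    }
  }
}
{code}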
 

2. I am using an in-house C# library on the producer side, and it builds the 
Date field schema as an INT type without a logicalType; this is a bug on my 
side. However, Avro 1.8.2 is able to handle it by using the reader schema, 
whereas 1.9.2 fails with:
{code:java}
Caused by: java.lang.IllegalArgumentException: Parameters cannot be null! 
Parameter values:[-129, "int", null, 
org.apache.avro.data.TimeConversions$DateConversion@d3449b1]
{code}
The reason is that, although the consumer side generates the class properly and 
marks the field for conversion to the logical type, convertToLogicalType fails 
because the writer schema is used and its logical type is null:
{code:java}
public static Object convertToLogicalType(Object datum, Schema schema, LogicalType type, Conversion<?> conversion) {
  if (datum == null) {
    return null;
  }

  if (schema == null || type == null || conversion == null) {
    throw new IllegalArgumentException("Parameters cannot be null! Parameter values:"
        + Arrays.deepToString(new Object[] { datum, schema, type, conversion }));
  }
{code}
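The failing path can be reproduced without Avro. The sketch below is a simplified stand-alone model of that guard (not the real library code): when the writer schema carries no logicalType, the type argument arrives as null and the guard throws before any conversion is attempted.
{code:java}
import java.util.Arrays;

public class GuardSketch {
  // Simplified model of convertToLogicalType's null guard; parameter names
  // mirror the snippet above, but the types are plain Objects.
  static Object convertToLogicalType(Object datum, Object schema, Object type, Object conversion) {
    if (datum == null) {
      return null;
    }
    if (schema == null || type == null || conversion == null) {
      throw new IllegalArgumentException("Parameters cannot be null! Parameter values:"
          + Arrays.deepToString(new Object[] { datum, schema, type, conversion }));
    }
    return datum; // the real code would apply the conversion here
  }

  public static void main(String[] args) {
    try {
      // The writer schema is a plain "int" with no logical type, so type == null.
      convertToLogicalType(-129, "int", null, "DateConversion");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}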
 

I am wondering if we should check the type and the properties in "unionEquiv" 
so that we can fall back to the reader schema when there is a difference. This 
complies with the Avro spec.
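The check I have in mind could look roughly like the sketch below. MiniSchema is a hypothetical minimal model of a schema, not a patch against the real Resolver: two union branches would count as equivalent only when both the raw type and the logical type agree.
{code:java}
import java.util.Objects;

public class UnionEquivSketch {
  // Hypothetical minimal schema: a raw type name plus an optional logical type.
  record MiniSchema(String type, String logicalType) {}

  // Current behavior as reported: only the raw types are compared.
  static boolean typeOnlyEquiv(MiniSchema write, MiniSchema read) {
    return write.type().equals(read.type());
  }

  // Proposed stricter check: the logical type (and, by extension, other
  // properties) must match before the writer schema is reused.
  static boolean strictEquiv(MiniSchema write, MiniSchema read) {
    return write.type().equals(read.type())
        && Objects.equals(write.logicalType(), read.logicalType());
  }

  public static void main(String[] args) {
    MiniSchema writer = new MiniSchema("int", null);   // C# producer: plain int
    MiniSchema reader = new MiniSchema("int", "date"); // Java consumer: date
    System.out.println(typeOnlyEquiv(writer, reader)); // true  -> writer schema wins today
    System.out.println(strictEquiv(writer, reader));   // false -> reader schema would be used
  }
}
{code}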

 

> Resolver cannot find reader/writer schema difference of fields in Union
> -----------------------------------------------------------------------
>
>                 Key: AVRO-2831
>                 URL: https://issues.apache.org/jira/browse/AVRO-2831
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.0, 1.9.1, 1.9.2
>            Reporter: Jinge Dai
>            Priority: Blocker
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
