Yi Hu created AVRO-4203:
---------------------------
Summary: inconsistent implicit int->long conversion for non-null
and nullable long
Key: AVRO-4203
URL: https://issues.apache.org/jira/browse/AVRO-4203
Project: Apache Avro
Issue Type: Bug
Components: java
Reporter: Yi Hu
When using GenericDatumWriter to write an int to a long field, inconsistent
implicit int->long conversion is observed depending on the type is nullable or
not:
* write int to target schema int64, succeed
* write int to target schema nullable(int64), using storage_write_api, succeed
* write int to target schema nullable(int64), using file_load avro, exception
An example that reproduces the issue:
{code:java}
public class AvroTest {
private static final String SCHEMA_JSON = "{\n" +
" \"type\": \"record\",\n" +
" \"name\": \"UserEvent\",\n" +
" \"namespace\": \"com.example.avro\",\n" +
" \"fields\": [\n" +
" {\"name\": \"userId\", \"type\": \"string\"},\n" +
" {\"name\": \"nonNullLong\", \"type\": \"long\"},\n" +
" {\"name\": \"nullableLong\", \"type\": [\"null\", \"long\"],
\"default\": null}\n" +
" ]\n" +
"}"; public static void main(String[] argv) throws AvroRuntimeException,
IOException {
Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
GenericRecord eventWithTimestamp = new GenericData.Record(schema);
eventWithTimestamp.put("userId", "user-123");
eventWithTimestamp.put("nonNullLong", 123);
eventWithTimestamp.put("nullableLong", 123); // fail
File avroOutputFile = new File("user-events.avro");
DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
try (DataFileWriter<GenericRecord> dataFileWriter = new
DataFileWriter<>(datumWriter)) {
dataFileWriter.create(schema, avroOutputFile);
dataFileWriter.append(eventWithTimestamp);
}
}
} {code}
Stack trace:
{code:java}
Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null","long"]: 123 (field=nullableLong)
at
org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:247)
at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:234)
at
org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:145)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:95)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:82)
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:314)
... 1 more
Suppressed: org.apache.avro.UnresolvedUnionException: Not in union
["null","long"]: 123
Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null","long"]: 123 (field=nullableLong)
at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:910)
at
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:307)
at
org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:157)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:95)
at
org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:245)
... 6 more {code}
This affects Apache Beam: https://github.com/apache/beam/issues/36735
--
This message was sent by Atlassian Jira
(v8.20.10#820010)