James Clarke created AVRO-3101:
----------------------------------

             Summary: Primitive number values are silently truncated in Java 
GenericDatumWriter
                 Key: AVRO-3101
                 URL: https://issues.apache.org/jira/browse/AVRO-3101
             Project: Apache Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.10.2, 1.10.1, 1.10.0
            Reporter: James Clarke


Primitive Java numeric values are silently truncated in GenericDatumWriter.

Previously (1.9.2), setting a double value on a Type.LONG field caused a 
ClassCastException when the datum was serialized.

Changes in AVRO-2070 cause a double value to be silently truncated.
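
To illustrate the difference, here is a sketch in plain Java with no Avro 
dependency. It assumes the old write path behaved like a direct cast to Long 
and the new one like a Number.longValue() conversion; strictLong and 
lenientLong are hypothetical stand-ins, not Avro methods.
{code:java}
public class NarrowingDemo {
    // Hypothetical stand-in for a strict write path: a direct cast to Long.
    static long strictLong(Object datum) {
        return (Long) datum;
    }

    // Hypothetical stand-in for a lenient write path: any Number is accepted
    // and narrowed, which drops the fractional part of a double.
    static long lenientLong(Object datum) {
        return ((Number) datum).longValue();
    }

    public static void main(String[] args) {
        Object datum = 456.4; // autoboxed to java.lang.Double

        System.out.println(lenientLong(datum)); // prints 456

        try {
            strictLong(datum);
        } catch (ClassCastException e) {
            // A Double cannot be cast to Long, so the strict path fails loudly.
            System.out.println("ClassCastException");
        }
    }
}
{code}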

I don't know whether this is a bug or expected behavior, since in 1.9.2 (and 
much earlier versions) Type.INT values were already silently truncated, while 
other numeric types were not.

My use case involves users generating data that conforms to a dynamically 
generated Avro schema. The current change provides type safety (for downstream 
consumers) but does not maintain data integrity. From my POV it would be better 
for users to get an explicit ClassCastException than to have corrupt data 
introduced silently.
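
Until the intended behavior is settled, a caller could guard values before 
putting them into a record. The sketch below is plain Java and not part of 
Avro; toLongExact is a hypothetical helper that accepts only values that 
convert to a long without loss and throws otherwise.
{code:java}
import java.math.BigDecimal;

public class LosslessLong {
    // Hypothetical helper: returns the value as a long only if nothing is lost.
    static long toLongExact(Number value) {
        if (value instanceof Long || value instanceof Integer
                || value instanceof Short || value instanceof Byte) {
            return value.longValue(); // widening only, always exact
        }
        if (value instanceof Double || value instanceof Float) {
            double d = value.doubleValue();
            if (Double.isNaN(d) || Double.isInfinite(d)) {
                throw new ArithmeticException("not a finite number: " + d);
            }
            // longValueExact() throws ArithmeticException if there is a
            // nonzero fractional part or the value does not fit in a long.
            return new BigDecimal(d).longValueExact();
        }
        throw new IllegalArgumentException(
                "unsupported numeric type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(toLongExact(456.0)); // prints 456
        try {
            toLongExact(456.4);
        } catch (ArithmeticException e) {
            System.out.println("rejected 456.4");
        }
    }
}
{code}
With such a check, record.put("long", toLongExact(userValue)) would fail fast 
on 456.4 instead of storing 456.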

Example test case (Kotlin), which throws ClassCastException in 1.9.2 and prints 
456 (not 456.4, the value set) in 1.10.2.
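{code:java}
import java.io.ByteArrayOutputStream
import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.avro.generic.GenericDatumReader
import org.apache.avro.generic.GenericDatumWriter
import org.apache.avro.generic.GenericRecord
import org.apache.avro.io.DatumReader
import org.apache.avro.io.DatumWriter
import org.apache.avro.io.DecoderFactory
import org.apache.avro.io.EncoderFactory
import org.junit.Test // JUnit 4 assumed; adjust for the test framework in use

@Test
fun testWritingDoubleToLong() {
    // Record schema with a single LONG field named "long"
    val longType = Schema.create(Schema.Type.LONG)
    val field = Schema.Field("long", longType)
    val fields = listOf(field)
    val schema = Schema.createRecord("test", "doc", "", false, fields)
    val record: GenericRecord = GenericData.Record(schema)
    // Put a double into the LONG field
    record.put("long", 456.4)

    // Serialize the record with GenericDatumWriter
    val stream = ByteArrayOutputStream()
    val datumWriter: DatumWriter<GenericRecord> = GenericDatumWriter(schema)
    val encoder = EncoderFactory.get().binaryEncoder(stream, null)
    datumWriter.write(record, encoder)
    encoder.flush()

    // Read it back and print the stored value
    val decoder = DecoderFactory.get().binaryDecoder(stream.toByteArray(), null)
    val datumReader: DatumReader<GenericRecord> = GenericDatumReader(schema)
    val output = datumReader.read(null, decoder)
    println(output["long"])
}
{code}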
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)