[ https://issues.apache.org/jira/browse/AVRO-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313052#comment-17313052 ]
Ryan Skraba commented on AVRO-3101:
-----------------------------------
Yes, this is weird -- serialization succeeds, but
{{GenericData.get().validate(schema, record)}} returns false. Calling
{{validate}} before writing might be a workaround for the moment for your
specific case.

I'd also expect {{output.equals(record)}} to be false, which isn't *ideal*
for a round-trip serialization, but it looks like the current behaviour is to
throw the ClassCastException!

Should this be a configurable option in GenericDatumWriter?
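
To make the two behaviours under discussion concrete, here is a minimal plain-JDK sketch. It does not use Avro at all: {{strictLong}} and {{lenientLong}} are hypothetical stand-ins for the behaviour reported for 1.9.2 and 1.10.x respectively (an assumption about the effect, not a copy of the writer's code), and {{exactLong}} is a hypothetical guard that a caller could apply before putting a value into a record.

```java
class LongCoercionSketch {

    // Strict path (hypothetical stand-in for the 1.9.2 behaviour):
    // a plain cast, which rejects anything that is not a Long.
    static long strictLong(Object datum) {
        return (Long) datum; // throws ClassCastException for a Double
    }

    // Lenient path (hypothetical stand-in for the 1.10.x behaviour):
    // accepts any Number and silently drops the fractional part.
    static long lenientLong(Object datum) {
        return ((Number) datum).longValue();
    }

    // Hypothetical guard: accept only boxed integral values that fit a long
    // without loss, and fail loudly for everything else.
    static long exactLong(Object datum) {
        if (datum instanceof Long || datum instanceof Integer) {
            return ((Number) datum).longValue();
        }
        throw new ClassCastException("Not an exact long: " + datum);
    }

    public static void main(String[] args) {
        Object value = 456.4; // autoboxed to java.lang.Double

        System.out.println(lenientLong(value)); // prints 456

        try {
            strictLong(value);
        } catch (ClassCastException e) {
            System.out.println("strict path rejected the value");
        }

        try {
            exactLong(value);
        } catch (ClassCastException e) {
            System.out.println("guard rejected the value");
        }
    }
}
```

A guard like this keeps the failure explicit on the caller's side regardless of which writer version is in use, at the cost of checking each numeric value before it is set.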
> Primitive number values are silently truncated in Java GenericDatumWriter
> -------------------------------------------------------------------------
>
> Key: AVRO-3101
> URL: https://issues.apache.org/jira/browse/AVRO-3101
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.10.0, 1.10.1, 1.10.2
> Reporter: James Clarke
> Priority: Major
>
> Primitive Java numeric types are silently truncated in GenericDatumWriter.
> Previously (1.9.2), setting a double value on a Type.LONG field caused a
> ClassCastException when serializing the datum.
> Changes in AVRO-2070 cause the double value to be silently truncated instead.
> I don't know if this is a bug or expected behavior, since in 1.9.2 (and much
> earlier versions) Type.INT values were silently truncated but other numeric
> types were not.
> My use case involves users generating data that conforms to a dynamically
> generated Avro schema. The current change provides type safety (for
> downstream consumers) but does not maintain data integrity. From my POV, it
> would be better to fail explicitly with a ClassCastException than
> to introduce corrupt data.
> Example test case, which throws ClassCastException in 1.9.2 and prints 456
> (not the value set) in 1.10.2:
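> {code:java}
> // Kotlin. The function belongs inside a JUnit test class; @Test comes from JUnit.
> import java.io.ByteArrayOutputStream
> import org.apache.avro.Schema
> import org.apache.avro.generic.GenericData
> import org.apache.avro.generic.GenericDatumReader
> import org.apache.avro.generic.GenericDatumWriter
> import org.apache.avro.generic.GenericRecord
> import org.apache.avro.io.DatumReader
> import org.apache.avro.io.DatumWriter
> import org.apache.avro.io.DecoderFactory
> import org.apache.avro.io.EncoderFactory
>
> @Test
> fun testWritingDoubleToLong() {
>     // Record schema with a single LONG field named "long".
>     val longType = Schema.create(Schema.Type.LONG)
>     val field = Schema.Field("long", longType)
>     val fields = listOf(field)
>     val schema = Schema.createRecord("test", "doc", "", false, fields)
>     // Put a double into the LONG field.
>     val record: GenericRecord = GenericData.Record(schema)
>     record.put("long", 456.4)
>     // Serialize the record with the binary encoder.
>     val stream = ByteArrayOutputStream()
>     val datumWriter: DatumWriter<GenericRecord> = GenericDatumWriter(schema)
>     val encoder = EncoderFactory.get().binaryEncoder(stream, null)
>     datumWriter.write(record, encoder)
>     encoder.flush()
>     // Read the record back and print the round-tripped value.
>     val decoder = DecoderFactory.get().binaryDecoder(stream.toByteArray(), null)
>     val datumReader: DatumReader<GenericRecord> = GenericDatumReader(schema)
>     val output = datumReader.read(null, decoder)
>     println(output["long"])
> }
> {code}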
>