Hello,
I have a question about a specific design decision in Avro. I have a schema
with a "logicalType=decimal" field. When using SpecificDatumReader to
deserialize it, the field is correctly deserialized as a BigDecimal,
because the set of converters contains Conversions.DecimalConversion.
When converting with a GenericDatumReader, the set of converters is empty.
Is there a reason why it's empty? Why are the default converters not
included?
When reading the field with a GenericDatumReader, the converters set is
provided by the GenericData object. So if I provide a GenericData with the
converters, it will get converted to BigDecimal. If GenericData is not
provided in the GenericDatumReader's constructor, I will get a ByteBuffer.
Sample code below:
car.avsc:
{
  "type": "record",
  "namespace": "com.schwarzenegger",
  "name": "Car",
  "fields": [
    { "name": "model", "type": "string" },
    { "name": "engineCode", "type": { "type": "bytes",
        "logicalType": "decimal", "precision": 8, "scale": 0 } }
  ]
}
Test.java:
import java.io.File;
import java.io.FileInputStream;
import org.apache.avro.Conversions;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.io.DecoderFactory;

public class Test {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse( ... );
        System.out.println("LogicalType = " +
                schema.getField("engineCode").schema().getLogicalType());
        GenericData.Record record = null;
        try (FileInputStream payloadInputStream =
                new FileInputStream(new File("C:\\Temp\\car.txt"))) {
            // Register the decimal conversion explicitly on a fresh GenericData.
            GenericData genericData = new GenericData();
            genericData.addLogicalTypeConversion(
                    new Conversions.DecimalConversion());
            GenericDatumReader<GenericData.Record> genericReader =
                    new GenericDatumReader<>(schema, schema, genericData);
            record = genericReader.read(null,
                    DecoderFactory.get().binaryDecoder(payloadInputStream, null));
        }
        Object engineCode =
                record.get(record.getSchema().getField("engineCode").pos());
        System.out.println(String.format("code = %s, class = %s",
                engineCode, engineCode.getClass().getName()));
    }
}
This will print out:
LogicalType = org.apache.avro.LogicalTypes$Decimal@f8
code = 12345678, class = java.math.BigDecimal
If I remove the GenericData part and create the GenericDatumReader without
it, then I get back a ByteBuffer because the conversions set is empty.
Is there a reason for that? If not, can we modify Avro and add the
default conversions to the GenericDatumReader?
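For context, here is a minimal sketch of what has to be done by hand when only the ByteBuffer comes back. It assumes the encoding described in the Avro specification for the decimal logical type (the bytes hold the big-endian two's-complement unscaled value, and the scale comes from the schema). The class name DecimalBytes and the helper fromBytes are hypothetical names used only for illustration; they are not part of Avro.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalBytes {
    // Hypothetical helper, not an Avro API: decode the raw bytes of a decimal
    // field, assuming big-endian two's-complement unscaled value plus schema scale.
    static BigDecimal fromBytes(ByteBuffer value, int scale) {
        byte[] bytes = new byte[value.remaining()];
        // duplicate() so the caller's buffer position is left untouched
        value.duplicate().get(bytes);
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        // Simulate the ByteBuffer a reader without conversions would return
        // for engineCode = 12345678 with scale 0.
        ByteBuffer raw = ByteBuffer.wrap(BigInteger.valueOf(12345678).toByteArray());
        BigDecimal engineCode = fromBytes(raw, 0);
        System.out.println(String.format("code = %s, class = %s",
                engineCode, engineCode.getClass().getName()));
    }
}
```

Every caller that uses a GenericDatumReader without a configured GenericData would need something like this for each decimal field, which is why having the default conversions registered would be convenient.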
This is the relevant code; read() is from GenericDatumReader and
getConversionFor() is from GenericData:
****************
protected Object read(Object old, Schema expected, ResolvingDecoder in)
        throws IOException {
    Object datum = this.readWithoutConversion(old, expected, in);
    LogicalType logicalType = expected.getLogicalType();
    if (logicalType != null) {
        Conversion<?> conversion =
                this.getData().getConversionFor(logicalType);
        if (conversion != null) {
            return this.convert(datum, expected, logicalType, conversion);
        }
    }
    return datum;
}

public Conversion<Object> getConversionFor(LogicalType logicalType) {
    return logicalType == null ? null :
            (Conversion)this.conversions.get(logicalType.getName());
}
****************
Thanks,
Csaba