[ https://issues.apache.org/jira/browse/AVRO-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985049#comment-16985049 ]
Ryan Skraba commented on AVRO-2636:
-----------------------------------
Hello! It appears that every *internal* use of {{getDefaultValue(field)}}
inside Avro makes a deep copy before using the result as a datum. This isn't
well documented as a requirement on this method, so I'm looking at the impact
of either (1) not caching the default value for bytes, or (2) returning a
read-only {{bb.duplicate()}} directly from this method.
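
To make option (2) concrete, here is a minimal sketch using only {{java.nio}}, not Avro itself. The class {{CachedBufferSketch}} and its methods are hypothetical stand-ins for a cache of default values, and I'm assuming that {{asReadOnlyBuffer()}} is an acceptable way to get a read-only duplicate. It shows that a view has its own position, so draining it leaves the cached buffer untouched, while draining the shared instance exhausts it for the next caller.

```java
import java.nio.ByteBuffer;

public class CachedBufferSketch {
    // Hypothetical stand-in for a cached default value.
    private final ByteBuffer cached;

    public CachedBufferSketch(byte[] bytes) {
        this.cached = ByteBuffer.wrap(bytes.clone());
    }

    /** Hands out the cached instance itself, so every caller shares one position. */
    public ByteBuffer shared() {
        return cached;
    }

    /** Hands out a read-only view with an independent position and limit. */
    public ByteBuffer readOnlyView() {
        return cached.asReadOnlyBuffer();
    }

    /** Drains the buffer with a relative get and returns how many bytes were read. */
    public static int consume(ByteBuffer bb) {
        byte[] out = new byte[bb.remaining()];
        bb.get(out); // advances bb's position to its limit
        return out.length;
    }

    public static void main(String[] args) {
        CachedBufferSketch cache = new CachedBufferSketch(new byte[] {1, 2, 3});
        System.out.println(consume(cache.readOnlyView())); // 3
        System.out.println(consume(cache.readOnlyView())); // 3, the cache was not touched
        System.out.println(consume(cache.shared()));       // 3
        System.out.println(consume(cache.shared()));       // 0, the shared buffer is exhausted
    }
}
```

A read-only view also rejects {{put}} calls, so callers cannot change the cached content through it.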
I just made a PR on the related AVRO-2592 to avoid modifying the ByteBuffer
during decimal conversion, which would correct your unit test, but not the
underlying problem. If it is a problem... I'm thinking that a {{byte[]}}
default value for fixed types will always be mutable. At the very minimum,
there should be a strong advisory/warning in the javadoc.
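
As a rough illustration of the {{byte[]}} concern, the sketch below uses plain Java arrays rather than any Avro type. {{FixedDefaultSketch}} is a hypothetical name, and whether a copy per call is acceptable inside Avro is an open question. An array has no read-only view, so the only way to protect a cached one is to hand out a copy.

```java
import java.util.Arrays;

public class FixedDefaultSketch {
    // Hypothetical stand-in for a cached fixed-type default.
    private final byte[] cached;

    public FixedDefaultSketch(byte[] bytes) {
        this.cached = bytes.clone();
    }

    /** Hands out the cached array itself; any caller can overwrite it. */
    public byte[] shared() {
        return cached;
    }

    /** Hands out a fresh copy, leaving the cached array out of reach. */
    public byte[] copy() {
        return cached.clone();
    }

    public static void main(String[] args) {
        FixedDefaultSketch cache = new FixedDefaultSketch(new byte[] {0, 0, 42});
        cache.copy()[2] = 7;   // changes only the copy
        System.out.println(Arrays.toString(cache.shared())); // [0, 0, 42]
        cache.shared()[2] = 7; // changes the cached value for everyone
        System.out.println(Arrays.toString(cache.shared())); // [0, 0, 7]
    }
}
```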
As a quick question, are you using the {{getDefaultValue}} method in a
different way?
> GenericData defaultValueCache caches mutable ByteBuffers
> --------------------------------------------------------
>
> Key: AVRO-2636
> URL: https://issues.apache.org/jira/browse/AVRO-2636
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Reporter: Valentin Nikotin
> Priority: Minor
>
> It appears that for a default value of the Bytes type (and of the Decimal
> logical type when it uses the underlying Bytes type), the value returned by
> getDefaultValue is cached. This leads to bugs when the same value is read more
> than once (for example, when it is converted with DecimalConversion). In a
> single-threaded environment, a workaround would be to reset the ByteBuffer
> after each read, but in a concurrent environment we should not cache mutable
> objects.
>
> {code:java}
> @Test(expected=NumberFormatException.class)
> public void testReuse() {
>   Conversions.DecimalConversion decimalConversion =
>       new Conversions.DecimalConversion();
>   LogicalType logicalDecimal =
>       LogicalTypes.decimal(38, 9);
>   ByteBuffer defaultValue =
>       decimalConversion.toBytes(
>           BigDecimal.valueOf(42L).setScale(9),
>           null,
>           logicalDecimal);
>   Schema schema = SchemaBuilder
>       .record("test")
>       .fields()
>       .name("decimal")
>       .type(logicalDecimal.addToSchema(SchemaBuilder.builder().bytesType()))
>       .withDefault(defaultValue)
>       .endRecord();
>   BigDecimal firstRead = decimalConversion
>       .fromBytes(
>           (ByteBuffer)
>               GenericData.get().getDefaultValue(schema.getField("decimal")),
>           null,
>           logicalDecimal);
>   BigDecimal secondRead = decimalConversion
>       .fromBytes(
>           (ByteBuffer)
>               GenericData.get().getDefaultValue(schema.getField("decimal")),
>           null,
>           logicalDecimal);
> }
> {code}