[ https://issues.apache.org/jira/browse/AVRO-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308554#comment-15308554 ]
Daniel Halperin commented on AVRO-607:
--------------------------------------
Thanks Ryan.
So if I'm reading this right, the only way to avoid this bug is to serialize
access to SpecificData.getSchema. Right now, we're hitting this via the
following code path:
{code}
--- Thread: Thread[pool-1-thread-15,5,main] State: RUNNABLE stack: ---
java.util.WeakHashMap.get(WeakHashMap.java:403)
org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:187)
org.apache.avro.reflect.ReflectData.isRecord(ReflectData.java:168)
org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:612)
org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:265)
org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:601)
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)
org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:175)
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
org.apache.beam.sdk.coders.AvroCoder.encode(AvroCoder.java:264)
{code}
We have many threads calling AvroCoder#encode concurrently (each on its own
DatumWriter instance), and every call eventually reaches the static schema
cache through ReflectData. Is there a standard way to pre-populate the cache
to avoid this problem?
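For concreteness, here is a minimal sketch of what "serialize access" would mean for a static WeakHashMap-backed cache. It is not Avro's actual code: the class name SchemaCacheSketch, the loader parameter, and the use of a String in place of a real Schema are all hypothetical stand-ins for illustration.

```java
import java.lang.reflect.Type;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for a static schema cache; not Avro's implementation. */
final class SchemaCacheSketch {
    // WeakHashMap is not thread-safe, so every access below goes through
    // the class-level lock taken by the synchronized method.
    private static final Map<Type, String> CACHE = new WeakHashMap<>();

    private SchemaCacheSketch() {}

    /**
     * Returns the cached value for the given type, computing and storing it
     * on a miss. The loader (an assumption of this sketch) stands in for
     * whatever derives a schema from a class.
     */
    static synchronized String getSchema(Type type, Function<Type, String> loader) {
        String schema = CACHE.get(type);
        if (schema == null) {
            schema = loader.apply(type);
            CACHE.put(type, schema);
        }
        return schema;
    }
}
```

A caller would use something like SchemaCacheSketch.getSchema(MyRecord.class, t -> ...), where MyRecord is a placeholder. Holding one lock for both the lookup and the insert keeps the map consistent, at the cost of contention when many threads encode at once.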
> SpecificData.getSchema not thread-safe
> --------------------------------------
>
> Key: AVRO-607
> URL: https://issues.apache.org/jira/browse/AVRO-607
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.3.3, 1.8.1
> Reporter: Stephen Tu
> Fix For: 1.8.2
>
> Attachments: AVRO-607.patch
>
>
> SpecificData.getSchema uses a WeakHashMap to cache schemas, but WeakHashMap
> is not thread-safe, and the method itself is not synchronized. Seems like
> this could lead to the data structure getting corrupted.