[
https://issues.apache.org/jira/browse/AVRO-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025699#comment-15025699
]
Ryan Blue commented on AVRO-1760:
---------------------------------
[~mulugeta], thanks for reporting the problem. Nice looking doc to point out
the difference, too!
I had a look at the patch and I don't think we can merge it as-is. While it
certainly does fix the performance problem, it introduces a memory leak because
the cache you're changing is no longer a WeakHashMap. The ConcurrentHashMap
constructor that you use copies the contents of the other map. I think you
intended it to use the other map for internal storage, the way the
synchronizedMap factory method does. So this patch would cause Avro to keep
strong references to fields, and those entries would never be removed.
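To illustrate the difference, here is a minimal sketch using only java.util
classes. The class and method names are made up for this example and are not
taken from the patch or from GenericData. It shows that
Collections.synchronizedMap returns a view backed by the map it is given, while
the ConcurrentHashMap(Map) constructor takes a one-time copy and keeps no link
to the original WeakHashMap.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of Avro.
final class MapCopyDemo {

  // synchronizedMap wraps the backing map, so a later put on the
  // backing WeakHashMap is visible through the wrapper.
  static boolean wrapperSeesLaterPut() {
    Map<Object, String> backing = new WeakHashMap<>();
    Map<Object, String> view = Collections.synchronizedMap(backing);
    Object key = new Object(); // strongly held here, so it is not collected
    backing.put(key, "default");
    return view.containsKey(key);
  }

  // The ConcurrentHashMap(Map) constructor copies the entries present at
  // construction time. A later put on the original map is not visible, and
  // the new map holds its own keys with ordinary strong references.
  static boolean copySeesLaterPut() {
    Map<Object, String> backing = new WeakHashMap<>();
    Map<Object, String> copy = new ConcurrentHashMap<>(backing);
    Object key = new Object();
    backing.put(key, "default");
    return copy.containsKey(key);
  }

  public static void main(String[] args) {
    System.out.println("synchronizedMap view sees later put: " + wrapperSeesLaterPut());
    System.out.println("ConcurrentHashMap copy sees later put: " + copySeesLaterPut());
  }
}
```

Because the copy is an independent ConcurrentHashMap, anything put into it
stays strongly reachable until it is removed explicitly, which is where the
leak would come from.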
Luckily, I think we already have a solution from AVRO-607. That issue also
requires a weak identity concurrent map, and I built one a few months ago that
is waiting for review. If you don't mind, could you review [that
patch|https://github.com/apache/avro/pull/30] and make sure that it fixes this
problem as well? Once we get that in, we can use the same class here.
[~busbey], could you have a look as well?
> Thread scalability problem with the use of SynchronizedMap
> ----------------------------------------------------------
>
> Key: AVRO-1760
> URL: https://issues.apache.org/jira/browse/AVRO-1760
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.7.4, 1.7.5, 1.7.6, 1.7.7, 1.8.0, 1.8.1, 1.9.0
> Reporter: Mulugeta Mammo
> Priority: Critical
> Labels: patch, performance
> Fix For: 1.7.4, 1.7.5, 1.7.6, 1.7.7, 1.7.8, 1.8.0, 1.8.1, 1.9.0
>
> Attachments: AVRO-1760-Issue-and-Patch.pdf, AVRO-1760.patch
>
>
> While running Adam Genomics (which uses Avro) on Apache Spark, we discovered
> that threads (tasks in Spark Context) block in Avro while executing the
> getDefaultValue(Field field) method in
> https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java