[
https://issues.apache.org/jira/browse/AVRO-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610409#comment-13610409
]
Doug Cutting commented on AVRO-1277:
------------------------------------
Thanks for benchmarking this. The improvements look significant and we should
incorporate them.
I doubt the caches need to be static; they could instead be per ReflectData
instance without significantly harming performance. Folks currently use
ReflectData.get() to access a static instance, but I suspect performance would
not be much worse if ReflectData.get() allocated a new ReflectData instance.
For example, ReflectDatumReader(Class) already allocates a new ReflectData
instance. The cost of allocating this instance and its caches once per
DatumReader or DatumWriter lifetime is minimal.
So, to address Holger's concerns, if the performance is no worse, I'd argue we
should:
- change all caches in ReflectData, SpecificData & GenericData to be
non-static and non-thread-safe
- change ReflectData.get() etc. to allocate new instances.
These might be done in a separate, follow-up issue. For now, the new caches
should be non-static.
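
To illustrate the per-instance approach, here is a minimal sketch in plain Java.
It is not Avro code: PerInstanceFieldCache, fieldsOf, misses and SamplePoint are
hypothetical names made up for this example, and the class only stands in for
something like ReflectData. The cache is an ordinary instance field backed by a
plain HashMap, so it is neither static nor thread-safe, and each new instance
starts with an empty cache.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a ReflectData-like class; not Avro's actual code.
class PerInstanceFieldCache {
  // Instance field, not static: every instance owns its own cache. A plain
  // HashMap is enough as long as an instance is confined to a single thread.
  private final Map<Class<?>, Field[]> fieldsByClass = new HashMap<>();
  private int misses = 0;

  // Returns the serializable fields of a class, doing the reflective lookup
  // only on the first request made to this instance.
  Field[] fieldsOf(Class<?> type) {
    Field[] cached = fieldsByClass.get(type);
    if (cached == null) {
      misses++;
      cached = lookupFields(type);
      fieldsByClass.put(type, cached);
    }
    return cached;
  }

  // Number of reflective lookups this instance has performed so far.
  int misses() {
    return misses;
  }

  // The expensive part: walk the class hierarchy and collect the non-static,
  // non-transient fields, making each one accessible.
  private static Field[] lookupFields(Class<?> type) {
    List<Field> result = new ArrayList<>();
    for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
      for (Field f : c.getDeclaredFields()) {
        int mods = f.getModifiers();
        if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
          continue;
        }
        f.setAccessible(true);
        result.add(f);
      }
    }
    return result.toArray(new Field[0]);
  }

  public static void main(String[] args) {
    PerInstanceFieldCache cache = new PerInstanceFieldCache();
    cache.fieldsOf(SamplePoint.class);
    cache.fieldsOf(SamplePoint.class); // served from this instance's cache
    System.out.println("reflective lookups: " + cache.misses());
  }
}

// Hypothetical data class used only to exercise the cache.
class SamplePoint {
  int x = 1;
  int y = 2;
}
```

With this shape, the reflective lookup is paid once per class per instance, so
a reader or writer that keeps one instance for its lifetime still gets the
benefit of caching without any shared static state.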
Using Unsafe on JVMs that support it is a great idea. Would you like to
file a Jira for that?
> Improve performance of reflection-based serializers
> ---------------------------------------------------
>
> Key: AVRO-1277
> URL: https://issues.apache.org/jira/browse/AVRO-1277
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.7.4
> Reporter: Leo Romanoff
> Priority: Minor
> Attachments: AVRO-1277.patch, AVRO-1277-v2.patch
>
>
> I've played a bit with Avro serialization based on reflection, i.e. with the
> classes from the org.apache.avro.generic package.
> It works fine in this mode, but is rather slow compared to frameworks such
> as protostuff or kryo.
> A quick look at the source code showed that a lot of reflection-based
> operations and class lookups are not cached, and such operations are usually
> pretty expensive when executed by a JVM.
> So I changed some of the org.apache.avro.generic classes, introducing caching
> and a few other optimizations. Now it seems to perform much better.