[ 
https://issues.apache.org/jira/browse/AVRO-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610409#comment-13610409
 ] 

Doug Cutting commented on AVRO-1277:
------------------------------------

Thanks for benchmarking this.  The improvements look significant and we should 
incorporate them.

I doubt the caches need to be static; they could instead be per ReflectData 
instance without significantly harming performance.  Folks currently use 
ReflectData.get() to access a static instance, but I suspect performance would 
not be much worse if ReflectData.get() allocated a new ReflectData instance.  
For example, ReflectDatumReader(Class) already allocates a new ReflectData 
instance.  The cost of allocating an instance and its caches once per 
DatumReader or DatumWriter lifetime is minimal.

So, to address Holger's concerns, if the performance is no worse, I'd argue we 
should:
 - change all caches in ReflectData, SpecificData & GenericData to be 
non-static and non-thread-safe
 - change ReflectData.get() etc. to allocate new instances.

These might be done in a separate, follow-up issue.  For now, the new caches 
should be non-static.
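
As a rough illustration of what a non-static, non-thread-safe cache could 
look like, here is a minimal sketch.  It is not the code from the attached 
patches or from Avro itself: PerInstanceFieldCache, getFields(), misses() and 
Point are hypothetical names used only for this example, and the choice to 
skip static, transient and synthetic fields is an assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a ReflectData-like class; not Avro's actual code. */
class PerInstanceFieldCache {
    // Instance field, not static: each instance owns its cache, and the cache
    // becomes garbage together with the instance. A plain HashMap is not
    // thread-safe, so an instance must not be shared between threads.
    private final Map<Class<?>, Field[]> fieldCache = new HashMap<>();

    // Counts how often the reflective lookup actually ran (for illustration).
    private int misses = 0;

    /** Returns the serializable fields of a class, looking them up only once per instance. */
    Field[] getFields(Class<?> type) {
        Field[] cached = fieldCache.get(type);
        if (cached == null) {
            misses++;
            cached = lookupFields(type);
            fieldCache.put(type, cached);
        }
        return cached;
    }

    int misses() {
        return misses;
    }

    // The comparatively expensive reflective walk up the class hierarchy.
    // Assumption for this sketch: static, transient and synthetic fields are skipped.
    private static Field[] lookupFields(Class<?> type) {
        List<Field> result = new ArrayList<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()) {
                    continue;
                }
                result.add(f);
            }
        }
        return result.toArray(new Field[0]);
    }
}

/** Hypothetical sample type used to exercise the cache. */
class Point {
    int x;
    int y;
    static int count; // skipped by the lookup because it is static
}
```

With this shape, a second getFields(Point.class) call on the same instance 
returns the cached array, while a freshly allocated instance pays the lookup 
cost once more.  That one-time cost per instance is what would be paid once 
per DatumReader or DatumWriter.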

Using Unsafe when on a JVM that supports it is a great idea.  Would you like to 
file a Jira for that?
                
> Improve performance of reflection-based serializers
> ---------------------------------------------------
>
>                 Key: AVRO-1277
>                 URL: https://issues.apache.org/jira/browse/AVRO-1277
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Leo Romanoff
>            Priority: Minor
>         Attachments: AVRO-1277.patch, AVRO-1277-v2.patch
>
>
> I've played a bit with Avro serialization based on reflection, i.e. with the 
> classes from the org.apache.avro.generic package.
> It works fine in this mode, but is rather slow compared to frameworks 
> such as protostuff or kryo.
> A quick look at the source code showed that many reflection-based 
> operations and class lookups are not cached, and such operations are usually 
> pretty expensive when executed by a JVM. 
> So I changed some of the org.apache.avro.generic classes, introducing caching 
> and a few other optimizations. Now it seems to perform much better.
