[ https://issues.apache.org/jira/browse/AVRO-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610409#comment-13610409 ]
Doug Cutting commented on AVRO-1277:
------------------------------------

Thanks for benchmarking this. The improvements look significant and we should incorporate them.

I doubt the caches need to be static; they could instead be per ReflectData instance without significantly harming performance. Folks currently use ReflectData.get() to access a static instance, but I suspect performance would not be much worse if ReflectData.get() allocated a new ReflectData instance. For example, ReflectDatumReader(Class) already allocates a new ReflectData instance. The cost of allocating this and its caches per the lifetime of a DatumReader or DatumWriter is minimal.

So, to address Holger's concerns, if the performance is no worse, I'd argue we should:
 - change all caches in ReflectData, SpecificData & GenericData to be non-static and non-thread-safe
 - change ReflectData.get() etc. to allocate new instances.

These might be done in a separate, follow-up issue. For now, the new caches should be non-static.

Using Unsafe when on a JVM that supports it is a great idea. Would you like to file a Jira for that?

> Improve performance of reflection-based serializers
> ---------------------------------------------------
>
>                 Key: AVRO-1277
>                 URL: https://issues.apache.org/jira/browse/AVRO-1277
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Leo Romanoff
>            Priority: Minor
>         Attachments: AVRO-1277.patch, AVRO-1277-v2.patch
>
>
> I've played a bit with Avro serialization based on reflection, i.e. with the
> classes from org.apache.avro.generic package.
> It works fine in this mode, but is rather slow compared to such frameworks
> like protostuff or kryo.
> Quick look at the source code has shown that a lot of reflection-based
> operations and class lookups are not cached. And such operations are usually
> pretty expensive when executed by a JVM.
> So I changed some of org.apache.avro.generic classes, introduced caching and
> a few other optimizations. Now it seems to perform much better.
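
To make the "non-static, non-thread-safe" cache suggestion in the comment above concrete, here is a minimal sketch in Java. It is not Avro code: the class PerInstanceFieldCache, its methods, and the nested Point class are hypothetical names chosen for illustration. The sketch only assumes that the expensive operation being cached is a reflective field lookup, and it keeps the cache in a plain HashMap held by the instance rather than in a static map.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a per-instance reflection cache; not part of Avro. */
public class PerInstanceFieldCache {
    // Instance field, not static: the cache is created and discarded with its owner.
    // Plain HashMap: not thread-safe, so each reader or writer should own one instance.
    private final Map<Class<?>, Map<String, Field>> fieldsByClass = new HashMap<>();

    // Counts how often the slow reflective path was actually taken.
    private int reflectiveLookups = 0;

    /** Returns the named field of the given class, using reflection only on a cache miss. */
    public Field getField(Class<?> type, String name) {
        Map<String, Field> fields = fieldsByClass.get(type);
        if (fields == null) {
            fields = new HashMap<>();
            fieldsByClass.put(type, fields);
        }
        Field field = fields.get(name);
        if (field == null) {
            field = findField(type, name);
            fields.put(name, field);
        }
        return field;
    }

    // Slow path: walk the class hierarchy looking for a declared field.
    private Field findField(Class<?> type, String name) {
        reflectiveLookups++;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(name);
                f.setAccessible(true);
                return f;
            } catch (NoSuchFieldException e) {
                // Not declared here; try the superclass.
            }
        }
        throw new IllegalArgumentException("No field " + name + " in " + type);
    }

    public int reflectiveLookups() {
        return reflectiveLookups;
    }

    /** Sample type used only by this sketch. */
    public static class Point {
        int x = 3;
        int y = 4;
    }

    public static void main(String[] args) throws Exception {
        PerInstanceFieldCache cache = new PerInstanceFieldCache();
        Point p = new Point();
        int sum = 0;
        for (int i = 0; i < 1000; i++) {
            sum += cache.getField(Point.class, "x").getInt(p);
        }
        System.out.println(sum);                       // prints 3000
        System.out.println(cache.reflectiveLookups()); // prints 1
    }
}
```

A second PerInstanceFieldCache starts empty and repeats the lookup once. That repeated lookup is the allocation cost the comment expects to be minimal when paid once per DatumReader or DatumWriter.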