[ https://issues.apache.org/jira/browse/AVRO-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654739#comment-16654739 ]
Raymie Stata edited comment on AVRO-2090 at 10/18/18 7:19 AM:
--------------------------------------------------------------

I've attached my two runs of Perf.java combined into a single file ([^perf-data.txt]). The first four columns of numbers in this file are the results with custom-encoders turned off; the next four columns are the results with custom-encoders on.

For the two SpecificRecord cases: On my machine, FooBarSpecificRecordTestWrite improved 36% (from 3577 ms to 2296 ms), while FooBarSpecificRecordTestRead improved 12% (from 4728 ms to 4130 ms). It's not surprising that the read case improved less: the overhead of accommodating schema migration is high.

I have some ideas on how to improve performance even more, esp. for the read case. That said, a >10% improvement is not bad, and a 36% improvement is quite good, so I suggest we commit this change as-is and save further improvements for future patches.

(Thiru points out that FooBarSpecificRecord is a very small class that probably understates the performance improvements of this patch. In our work at Aqfer, we've seen larger improvements.)
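For anyone who wants to check the percentages against the raw timings, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration and are not part of Perf.java; only the four millisecond figures come from the comment above. It works out to about 35.8% for the write test and about 12.6% for the read test, close to the quoted 36% and 12%.

```java
import java.util.Locale;

// Hypothetical helper, not part of Perf.java: recomputes the quoted
// improvements from the before/after timings given in the comment.
class PerfImprovement {

    // Relative reduction in elapsed time, as a percentage of the "before" time.
    static double percentImprovement(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    public static void main(String[] args) {
        // Timings quoted in the comment (custom encoders off -> on).
        double write = percentImprovement(3577, 2296);
        double read = percentImprovement(4728, 4130);

        System.out.println(String.format(Locale.ROOT,
                "FooBarSpecificRecordTestWrite: %.1f%%", write));
        System.out.println(String.format(Locale.ROOT,
                "FooBarSpecificRecordTestRead:  %.1f%%", read));
    }
}
```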
> Improve encode/decode time for SpecificRecord using code generation
> -------------------------------------------------------------------
>
>                 Key: AVRO-2090
>                 URL: https://issues.apache.org/jira/browse/AVRO-2090
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Raymie Stata
>            Assignee: Raymie Stata
>            Priority: Major
>         Attachments: customcoders.md, perf-data.txt
>
>
> Compared to GenericRecords, SpecificRecords offer type-safety plus the
> performance of traditional getters/setters/instance variables. But these are
> only beneficial to Java code accessing those records. SpecificRecords
> inherit serialization and deserialization code from GenericRecords, which is
> dynamic and thus slow (in fact, benchmarks show that serialization and
> deserialization is _slower_ for SpecificRecord than for GenericRecord).
> This patch extends record.vm to generate custom, higher-performance encoder
> and decoder functions for SpecificRecords. We've run a public benchmark
> showing that the new code reduces serialization time by 2/3 and
> deserialization time by close to 50%.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
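The issue description contrasts inherited, dynamic serialization with generated per-class coders. The sketch below illustrates that contrast in plain Java. It is an illustration only, under loud assumptions: `FooBar`, `FieldType`, `encodeDynamic` and `encodeGenerated` are hypothetical names, the byte layout is a toy format rather than Avro's binary encoding, and none of this is the code that record.vm actually emits. The dynamic path walks a field list at runtime, fetches each value by position as a boxed `Object`, and dispatches on its type. The "generated" path is the straight-line code a template could produce once the fields are known at compile time.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: not Avro's API and not Avro's wire format.
class CoderSketch {

    enum FieldType { INT, STRING }

    // Hypothetical record with two fields.
    static final class FooBar {
        final int id;
        final String name;

        FooBar(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Positional, boxed access, similar in spirit to a generic get(int).
        Object get(int pos) {
            switch (pos) {
                case 0: return id;
                case 1: return name;
                default: throw new IndexOutOfBoundsException("field " + pos);
            }
        }
    }

    // Writes a 4-byte big-endian int (toy format).
    static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }

    // Writes a length-prefixed UTF-8 string (toy format).
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        writeInt(out, bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    // Dynamic path: interprets a field list at runtime for every record.
    static byte[] encodeDynamic(FooBar r, List<FieldType> schema) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int pos = 0; pos < schema.size(); pos++) {
            Object value = r.get(pos); // boxing plus a cast per field
            switch (schema.get(pos)) {
                case INT: writeInt(out, (Integer) value); break;
                case STRING: writeString(out, (String) value); break;
            }
        }
        return out.toByteArray();
    }

    // "Generated" path: fields and their types are fixed at compile time.
    static byte[] encodeGenerated(FooBar r) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, r.id);
        writeString(out, r.name);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        FooBar r = new FooBar(42, "avro");
        List<FieldType> schema = Arrays.asList(FieldType.INT, FieldType.STRING);
        // Both paths must produce identical bytes; only the work done differs.
        System.out.println(Arrays.equals(encodeDynamic(r, schema), encodeGenerated(r)));
    }
}
```

Both paths produce the same bytes. The generated one simply skips the per-field lookup, boxing and type dispatch, which is the kind of overhead the description attributes to the inherited code.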