[ 
https://issues.apache.org/jira/browse/AVRO-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654739#comment-16654739
 ] 

Raymie Stata edited comment on AVRO-2090 at 10/18/18 7:19 AM:
--------------------------------------------------------------

I've attached my two runs of Perf.java combined into a single file 
([^perf-data.txt]).  The first four columns of numbers in this file are the 
results with custom-encoders turned off; the next four columns are the results 
with custom-encoders on.

For the two SpecificRecord cases: On my machine, FooBarSpecificRecordTestWrite 
improved 36% (from 3577 ms to 2296 ms), while FooBarSpecificRecordTestRead 
improved 12% (4728 ms to 4130 ms).  It's not surprising that the read case 
improved less: the overhead of accommodating schema migration is high.  I have 
some ideas on how to improve performance even more, esp. for the read case.  That 
said, a >10% improvement is not bad, and 36% improvement is quite good, so I 
suggest we commit this change as-is and save further improvements to future 
patches.

(Thiru points out that FooBarSpecificRecord is a very small class that 
probably understates the performance improvements of this patch.  In our work 
at Aqfer, we've seen larger improvements.)
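
For anyone checking the percentages quoted above against perf-data.txt, here is 
a minimal sketch of the arithmetic. The class and method names are made up for 
illustration and are not part of Perf.java; the only inputs are the four timings 
quoted in this comment.

```java
import java.util.Locale;

// Hypothetical helper, not part of Avro or Perf.java.
public class PerfDelta {
    // Percentage by which elapsed time dropped, relative to the baseline run.
    static double percentReduction(double baselineMs, double improvedMs) {
        return 100.0 * (baselineMs - improvedMs) / baselineMs;
    }

    // Formats a reduction with one decimal place, independent of system locale.
    static String format(double baselineMs, double improvedMs) {
        return String.format(Locale.ROOT, "%.1f%%", percentReduction(baselineMs, improvedMs));
    }

    public static void main(String[] args) {
        // Timings quoted in the comment: custom encoders off vs. on.
        System.out.println("FooBarSpecificRecordTestWrite: " + format(3577, 2296)); // about 35.8%
        System.out.println("FooBarSpecificRecordTestRead:  " + format(4728, 4130)); // about 12.6%
    }
}
```

The reduction is measured against the run with custom encoders turned off, which 
is the convention the 36% and 12% figures appear to follow.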


> Improve encode/decode time for SpecificRecord using code generation
> -------------------------------------------------------------------
>
>                 Key: AVRO-2090
>                 URL: https://issues.apache.org/jira/browse/AVRO-2090
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Raymie Stata
>            Assignee: Raymie Stata
>            Priority: Major
>         Attachments: customcoders.md, perf-data.txt
>
>
> Compared to GenericRecords, SpecificRecords offer type-safety plus the 
> performance of traditional getters/setters/instance variables.  But these are 
> only beneficial to Java code accessing those records.  SpecificRecords 
> inherit serialization and deserialization code from GenericRecords, which is 
> dynamic and thus slow (in fact, benchmarks show that serialization and 
> deserialization are _slower_ for SpecificRecord than for GenericRecord).
> This patch extends record.vm to generate custom, higher-performance encoder 
> and decoder functions for SpecificRecords.  We've run a public benchmark 
> showing that the new code reduces serialization time by 2/3 and 
> deserialization time by close to 50%.
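
The contrast the description draws between dynamic and generated encoders can be 
sketched roughly as follows. This is a toy illustration under stated assumptions, 
not the code that record.vm emits: FooBar, genericEncode and customEncode are 
hypothetical names, and java.io.DataOutputStream stands in for Avro's binary 
encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; none of these types are Avro's actual API.
public class CustomCoderSketch {
    // Stand-in for a generated record class with two fields.
    static final class FooBar {
        static final int FIELD_COUNT = 2;
        int id;
        String name;

        // Positional, boxed access of the kind a dynamic writer relies on.
        Object get(int pos) {
            switch (pos) {
                case 0: return id;
                case 1: return name;
                default: throw new IndexOutOfBoundsException("no field " + pos);
            }
        }

        // Generated-style encoder: straight-line writes, no boxing or type tests.
        void customEncode(DataOutputStream out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }
    }

    // Dynamic path: visit each field and dispatch on its runtime type.
    static void genericEncode(FooBar r, DataOutputStream out) throws IOException {
        for (int i = 0; i < FooBar.FIELD_COUNT; i++) {
            Object v = r.get(i);
            if (v instanceof Integer) {
                out.writeInt((Integer) v);
            } else if (v instanceof String) {
                out.writeUTF((String) v);
            } else {
                throw new IOException("unsupported field type at position " + i);
            }
        }
    }

    // Encodes a record through either path and returns the resulting bytes.
    static byte[] encode(FooBar r, boolean custom) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (custom) {
                r.customEncode(out);
            } else {
                genericEncode(r, out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Both paths are meant to produce identical bytes; the generated-style one simply 
skips the per-field boxing and type dispatch, which is the kind of overhead the 
description attributes to the inherited dynamic code.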



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
