[ 
https://issues.apache.org/jira/browse/AVRO-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17958772#comment-17958772
 ] 

ASF subversion and git services commented on AVRO-3527:
-------------------------------------------------------

Commit c880f4729e0dcb99bab82dd5b5efb15f0da55890 in avro's branch 
refs/heads/main from Steven Aerts
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=c880f4729e ]

AVRO-3527: codegen equals and hashCode for Records (#1708)

Update the compiler to generate the implementations of the `.equals()`
and `.hashCode()` functions, instead of relying on the generic
implementation in GenericData.  This improves the performance of
those functions significantly.

The generated implementations are a factor of 10 to 20 faster for
`.equals()` and a factor of 5 to 10 faster for `.hashCode()`.
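
As a rough illustration of the idea, the sketch below shows what
field-by-field `.equals()` and `.hashCode()` methods can look like for a
record with two fields. The class `User` and its fields `name` and `age`
are hypothetical and written by hand; this is not the actual output of
the Avro compiler, only a minimal example of comparing and hashing the
fields directly instead of walking the schema generically.

```java
import java.util.Objects;

public class GeneratedEqualsSketch {
    // Hypothetical stand-in for a generated record class (not real Avro output).
    static final class User {
        private final String name;
        private final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof User)) {
                return false;
            }
            User other = (User) o;
            // Compare each field directly; no schema lookup or reflection.
            return age == other.age && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Combine the field hashes with the usual 31-multiplier scheme.
            int result = 1;
            result = 31 * result + Integer.hashCode(age);
            result = 31 * result + (name == null ? 0 : name.hashCode());
            return result;
        }
    }

    public static void main(String[] args) {
        User a = new User("alice", 30);
        User b = new User("alice", 30);
        User c = new User("bob", 30);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                  // false
    }
}
```

Because each field access is a plain Java expression, the JIT can inline
the comparison, which is one plausible reason for the speedup reported
in this commit.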

Results of the perf test before the change:

```
Benchmark              Mode  Cnt          Score             Error  Units
SpecficTest.equals    thrpt    3   12598610.194 +/-  11160265.279  ops/s
SpecficTest.hashCode  thrpt    3   24729446.862 +/-  29051332.794  ops/s
```

Results using generated functions:

```
Benchmark              Mode  Cnt          Score             Error  Units
SpecficTest.equals    thrpt    3  211314296.950 +/- 104154793.126  ops/s
SpecficTest.hashCode  thrpt    3  180349506.632 +/- 143639246.771  ops/s
```

Signed-off-by: Steven Aerts <[email protected]>

> Generated equals() and hashCode() for SpecificRecords
> -----------------------------------------------------
>
>                 Key: AVRO-3527
>                 URL: https://issues.apache.org/jira/browse/AVRO-3527
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Steven Aerts
>            Assignee: Christophe Le Saec
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>         Attachments: equals_hashcode_after.txt, equals_hashcode_before.txt, 
> flame_graph.jpeg
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> When profiling our production system, we found that it was spending almost 
> 40% of its overall time in the {{SpecificRecordBase.hashCode()}} and 
> {{SpecificRecordBase.equals()}} implementations.
> In some sections of its logic we see that almost all time is spent in those 
> functions, as can be seen in the attached flame graph (blue "pyramids").
> !flame_graph.jpeg|width=385,height=99!
> After generating {{.equals()}} and {{.hashCode()}}, all this overhead 
> disappeared and this application became 35% faster overall. 
> In other Avro-heavy applications, this improvement also produced noticeable 
> performance gains where we hadn't expected them.
> A generated implementation of {{.hashCode()}} is 5 to 10 times faster 
> than its generic counterpart. For {{.equals()}} it is 10 to 20 times faster, 
> which is also visible in the attached JMH benchmarks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
