[
https://issues.apache.org/jira/browse/AVRO-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ryan Skraba updated AVRO-2999:
------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
> Optimize Ruby union serialization
> ---------------------------------
>
> Key: AVRO-2999
> URL: https://issues.apache.org/jira/browse/AVRO-2999
> Project: Apache Avro
> Issue Type: Improvement
> Components: ruby
> Affects Versions: 1.10.0
> Reporter: Joel Turkel
> Assignee: Joel Turkel
> Priority: Major
> Fix For: 1.11.0, 1.10.2
>
>
> Profiling Avro serialization in our union-heavy schema shows some memory and
> throughput bottlenecks:
> * Validation calls repeatedly allocate constant hashes
> * Validation calls repeatedly allocate constant strings
> * Validation calls are expensive and can be avoided when determining whether a
> datum matches a null union member type (a common pattern for "optional"
> fields)
> Optimizing these codepaths reduces memory allocations by 78% and improves
> throughput 1.9X in our encoding benchmarks. A GitHub PR is coming shortly.
> Note: Encoding unions is still expensive because the code must determine
> which member of the union a datum is targeting. Allowing clients to
> explicitly specify this would speed up serialization even further but that
> requires a larger API change.
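
The kind of optimization described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the avro gem's actual code: the `UnionSketch` module, the `PRIMITIVE_CHECKS` table, and the `resolve` method are hypothetical names, and only a handful of primitive types are covered. It shows a lookup table allocated once and frozen rather than rebuilt on each validation call, a fast path that matches `nil` against a null member without running full validation, and the linear scan over union members that remains the costly part.

```ruby
# Hypothetical stand-in for a union resolver; NOT the avro gem's implementation.
module UnionSketch
  # Frozen constants are allocated once at load time, not on every call.
  NULL_TYPE = 'null'.freeze

  PRIMITIVE_CHECKS = {
    'null'    => ->(d) { d.nil? },
    'boolean' => ->(d) { d == true || d == false },
    'int'     => ->(d) { d.is_a?(Integer) && d.between?(-2**31, 2**31 - 1) },
    'long'    => ->(d) { d.is_a?(Integer) && d.between?(-2**63, 2**63 - 1) },
    'string'  => ->(d) { d.is_a?(String) }
  }.freeze

  # Returns the index of the first union member type that accepts the datum.
  # member_types is an array of primitive type names (an assumption made to
  # keep the sketch small; real schemas also contain records, enums, etc.).
  def self.resolve(member_types, datum)
    # Fast path: nil can only match a null member, so skip validation.
    if datum.nil?
      null_index = member_types.index(NULL_TYPE)
      return null_index if null_index
    end

    # Slow path: try each member in order until one validates the datum.
    index = member_types.index { |type| PRIMITIVE_CHECKS.fetch(type).call(datum) }
    raise ArgumentError, "#{datum.inspect} matches no union member" unless index

    index
  end
end
```

For an "optional" field declared as a union of null and string, `UnionSketch.resolve(%w[null string], nil)` takes the fast path, while a non-nil datum still pays for the scan over the member types, which is the remaining cost the note above refers to.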