Joel Turkel created AVRO-2999:
---------------------------------
Summary: Optimize Ruby union serialization
Key: AVRO-2999
URL: https://issues.apache.org/jira/browse/AVRO-2999
Project: Apache Avro
Issue Type: Improvement
Components: ruby
Affects Versions: 1.10.0
Reporter: Joel Turkel
Assignee: Joel Turkel
Profiling Avro serialization in our union-heavy schema shows some memory and
throughput bottlenecks:
* Validation calls repeatedly allocate constant hashes
* Validation calls repeatedly allocate constant strings
* Validation calls are expensive and can be avoided when determining if a
datum matches a null union member type (a common pattern for "optional" fields)
Optimizing these codepaths reduces memory allocations by 78% and improves
throughput by 1.9x in our encoding benchmarks. A GitHub PR is coming shortly.
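As a rough illustration of the three points above, here is a minimal Ruby sketch. It is not the avro gem's actual code: the module UnionSketch, the constants NULL_TYPE and DEFAULT_OPTIONS, and the methods full_validate and find_member_index are all hypothetical names chosen for this example. It assumes union members are given as plain type-name strings.

```ruby
# Hypothetical sketch; names and structure are assumptions, not the avro gem's API.
module UnionSketch
  # Frozen constants are allocated once, instead of a new string or hash
  # being built on every validation call.
  NULL_TYPE = 'null'.freeze
  DEFAULT_OPTIONS = { recursive: true }.freeze

  # Stand-in for an expensive, general-purpose validation pass.
  def self.full_validate(type, datum, _options = DEFAULT_OPTIONS)
    case type
    when 'null' then datum.nil?
    when 'string' then datum.is_a?(String)
    when 'int' then datum.is_a?(Integer)
    else false
    end
  end

  # Returns the index of the first union member that accepts the datum,
  # or nil when no member matches.
  def self.find_member_index(member_types, datum)
    member_types.each_with_index do |type, index|
      if type == NULL_TYPE
        # Fast path: a null member matches only nil, so the validator
        # does not need to be called at all.
        return index if datum.nil?
        next
      end
      return index if full_validate(type, datum)
    end
    nil
  end
end
```

For an "optional" field modelled as a union of null and string, UnionSketch.find_member_index(['null', 'string'], nil) resolves to the null member without invoking full_validate.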
Note: Encoding unions is still expensive because the code must determine which
member of the union a datum is targeting. Allowing clients to explicitly
specify this would speed up serialization even further, but that requires a
larger API change.
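To make the trade-off concrete, the following sketch contrasts scanning the union members with accepting an explicit hint from the caller. Everything here is hypothetical and only illustrates the idea in the note: TaggedDatum, MATCHERS, and resolve_member are invented for this example and do not describe any existing or proposed avro gem API.

```ruby
# Hypothetical sketch of an explicit-member hint; not an existing avro gem API.
TaggedDatum = Struct.new(:member_index, :value)

# Simplified per-type checks standing in for real schema validation.
MATCHERS = {
  'null' => ->(d) { d.nil? },
  'string' => ->(d) { d.is_a?(String) },
  'long' => ->(d) { d.is_a?(Integer) }
}.freeze

# Returns [member_index, value] for the union member the datum targets.
def resolve_member(member_types, datum)
  # The caller already named the member, so the scan is skipped entirely.
  return [datum.member_index, datum.value] if datum.is_a?(TaggedDatum)

  # Otherwise each member is tried in order until one accepts the datum.
  index = member_types.index { |type| MATCHERS.fetch(type).call(datum) }
  raise ArgumentError, 'datum matches no union member' if index.nil?

  [index, datum]
end
```

With member types ['null', 'string', 'long'], an untagged integer is resolved by testing each member in turn, while TaggedDatum.new(1, 'x') is resolved in constant time at the cost of the caller having to know the union layout.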