Joel Turkel created AVRO-2999:
---------------------------------

             Summary: Optimize Ruby union serialization
                 Key: AVRO-2999
                 URL: https://issues.apache.org/jira/browse/AVRO-2999
             Project: Apache Avro
          Issue Type: Improvement
          Components: ruby
    Affects Versions: 1.10.0
            Reporter: Joel Turkel
            Assignee: Joel Turkel


Profiling Avro serialization with our union-heavy schema shows several memory 
and throughput bottlenecks:
 * Validation calls repeatedly allocate constant hashes
 * Validation calls repeatedly allocate constant strings
 * Validation calls are expensive and can be avoided when determining whether a 
datum matches a null union member type (a common pattern for "optional" fields)
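
For illustration, a minimal sketch of the kind of change described above. It is 
not the Avro gem's actual code: UnionSketch, DEFAULT_OPTIONS, member_index and 
valid? are hypothetical names, and the validator only covers a few primitive 
types. It shows constants allocated once and a nil check that skips validation 
for the null member.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for illustration; not the Avro gem's internals.
module UnionSketch
  # Allocated once at load time instead of on every validation call.
  DEFAULT_OPTIONS = { recursive: true, encoded: false }.freeze
  # With the magic comment above, string literals are frozen and reused.
  NULL_TYPE = 'null'

  # Returns the index of the first union member that accepts the datum.
  # members is an array of type names such as ['null', 'string'].
  def self.member_index(members, datum)
    members.each_with_index do |type, i|
      # Fast path: the null member matches only nil, so skip the validator.
      if type == NULL_TYPE
        return i if datum.nil?

        next
      end
      return i if valid?(type, datum, DEFAULT_OPTIONS)
    end
    raise ArgumentError, "datum #{datum.inspect} matches no union member"
  end

  # Simplified validator covering a few primitive types only.
  def self.valid?(type, datum, _options)
    case type
    when 'string' then datum.is_a?(String)
    when 'int', 'long' then datum.is_a?(Integer)
    when 'boolean' then [true, false].include?(datum)
    else false
    end
  end
end
```

For an "optional" field declared as ['null', 'string'], a nil datum resolves on 
the first member without calling the validator at all.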

Optimizing these code paths reduces memory allocations by 78% and improves 
throughput by 1.9x in our encoding benchmarks. A GitHub PR is coming shortly.

Note: Encoding unions is still expensive because the code must determine which 
member of the union a datum is targeting. Allowing clients to specify the 
member explicitly would speed up serialization even further, but that requires 
a larger API change.
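
To make the trade-off concrete, here is a rough sketch of what explicit member 
selection could look like. Everything in it is an assumption: TaggedDatum, 
resolve_member and MEMBERS are hypothetical names, not an existing or proposed 
Avro API, and the members are modeled as simple lambdas.

```ruby
# Hypothetical wrapper: the caller states which union member the value targets.
TaggedDatum = Struct.new(:member_index, :value)

# members is an array of [type_name, matcher] pairs; matchers are lambdas here.
# Returns [member_index, value].
def resolve_member(members, datum)
  # Explicit tag: skip the scan entirely.
  return [datum.member_index, datum.value] if datum.is_a?(TaggedDatum)

  # No tag: test each member in order until one accepts the datum.
  index = members.index { |_name, matcher| matcher.call(datum) }
  raise ArgumentError, "no union member accepts #{datum.inspect}" if index.nil?

  [index, datum]
end

# Toy union of null, string and long.
MEMBERS = [
  ['null', ->(d) { d.nil? }],
  ['string', ->(d) { d.is_a?(String) }],
  ['long', ->(d) { d.is_a?(Integer) }]
].freeze
```

Both resolve_member(MEMBERS, 42) and resolve_member(MEMBERS, TaggedDatum.new(2, 42)) 
pick the long member, but only the untagged call has to try each matcher in turn.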



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
