Kevin created AVRO-3451:
---------------------------
Summary: fix poor Avro write performance
Key: AVRO-3451
URL: https://issues.apache.org/jira/browse/AVRO-3451
Project: Apache Avro
Issue Type: Improvement
Components: rust
Affects Versions: 1.11.0
Environment: Mac OS X Big Sur
{code:java}
installed toolchains
--------------------
stable-x86_64-apple-darwin (default)
nightly-x86_64-apple-darwin
active toolchain
----------------
stable-x86_64-apple-darwin (default)
rustc 1.56.1 (59eed8a2a 2021-11-01) {code}
Reporter: Kevin
The Rust implementation of the Apache Avro library, apache-avro (née avro-rs),
shows poor write performance when serializing Rust structures to Avro.
Profiling indicates that it spends an inordinate amount of time in the
function {{encode::encode_ref}} performing {{clone()}} and {{drop}} operations
on a {{HashMap<String, Schema>}}.
We modified the function {{encode_ref0}} as follows:
{code:java}
-pub fn encode_ref(value: &Value, schema: &Schema, buffer: &mut Vec<u8>) {
- fn encode_ref0(
+pub fn encode_ref<'a>(value: &Value, schema: &'a Schema, buffer: &mut Vec<u8>) {
+ fn encode_ref0<'a>(
value: &Value,
- schema: &Schema,
+ schema: &'a Schema,
buffer: &mut Vec<u8>,
- schemas_by_name: &mut HashMap<String, Schema>,
+ schemas_by_name: &mut HashMap<&'a str, &'a Schema>,
) {
match &schema {
Schema::Ref { ref name } => {
- let resolved = schemas_by_name.get(name.name.as_str()).unwrap();
+ let resolved = schemas_by_name.get(&name.name as &str).unwrap();
return encode_ref0(value, resolved, buffer, &mut schemas_by_name.clone());
}
Schema::Record { ref name, .. }
| Schema::Enum { ref name, .. }
| Schema::Fixed { ref name, .. } => {
- schemas_by_name.insert(name.name.clone(), schema.clone());
+ schemas_by_name.insert(&name.name, &schema);
}
_ => (),
}{code}
This removes any need to clone names or schemas in the {{schemas_by_name}}
cache, which now holds only borrowed references. With this change we see a
notable improvement (a factor of 4 to 5) in our application.
After this change, all cargo tests still pass and the benchmarks show a
significant improvement in write performance across the board.
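
For readers who want to try the idea outside the library, here is a minimal, self-contained sketch of a name cache that borrows instead of cloning. The {{Schema}} enum and the helpers {{collect_names}} and {{resolve}} are hypothetical simplifications written for this illustration. They are not the apache-avro types or functions.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the library's Schema type
// (an assumption for illustration only).
#[derive(Debug, PartialEq)]
enum Schema {
    Int,
    Record { name: String, fields: Vec<Schema> },
    Ref { name: String },
}

// Register every named schema by reference. The map borrows both the
// name and the schema for lifetime 'a, so nothing is cloned or dropped.
fn collect_names<'a>(schema: &'a Schema, by_name: &mut HashMap<&'a str, &'a Schema>) {
    if let Schema::Record { name, fields } = schema {
        by_name.insert(name.as_str(), schema);
        for field in fields {
            collect_names(field, by_name);
        }
    }
}

// Resolve a Ref through the borrowed cache; any other schema resolves
// to itself. Returns None when the referenced name is not registered.
fn resolve<'a>(
    schema: &'a Schema,
    by_name: &HashMap<&'a str, &'a Schema>,
) -> Option<&'a Schema> {
    match schema {
        Schema::Ref { name } => by_name.get(name.as_str()).copied(),
        other => Some(other),
    }
}
```

Because the keys are {{&str}} and the values are {{&Schema}}, copying an entry copies only pointers, whereas a {{HashMap<String, Schema>}} has to deep-copy every name and schema tree it stores.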