[
https://issues.apache.org/jira/browse/AVRO-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated AVRO-4069:
---------------------------------
Labels: pull-request-available (was: )
> Remove Reader String Cache from Generic Datum Reader
> ----------------------------------------------------
>
> Key: AVRO-4069
> URL: https://issues.apache.org/jira/browse/AVRO-4069
> Project: Apache Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.12.0
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.13.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I was doing some profiling, and this "ReaderCache" code lit up:
> {code:java}
>   public Class getStringClass(final Schema s) {
>     final IdentitySchemaKey key = new IdentitySchemaKey(s);
>     return this.stringClassCache.computeIfAbsent(key,
>         (IdentitySchemaKey k) -> this.findStringClass.apply(k.schema));
>   }
> }
> private final ReaderCache readerCache = new ReaderCache(this::findStringClass);
> protected Class findStringClass(Schema schema) {
>   String name = schema.getProp(GenericData.STRING_PROP);
>   if (name == null)
>     return CharSequence.class;
>   switch (GenericData.StringType.valueOf(name)) {
>   case String:
>     return String.class;
>   default:
>     return CharSequence.class;
>   }
> }
> {code}
> The String cache here caches a single value: the class selected by the
> {{STRING_PROP}} in the Schema. That is a lot of overhead for caching
> a relatively simple mapping. Consider that this must create a new
> {{IdentitySchemaKey}} object every time it does this lookup, and this is a HOT
> path. It would take less time, and add less heap pressure, to perform
> the simple mapping for each invocation.
> Follow-on work: the Map in the Schema is synchronized. Maybe the map can be
> made non-synchronized or the Schema can explicitly cache this value in a
> non-synchronized way to make this one property load faster.
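For illustration only, the uncached lookup described above might look roughly like the sketch below. It is not the actual patch: StringClassLookup and its nested StringType enum are hypothetical stand-ins so the snippet compiles without Avro on the classpath, and the propValue parameter plays the role of schema.getProp(GenericData.STRING_PROP).

```java
// Hypothetical stand-in, not Avro's API: the mapping is applied directly to
// the raw property value on every call, with no cache and no key object.
public class StringClassLookup {

    // Mirrors the constant names of GenericData.StringType, for illustration only.
    enum StringType { CharSequence, String, Utf8 }

    // propValue stands in for schema.getProp(GenericData.STRING_PROP).
    static Class<?> findStringClass(String propValue) {
        if (propValue == null) {
            return CharSequence.class;
        }
        switch (StringType.valueOf(propValue)) {
            case String:
                return String.class;
            default:
                return CharSequence.class;
        }
    }

    public static void main(String[] args) {
        System.out.println(findStringClass("String").getName());
        System.out.println(findStringClass(null).getName());
    }
}
```

Each call costs one property read, one enum lookup, and a switch, and allocates nothing per invocation. As in the quoted code, an unrecognized property value would make valueOf throw IllegalArgumentException.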
--
This message was sent by Atlassian Jira
(v8.20.10#820010)