[
https://issues.apache.org/jira/browse/AVRO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065028#comment-13065028
]
Douglas Kaminsky commented on AVRO-853:
---------------------------------------
The purpose of a "quacksLike" method is to determine if two schemas are
structurally equal - in the particular example where we encountered the
original slowdown, we were defining a custom method for serializing schemas,
where we performed certain optimizations if we had encountered the schema
before. In our example, we don't care if we encounter the exact same schema
elsewhere in our protocol, nor if the properties or aliases have been modified.
For our purposes, structurally equivalent schemas are the same schema...
Take for example the following schema with corresponding fields (in non-JSON to
save typing):
{code}
{
"name" : "A",
"type" : "record",
"fields" : [{"name" : "foo", "type" : "int"},
{"name" : "bar", "type" : "long"}]
}
{code}
Now let's say that at some point another thread (for the purpose of argument)
modifies the properties of this schema:
{code}
{
"name" : "A",
"type" : "record",
"fields" : [{"name" : "foo", "type" : "int"},
{"name" : "bar", "type" : "long"}],
"java-type-hint" : "some.type.Here"
}
{code}
With A as the original schema object and B as the modified one:
A.equals(B) == false
A.quacksLike(B) == true
I almost want to say it's about congruence, but a true congruence predicate
would probably ignore naming, too.
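To make the distinction concrete, here is a minimal sketch of the idea. It assumes a simplified record model: SketchField and SketchRecord are hypothetical stand-ins invented for illustration, not the real org.apache.avro.Schema API and not the code in the attached patches. The point is only that quacksLike compares name and fields, while equals also compares properties.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a record field (name plus primitive type name).
final class SketchField {
    final String name;
    final String type;

    SketchField(String name, String type) {
        this.name = name;
        this.type = type;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchField)) return false;
        SketchField f = (SketchField) o;
        return name.equals(f.name) && type.equals(f.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type);
    }
}

// Hypothetical stand-in for a record schema; NOT the real Avro Schema class.
final class SketchRecord {
    final String name;
    final List<SketchField> fields;
    final Map<String, String> props; // e.g. "java-type-hint"

    SketchRecord(String name, List<SketchField> fields, Map<String, String> props) {
        this.name = name;
        this.fields = fields;
        this.props = props;
    }

    // Structural comparison: name and fields only, properties are ignored.
    boolean quacksLike(SketchRecord other) {
        return other != null
            && name.equals(other.name)
            && fields.equals(other.fields);
    }

    // Full equality: structure plus properties.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchRecord)) return false;
        SketchRecord r = (SketchRecord) o;
        return quacksLike(r) && props.equals(r.props);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, fields, props);
    }
}

class QuacksLikeDemo {
    public static void main(String[] args) {
        List<SketchField> fields = Arrays.asList(
            new SketchField("foo", "int"),
            new SketchField("bar", "long"));
        Map<String, String> noProps = Collections.emptyMap();
        Map<String, String> hinted =
            Collections.singletonMap("java-type-hint", "some.type.Here");

        SketchRecord a = new SketchRecord("A", fields, noProps);
        SketchRecord b = new SketchRecord("A", fields, hinted);

        System.out.println(a.equals(b));     // false: properties differ
        System.out.println(a.quacksLike(b)); // true: same name and fields
    }
}
```

A real implementation would also have to recurse into nested and named types, which this sketch leaves out.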
> Cache hash codes in Schema and Field
> ------------------------------------
>
> Key: AVRO-853
> URL: https://issues.apache.org/jira/browse/AVRO-853
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.5.1
> Reporter: Douglas Kaminsky
> Attachments: AVRO-853-approach2.patch, AVRO-853.patch
>
>
> We are experiencing a serious performance degradation when trying to
> store/retrieve fields and schemas in hash-based data structures (e.g.
> HashMap). Since all fields and schemas are immutable (with the exception of
> RecordSchema allowing deferred setting of Fields) it makes sense to cache the
> hash code on the object instead of recalculating every time the hashCode
> method gets called.
> (Are there other mutable Schema sub-types that I'm not thinking about?)
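
As a rough illustration of the caching idea in the issue description above, the sketch below computes the hash code lazily, stores it, and invalidates it in the one place where the object can change (deferred field setting). CachingRecord, its sentinel value and its computations counter are hypothetical names invented for this example; this is not either of the attached patches and not the real Schema class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of hash code caching; NOT the AVRO-853 patch.
final class CachingRecord {
    // Sentinel meaning "hash not computed yet" (an assumption of this sketch).
    private static final int NO_HASH = Integer.MIN_VALUE;

    private final String name;
    private List<String> fieldNames = new ArrayList<>();
    private int cachedHash = NO_HASH;

    int computations = 0; // instrumentation for the demo only

    CachingRecord(String name) {
        this.name = name;
    }

    // Deferred field setting is the only mutation, so it must drop the cache.
    void setFields(List<String> names) {
        this.fieldNames = new ArrayList<>(names);
        this.cachedHash = NO_HASH;
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == NO_HASH) {
            computations++;
            h = Objects.hash(name, fieldNames);
            // If h happens to equal the sentinel it is simply recomputed
            // on every call: still correct, just not cached.
            cachedHash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CachingRecord)) return false;
        CachingRecord r = (CachingRecord) o;
        return name.equals(r.name) && fieldNames.equals(r.fieldNames);
    }
}

class CachingDemo {
    public static void main(String[] args) {
        CachingRecord r = new CachingRecord("A");
        r.setFields(Arrays.asList("foo", "bar"));
        r.hashCode();
        r.hashCode();
        System.out.println(r.computations); // 1: second call hit the cache
    }
}
```

This sketch is not safe if setFields races with hashCode on another thread, and an object whose fields are set after it has been used as a hash key will be misplaced in the map regardless of caching.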