[ 
https://issues.apache.org/jira/browse/AVRO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065028#comment-13065028
 ] 

Douglas Kaminsky commented on AVRO-853:
---------------------------------------

The purpose of a "quacksLike" method is to determine whether two schemas are 
structurally equal. In the particular case where we hit the original slowdown, 
we had defined a custom method for serializing schemas that performs certain 
optimizations if it has encountered the schema before. In that case, we don't 
care whether we encounter the exact same schema elsewhere in our protocol, nor 
whether the properties or aliases have been modified. For our purposes, 
structurally equivalent schemas are the same schema.

Take for example the following schema with corresponding fields (in non-JSON to 
save typing):

{code}
{
 "name" : "A",
 "type" : "record",
 "fields" : [{"name" : "foo", "type" : "int"},
             {"name" : "bar", "type" : "long"}]
}
{code}

Now let's say that at some point another thread (for the sake of argument) 
modifies the properties of this schema, giving us B:

{code}
{
 "name" : "A",
 "type" : "record",
 "fields" : [{"name" : "foo", "type" : "int"},
             {"name" : "bar", "type" : "long"}],
 "java-type-hint" : "some.type.Here"
}
{code}


A.equals(B) == false
A.quacksLike(B) == true

I almost want to say it's about congruence, but a true congruence predicate 
would probably ignore naming, too.
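
To make the distinction concrete, here is a minimal sketch of what such a 
check could look like. SimpleSchema and SimpleField are hypothetical stand-ins 
I made up for illustration; they are not org.apache.avro.Schema, and this is 
not what either attached patch does. The sketch compares type, name and 
fields in order, ignores properties, and does not handle recursive schemas.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-in for an Avro schema (NOT org.apache.avro.Schema).
final class SimpleSchema {
    final String type;                                   // e.g. "record", "int", "long"
    final String name;                                   // null for primitive types
    final List<SimpleField> fields = new ArrayList<>();  // empty for primitive types
    final Map<String, String> props = new HashMap<>();   // e.g. "java-type-hint"

    SimpleSchema(String type, String name) {
        this.type = type;
        this.name = name;
    }

    // Appends a field and returns this schema so calls can be chained.
    SimpleSchema field(String fieldName, SimpleSchema fieldSchema) {
        fields.add(new SimpleField(fieldName, fieldSchema));
        return this;
    }

    // Structural comparison: type, name and fields (in order). Properties are ignored.
    boolean quacksLike(SimpleSchema other) {
        if (other == null) return false;
        if (!type.equals(other.type) || !Objects.equals(name, other.name)) return false;
        if (fields.size() != other.fields.size()) return false;
        for (int i = 0; i < fields.size(); i++) {
            SimpleField a = fields.get(i);
            SimpleField b = other.fields.get(i);
            if (!a.name.equals(b.name) || !a.schema.quacksLike(b.schema)) return false;
        }
        return true;
    }

    // Full equality: the structural check plus this schema's own (top-level) properties.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SimpleSchema)) return false;
        SimpleSchema other = (SimpleSchema) o;
        return quacksLike(other) && props.equals(other.props);
    }

    @Override
    public int hashCode() {
        return Objects.hash(type, name, fields.size(), props);
    }
}

// Hypothetical field type: a name plus the schema of the field's value.
final class SimpleField {
    final String name;
    final SimpleSchema schema;

    SimpleField(String name, SimpleSchema schema) {
        this.name = name;
        this.schema = schema;
    }
}

final class QuacksLikeDemo {
    // Builds record "A" with fields foo:int and bar:long, as in the example above.
    static SimpleSchema recordA() {
        return new SimpleSchema("record", "A")
            .field("foo", new SimpleSchema("int", null))
            .field("bar", new SimpleSchema("long", null));
    }

    public static void main(String[] args) {
        SimpleSchema a = recordA();
        SimpleSchema b = recordA();
        b.props.put("java-type-hint", "some.type.Here");

        System.out.println(a.equals(b));      // false: properties differ
        System.out.println(a.quacksLike(b));  // true: same structure
    }
}
```

A congruence-style predicate, in the sense I mention above, would be the 
same loop with the name comparisons dropped.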

> Cache hash codes in Schema and Field
> ------------------------------------
>
>                 Key: AVRO-853
>                 URL: https://issues.apache.org/jira/browse/AVRO-853
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.5.1
>            Reporter: Douglas Kaminsky
>         Attachments: AVRO-853-approach2.patch, AVRO-853.patch
>
>
> We are experiencing a serious performance degradation when trying to 
> store/retrieve fields and schemas in hash-based data structures (e.g. 
> HashMap). Since all fields and schemas are immutable (with the exception of 
> RecordSchema allowing deferred setting of Fields) it makes sense to cache the 
> hash code on the object instead of recalculating every time the hashCode 
> method gets called. 
> (Are there other mutable Schema sub-types that I'm not thinking about?)
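
For readers new to the issue, the caching idea in the description can be 
sketched roughly as follows. This is only an illustration of the general 
lazy-caching idiom (the same one java.lang.String uses), under the assumption 
that the object is immutable; CachedHashField and its computations counter 
are hypothetical names, and this is not the code in either attached patch. A 
real version for RecordSchema would presumably have to avoid caching before 
the deferred fields are set.

```java
import java.util.Objects;

// Hypothetical immutable field-like class that caches its hash code.
final class CachedHashField {
    private final String name;
    private final String type;

    // Cached hash; 0 means "not computed yet". If the real hash happens to be 0,
    // it is simply recomputed on every call, which is correct but not cached.
    private int hash;

    // Demo-only counter showing how often the hash is actually computed.
    int computations;

    CachedHashField(String name, String type) {
        this.name = name;
        this.type = type;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            computations++;
            h = Objects.hash(name, type);
            hash = h; // safe to cache because name and type never change
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashField)) return false;
        CachedHashField other = (CachedHashField) o;
        return name.equals(other.name) && type.equals(other.type);
    }
}

final class CachedHashDemo {
    public static void main(String[] args) {
        CachedHashField foo = new CachedHashField("foo", "int");
        foo.hashCode();
        foo.hashCode();
        foo.hashCode();
        System.out.println(foo.computations); // 1: computed once, then served from the cache
    }
}
```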

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
