[
https://issues.apache.org/jira/browse/AVRO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061293#comment-13061293
]
Douglas Kaminsky commented on AVRO-853:
---------------------------------------
Further, I question whether properties actually need to be part of the hash
code. Certainly they factor in to the equality check, but what is the real harm
if two perfectly identical schemas with slightly different properties end up in
the same hash bucket? How often will this actually happen? I can't imagine this
would significantly impact hashing performance.
Equal objects should have equal hashcodes, but equal hashcodes don't imply
equal objects
> Cache hash codes in Schema and Field
> ------------------------------------
>
> Key: AVRO-853
> URL: https://issues.apache.org/jira/browse/AVRO-853
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.5.1
> Reporter: Douglas Kaminsky
> Attachments: AVRO-853-approach2.patch, AVRO-853.patch
>
>
> We are experiencing a serious performance degradation when trying to
> store/retrieve fields and schemas in hash-based data structures (eg.
> HashMap). Since all fields and schemas are immutable (with the exception of
> RecordSchema allowing deferred setting of Fields) it makes sense to cache the
> hash code on the object instead of recalculating every time the hashCode
> method gets called.
> (Are there other mutable Schema sub-types that I'm not thinking about?)
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira