[
https://issues.apache.org/jira/browse/AVRO-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060984#comment-13060984
]
Scott Carey commented on AVRO-853:
----------------------------------
Something like this is fine for this 'version' of the Schema. A possible
future immutable Schema implementation would differ.
We should add a prominent javadoc warning, on the class and on setProp,
against changing a Schema after it has been used in a map or set.
Since it is only valid to call setFields() once on a Record, rather than
'reset' the hash, just throw a runtime exception if hashCode is called and the
fields are null. It is an invalid record if it is used prior to setting the
fields -- returning a hashCode would be an error.
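
A rough sketch of the check described above. The class, its String-based
field list, and the exception messages are hypothetical stand-ins, not the
actual Avro Schema code or the attached patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a record schema; not Avro's actual class.
final class RecordSketch {
    private final String name;
    private List<String> fields; // stays null until setFields() is called

    RecordSketch(String name) {
        this.name = name;
    }

    // Fields may be set exactly once; a second call is rejected.
    void setFields(List<String> newFields) {
        if (this.fields != null) {
            throw new IllegalStateException("Fields are already set");
        }
        this.fields = Collections.unmodifiableList(new ArrayList<>(newFields));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RecordSketch)) return false;
        RecordSketch other = (RecordSketch) o;
        if (fields == null || other.fields == null) {
            throw new IllegalStateException("equals called before setFields");
        }
        return name.equals(other.name) && fields.equals(other.fields);
    }

    // A record without fields is invalid, so refuse to hash it instead of
    // returning a value that would change once the fields arrive.
    @Override
    public int hashCode() {
        if (fields == null) {
            throw new IllegalStateException("hashCode called before setFields");
        }
        return 31 * name.hashCode() + fields.hashCode();
    }
}
```

With this shape, putting a half-built record into a HashMap fails
immediately rather than silently filing it under a hash that later changes.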
We should pick a large random number for the sentinel 'unset' hash value rather
than 0. 0 is more common than a random int because several simple hash
functions can return 0 -- for example String's hashCode has a bias for 0 on
common strings.
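
To illustrate the sentinel idea, here is a minimal sketch. The class name,
its two fields, and the constant UNSET_HASH are all made up for this example;
the constant is just an arbitrary large int, not a value taken from the patch:

```java
import java.util.Objects;

// Hypothetical illustration of hash caching; the names are not Avro's.
final class CachedHashSketch {
    // Arbitrary large constant meaning "hash not computed yet".
    private static final int UNSET_HASH = 0x6E5F3A91;

    private final String name;
    private final String doc;
    private int hash = UNSET_HASH;

    CachedHashSketch(String name, String doc) {
        this.name = name;
        this.doc = doc;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashSketch)) return false;
        CachedHashSketch other = (CachedHashSketch) o;
        return Objects.equals(name, other.name) && Objects.equals(doc, other.doc);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == UNSET_HASH) {
            // Compute once and remember. If the real hash happens to equal
            // the sentinel, the result is still correct; it is simply
            // recomputed on every call.
            h = Objects.hash(name, doc);
            hash = h;
        }
        return h;
    }
}
```

The sentinel only affects how often the hash is recomputed, never the value
returned, so the choice is purely about making the "recompute every time"
case as rare as possible.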
> Cache hash codes in Schema and Field
> ------------------------------------
>
> Key: AVRO-853
> URL: https://issues.apache.org/jira/browse/AVRO-853
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.5.1
> Reporter: Douglas Kaminsky
> Attachments: AVRO-853.patch
>
>
> We are experiencing a serious performance degradation when trying to
> store/retrieve fields and schemas in hash-based data structures (e.g.
> HashMap). Since all fields and schemas are immutable (with the exception of
> RecordSchema allowing deferred setting of Fields) it makes sense to cache the
> hash code on the object instead of recalculating every time the hashCode
> method gets called.
> (Are there other mutable Schema sub-types that I'm not thinking about?)