[jira] [Commented] (CALCITE-3836) The hash codes of RelNodes are unreliable

Julian Hyde (Jira) Thu, 05 Mar 2020 09:56:22 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052386#comment-17052386
 ]


Julian Hyde commented on CALCITE-3836:
--------------------------------------

I agree that determinism is useful, although not at all costs. But if you want 
determinism, avoid data structures whose order is affected by random elements. 
E.g. if you need deterministic iteration order over a map, use a LinkedHashMap 
rather than a HashMap.
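
A minimal sketch of the LinkedHashMap point, assuming a hypothetical Node class 
(not a Calcite type) that, like a RelNode, relies on the identity hash code. The 
class and method names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IterationOrderSketch {
    /** Hypothetical stand-in for a node that does not override hashCode(). */
    static final class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    /** Puts one Node per name into the map, then returns the names in the map's iteration order. */
    static List<String> iterationOrder(Map<Node, Integer> map, List<String> names) {
        for (int i = 0; i < names.size(); i++) {
            map.put(new Node(names.get(i)), i);
        }
        List<String> result = new ArrayList<>();
        for (Node node : map.keySet()) {
            result.add(node.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("scan", "filter", "project", "join");
        // LinkedHashMap iterates in insertion order, whatever the hash codes are.
        System.out.println(iterationOrder(new LinkedHashMap<>(), names));
        // HashMap iterates in bucket order, which here follows the identity
        // hash codes and so may differ from one run to the next.
        System.out.println(iterationOrder(new HashMap<>(), names));
    }
}
```

The first line printed is always [scan, filter, project, join]. The second 
contains the same four names, but the order is not guaranteed.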

You should accept that hash codes are intrinsically chaotic - that is their 
purpose.

It's ironic that what started this discussion was me making 
AbstractRelNode.hashCode() final - with the explicit goal, and instruction, for 
people to devise their own keys for RelNodes that meet their particular purpose. 
I stand by that advice.
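
One way such a key might look, as a sketch only. PlanNode and DescriptionKey 
are hypothetical names, not Calcite classes, and the choice to compare nodes by 
a description string is an assumption made for illustration; a real key would 
use whatever property suits the caller's purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NodeKeySketch {
    /** Hypothetical stand-in for a RelNode; it does not override equals() or hashCode(). */
    static final class PlanNode {
        final int id;
        final String description;

        PlanNode(int id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    /** Key that treats two nodes as equal when their descriptions match. */
    static final class DescriptionKey {
        private final String description;

        DescriptionKey(PlanNode node) {
            this.description = node.description;
        }

        @Override public boolean equals(Object o) {
            return o instanceof DescriptionKey
                && description.equals(((DescriptionKey) o).description);
        }

        @Override public int hashCode() {
            // String.hashCode() is specified by the JDK, so this value
            // does not depend on object identity or on the JVM.
            return description.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<DescriptionKey, PlanNode> byDescription = new LinkedHashMap<>();
        PlanNode first = new PlanNode(1, "Scan(table=emp)");
        PlanNode second = new PlanNode(2, "Scan(table=emp)");
        byDescription.putIfAbsent(new DescriptionKey(first), first);
        byDescription.putIfAbsent(new DescriptionKey(second), second);
        System.out.println(byDescription.size()); // prints 1
    }
}
```

The nodes themselves keep their identity semantics; only the key decides what 
counts as "the same" for this particular map.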

> The hash codes of RelNodes are unreliable
> -----------------------------------------
>
>                 Key: CALCITE-3836
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3836
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Liya Fan
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> For all sub-classes of AbstractRelNode, the {{hashCode}} methods depend on 
> {{AbstractRelNode#hashCode}}, because it is declared as final. 
> {{AbstractRelNode#hashCode}} depends on {{Object#hashCode}}, which is called 
> the identity hash code. The details of the identity hash code depend on the 
> specific JVM implementation. For many JVMs, the implementation is based on the 
> object address in the memory. The problem is that the address of an object may 
> change in a JVM, due to GC, memory contraction, etc. So the hash code of an 
> object may change, even if the content of the object is not changed (this can 
> be confirmed from the JavaDoc of {{Object#hashCode}}). 
> This problem may cause severe issues that are hard to diagnose and debug, 
> such as an object that is in a hash table but cannot be retrieved, or 
> duplicate objects in a hash map. 
> To solve the problem, we compute a hash code solely from the node id. This is 
> consistent with the previous semantics, and solves the above problem. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
