[jira] [Commented] (CALCITE-3836) The hash codes of RelNodes are unreliable

Ruben Q L (Jira) Tue, 03 Mar 2020 05:48:57 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050225#comment-17050225
 ]


Ruben Q L commented on CALCITE-3836:
------------------------------------

IMHO the current ticket should be moved from "Bug" to "Improvement", because 
(as others have already said) I believe there is no problem with the current 
behavior.
Having said that, I think we should consider the proposed patch, since it may 
have some theoretical advantages (faster hash code computation and not 
disabling biased locking). As [~julianhyde] said, it would be nice to have 
some benchmarks to validate these assumptions.

[~xndai]
{quote}this article says - "simply asking for the identity hash code of an 
object will disable biased locking". This is because you cannot store both hash 
code and thread id in the mark word. So calling object.hashCode() would 
transfer it into unbiased mode. You get the same behavior even if you choose to 
return RelNode id. It has nothing to do with the hash algorithm, but just an 
artifact of the implementation of biased lock.
{quote}
I think this is not correct. My understanding is that biased locking is 
disabled only when the identity hash code ({{Object#hashCode}}) is 
requested; if we override {{hashCode}} in our class, we would not get into this 
situation.
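
To make the distinction concrete, here is a minimal sketch of an id-based {{hashCode}} override. The {{Node}} class and its counter are hypothetical stand-ins, not Calcite code, and identity-based {{equals}} is an assumption made for the illustration. The sketch only shows that the override never calls {{Object#hashCode}} and that hash-based lookup keeps working after a GC request; it does not inspect the mark word, so the effect on biased locking would still need to be measured separately.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class IdHashSketch {
    /** Hypothetical stand-in for a RelNode-like class; not Calcite code. */
    static final class Node {
        private static final AtomicInteger NEXT_ID = new AtomicInteger(0);

        // Unique per instance, assigned once at construction.
        final int id = NEXT_ID.getAndIncrement();

        // Derived from the id only; never calls Object#hashCode,
        // so the identity hash code is not requested through this path.
        @Override public int hashCode() {
            return id;
        }

        // Assumption for this sketch: nodes are equal only to themselves.
        @Override public boolean equals(Object o) {
            return this == o;
        }
    }

    /** Adds a node to a HashSet, requests a GC, then looks the node up again. */
    static boolean lookupSurvivesGc() {
        Node n = new Node();
        Set<Node> set = new HashSet<>();
        set.add(n);
        int before = n.hashCode();
        System.gc(); // only a hint to the JVM; objects may or may not be moved
        return set.contains(n) && n.hashCode() == before;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        System.out.println(a.hashCode() == a.id);         // hash is the id
        System.out.println(a.hashCode() != b.hashCode()); // distinct ids, distinct hashes
        System.out.println(lookupSurvivesGc());
    }
}
```

Calling {{System.identityHashCode(node)}} explicitly would still request the identity hash code, so the override only helps as long as nothing else asks for it.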

> The hash codes of RelNodes are unreliable
> -----------------------------------------
>
>                 Key: CALCITE-3836
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3836
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Liya Fan
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> For all sub-classes of AbstractRelNode, the {{hashCode}} methods depend on 
> {{AbstractRelNode#hashCode}}, because it is declared as final. 
> {{AbstractRelNode#hashCode}} depends on {{Object#hashCode}}, which is called 
> identity hash code. The details of the identity hash code depend on the specific 
> JVM implementation. For many JVMs, the implementation is based on the object 
> address in memory. The problem is that the address of an object may 
> change in a JVM due to GC, memory compaction, etc. So the hash code of an 
> object may change even if the content of the object has not changed (this can 
> be confirmed from the Javadoc of {{Object#hashCode}}). 
> This problem may cause severe issues that are hard to diagnose and debug, 
> such as an object that is in a hash table but cannot be retrieved, or 
> duplicate objects in a hash map. 
> To solve the problem, we compute a hash code solely from the node id. This is 
> consistent with the previous semantics, and solves the above problem. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
