[ 
https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035644#comment-17035644
 ] 

Vladimir Sitnikov commented on CALCITE-3786:
--------------------------------------------

{quote} Digest would need a toString() method which would only ever be invoked 
in a debugger{quote}
That is true.

{quote} There is one potential performance downside: spreading pieces of digest 
all over the heap, so that walking a digest (for deep equals, for example) 
would generate a lot of cache misses{quote}
That is true as well. However, walking over RexNodes still generates cache 
misses.

An extra downside of splitting the digest into "too small" bits might be the 
per-object "object header" overhead.

I guess the way to deal with this risk is to implement the thing and measure 
how it behaves.
I think it is hard to predict whether a "rope-like" or a "string-like" digest 
would win.

At the end of the day, we might implement digest as "a single string blob", so 
there's always a way back :)
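
A minimal sketch of that "way back", assuming a hypothetical RopeDigest class 
(the class and method names are illustrative, not existing Calcite API): a 
rope-like digest can be flattened into a single string on demand, for 
instance from a toString() that is only invoked in a debugger.

```java
/** Hypothetical rope-like digest node; the names are illustrative, not Calcite API. */
final class RopeDigest {
  // Each element is either a String or a nested RopeDigest.
  private final Object[] parts;

  RopeDigest(Object... parts) {
    this.parts = parts.clone();
  }

  /** Appends this digest's text to sb without creating intermediate Strings. */
  void appendTo(StringBuilder sb) {
    for (Object part : parts) {
      if (part instanceof RopeDigest) {
        ((RopeDigest) part).appendTo(sb); // walk into the nested piece
      } else {
        sb.append(part);
      }
    }
  }

  /** Flattens the rope into "a single string blob". */
  @Override public String toString() {
    StringBuilder sb = new StringBuilder();
    appendTo(sb);
    return sb.toString();
  }
}

final class RopeDigestDemo {
  public static void main(String[] args) {
    RopeDigest left = new RopeDigest("$0");
    RopeDigest right = new RopeDigest("$1");
    // The call digest references the operand digests instead of copying their text.
    RopeDigest call = new RopeDigest("+(", left, ", ", right, ")");
    System.out.println(call);
  }
}
```

The flattening cost is paid only when the string form is actually requested, 
which is the trade-off that would need to be measured.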


> Add Digest (HashStrategy?) interface to enable efficient hashCode/equals for 
> RexNode, RelNode
> ---------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3786
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3786
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String 
> concatenation.
> It is easy to implement; however, it has drawbacks:
> 1) String objects cannot be reused. For instance, RexCall has operands, 
> yet their digests are copied into the RexCall digest rather than reused. 
> This causes extra memory use and extra CPU for string copying.
> 2) There's no way to have multiple #toString() methods. RelType might need 
> multiple digests: "including field names", "excluding field names".
> The suggested resolution might be along the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   // The values are either other Digest objects or Strings
>   final Object[] contents;
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself + 
> digests of the operands (which can be reused as is)
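
The proposal above could be fleshed out roughly as follows. This is only a 
sketch under stated assumptions: the factory method of(), the argument check, 
and the demo class are hypothetical and not part of Calcite. It shows the 
hash being computed once and equals() using it as a cheap first check.

```java
import java.util.Arrays;

/** Hypothetical sketch of the proposed Digest; not existing Calcite API. */
final class Digest {
  private final int hashCode;      // computed once in the constructor
  private final Object[] contents; // each element is a String or another Digest

  private Digest(Object[] contents) {
    this.contents = contents;
    // Nested Digest elements return their cached hash, so this pass
    // touches only the direct elements, not the whole tree.
    this.hashCode = Arrays.hashCode(contents);
  }

  /** Illustrative factory: accepts only Strings and Digests. */
  static Digest of(Object... parts) {
    for (Object part : parts) {
      if (!(part instanceof String) && !(part instanceof Digest)) {
        throw new IllegalArgumentException("expected String or Digest: " + part);
      }
    }
    return new Digest(parts.clone());
  }

  @Override public int hashCode() {
    return hashCode;
  }

  @Override public boolean equals(Object o) {
    if (this == o) {
      return true; // shared operand digests hit this fast path
    }
    if (!(o instanceof Digest)) {
      return false;
    }
    Digest that = (Digest) o;
    // Different hashes rule out equality before any element is compared.
    return hashCode == that.hashCode && Arrays.equals(contents, that.contents);
  }
}

final class DigestDemo {
  public static void main(String[] args) {
    Digest x = Digest.of("$0");
    Digest y = Digest.of("$1");
    // The call digest holds its own bits plus references to the operand digests.
    Digest plus = Digest.of("+", x, y);
    System.out.println(plus.equals(Digest.of("+", x, y)));
    System.out.println(plus.equals(Digest.of("+", y, x)));
  }
}
```

Because operand digests are referenced rather than copied, building the 
digest for a call does not duplicate the operands' text.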



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
