[
https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138289#comment-17138289
]
Danny Chen commented on CALCITE-3786:
-------------------------------------
Hi, [~vladimirsitnikov] ~
I wrote a benchmark [1] to compare the performance and memory usage of the
pure string digest and the new Digest structure.
{code:xml}
Benchmark                                                 (isStringDigest)  (joins)  (whereClauseDisjunctions)  Mode  Cnt          Score   Error  Units
DigestBenchmark.getRelFromDigestToRelMap                             false        1                          1  avgt    5          0.113 ± 0.009  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false        1                          1  avgt    5  376963072.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false        1                         10  avgt    5          0.146 ± 0.029  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false        1                         10  avgt    5  346554368.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false        1                        100  avgt    5          0.138 ± 0.014  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false        1                        100  avgt    5  348127232.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       10                          1  avgt    5          0.452 ± 0.041  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       10                          1  avgt    5  397934592.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       10                         10  avgt    5          0.450 ± 0.050  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       10                         10  avgt    5  383254528.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       10                        100  avgt    5          0.452 ± 0.085  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       10                        100  avgt    5  353894400.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       20                          1  avgt    5          0.819 ± 0.239  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       20                          1  avgt    5  327155712.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       20                         10  avgt    5          0.814 ± 0.123  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       20                         10  avgt    5  427819008.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                             false       20                        100  avgt    5          0.844 ± 0.218  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap             false       20                        100  avgt    5  366477312.000          bytes
{code}
{code:xml}
Benchmark                                                 (isStringDigest)  (joins)  (whereClauseDisjunctions)  Mode  Cnt          Score   Error  Units
DigestBenchmark.getRelFromDigestToRelMap                              true        1                          1  avgt    5          1.797 ± 0.218  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true        1                          1  avgt    5  412090368.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true        1                         10  avgt    5          1.824 ± 0.147  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true        1                         10  avgt    5  405274624.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true        1                        100  avgt    5          2.109 ± 0.453  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true        1                        100  avgt    5  402653184.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       10                          1  avgt    5         12.118 ± 0.113  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       10                          1  avgt    5  346030080.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       10                         10  avgt    5         12.231 ± 0.807  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       10                         10  avgt    5  438304768.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       10                        100  avgt    5         12.102 ± 0.243  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       10                        100  avgt    5  412090368.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       20                          1  avgt    5         31.184 ± 0.347  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       20                          1  avgt    5  357564416.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       20                         10  avgt    5         32.900 ± 1.832  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       20                         10  avgt    5  322437120.000          bytes
DigestBenchmark.getRelFromDigestToRelMap                              true       20                        100  avgt    5         32.072 ± 1.185  us/op
DigestBenchmark.getRelFromDigestToRelMap:Max memory heap              true       20                        100  avgt    5  309329920.000          bytes
{code}
To reduce interfering factors, I ran the old and new implementations in two
separate JVMs. The results show an impressive performance improvement (about
20x). For memory usage, when there are fewer than 10 join nodes there is about
a 10% improvement, but with 20 join nodes the numbers fluctuate.
I used the max used heap memory as the metric; is there a better way to
compare memory usage?
[1]
https://github.com/danny0405/calcite/commit/848bafba39bee0de8399a5906885d0960b33397d
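To make the speedup figure concrete, the sketch below divides each string-digest time by the matching new-Digest time, using only the us/op scores from the two tables above. The class and method names are hypothetical and are not part of the linked benchmark commit; the ratios are derived here, not reported by JMH.

```java
// Hypothetical helper, not part of the linked benchmark commit.
final class DigestSpeedup {
    // Average times in us/op, copied from the tables above in row order:
    // joins = 1, 10, 20, with whereClauseDisjunctions = 1, 10, 100 for each.
    static final int[] JOINS = {1, 1, 1, 10, 10, 10, 20, 20, 20};
    static final int[] DISJUNCTIONS = {1, 10, 100, 1, 10, 100, 1, 10, 100};
    static final double[] NEW_DIGEST =
        {0.113, 0.146, 0.138, 0.452, 0.450, 0.452, 0.819, 0.814, 0.844};
    static final double[] STRING_DIGEST =
        {1.797, 1.824, 2.109, 12.118, 12.231, 12.102, 31.184, 32.900, 32.072};

    /** Ratio of string-digest time to new-Digest time for row i. */
    static double speedup(int i) {
        return STRING_DIGEST[i] / NEW_DIGEST[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < JOINS.length; i++) {
            System.out.printf("joins=%d disjunctions=%d speedup=%.1fx%n",
                JOINS[i], DISJUNCTIONS[i], speedup(i));
        }
    }
}
```

On these numbers the ratio grows with the number of joins, so a single "20x" figure is best read as an order-of-magnitude summary.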
> Add Digest interface to enable efficient hashCode(equals) for RexNode and
> RelNode
> ---------------------------------------------------------------------------------
>
> Key: CALCITE-3786
> URL: https://issues.apache.org/jira/browse/CALCITE-3786
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.21.0
> Reporter: Vladimir Sitnikov
> Assignee: Danny Chen
> Priority: Major
> Fix For: 1.24.0
>
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> Current digests for RexNode, RelNode, RelType, and similar cases use String
> concatenation.
> It is easy to implement, however, it has drawbacks:
> 1) String objects cannot be reused. For instance, RexCall has operands,
> however, the digest is duplicated. It causes extra memory use and extra CPU
> for string copying
> 2) There's no way to have multiple #toString() methods. RelType might need
> multiple digests: "including field names", "excluding field names".
> A suggested resolution might be along the lines of
> {code:java}
> class Digest { // immutable
>   final int hashCode; // speedup hashCode and equals
>   final Object[] contents; // The values are either other Digest objects or Strings
>   String toString(); // e.g. for debugging purposes
>   int compareTo(Digest); // e.g. for debugging purposes.
> }
> {code}
> Note how fields in Kotlin are aligned much better, and it makes it easier to
> read:
> {code:java}
> class Digest { // immutable
>   val hashCode: Int // speedup hashCode and equals
>   val contents: Array<Any> // The values are either other Digest objects or Strings
>   fun toString(): String // e.g. for debugging purposes
>   fun compareTo(other: Digest): Int // e.g. for debugging purposes.
> }
> {code}
> Then the digest for RexCall could be the bits relevant to RexCall itself +
> digests of the operands (which can be reused as is)
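The idea in the description above can be sketched as a small immutable class. This is only an illustration under stated assumptions, not Calcite's actual implementation: the constructor shape, the use of `Arrays.hashCode`, and the `"+("`-style pieces are all hypothetical. It shows how a parent digest can hold its operands' digests by reference, so the hash is computed once per node and no operand string is copied until `toString()` is called.

```java
import java.util.Arrays;

// Hypothetical sketch of the proposed structure; not Calcite's real Digest class.
final class Digest {
    private final int hashCode;      // computed once, reused by hashCode() and equals()
    private final Object[] contents; // each element is a String or another Digest

    Digest(Object... contents) {
        this.contents = contents.clone(); // defensive copy keeps the digest immutable
        // Nested Digest elements contribute their cached hash, so this does
        // not walk the whole tree again.
        this.hashCode = Arrays.hashCode(this.contents);
    }

    @Override public int hashCode() {
        return hashCode;
    }

    @Override public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Digest)) {
            return false;
        }
        Digest other = (Digest) o;
        // Cheap rejection on the cached hash before comparing contents.
        return hashCode == other.hashCode
            && Arrays.equals(contents, other.contents);
    }

    @Override public String toString() { // e.g. for debugging purposes
        StringBuilder sb = new StringBuilder();
        for (Object c : contents) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Digest left = new Digest("$0");
        Digest right = new Digest("$1");
        // The call's digest references the operand digests instead of copying text.
        Digest plus = new Digest("+(", left, ", ", right, ")");
        System.out.println(plus); // prints "+($0, $1)"
    }
}
```

Two structurally equal digests compare equal even when built from different operand instances, which is what a digest-to-RelNode map lookup would need.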