[
https://issues.apache.org/jira/browse/CASSANDRA-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012390#comment-17012390
]
Benedict Elliott Smith commented on CASSANDRA-15213:
----------------------------------------------------
Looking good. A couple of minor suggestions:
# The {{value <= 0}} check can be moved inside of the {{value <= 8}} check,
since both will rarely be performed
# Can {{(int) value - (bucketOffsets[0] == 0 ? 0 : 1)}} be simplified to
{{(int) value - bucketOffsets[0]}} ?
# {{value > bucketOffsets[bucketOffsets.length - 1]}} could be replaced by
(e.g.) {{firstCandidate = min(firstCandidate,bucketOffsets.length-2)}}, which
only requires touching the already-used header portion of the array, avoiding
one potential cache-miss (though not one measured on the existing benchmarks I
expect).
As it stands, I think in the normal course of execution it's already reasonable
to expect no mispredicted branches in the implementation you've put up, and
these tweaks should improve the situation a little. I think it's fairly likely
the final pair of return statements will be implemented as a {{cmov}}, but it
would be interesting to take a look at the final assembly. When we have a
final product maybe we can take a look.
> DecayingEstimatedHistogramReservoir Inefficiencies
> --------------------------------------------------
>
> Key: CASSANDRA-15213
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15213
> Project: Cassandra
> Issue Type: Bug
> Components: Observability/Metrics
> Reporter: Benedict Elliott Smith
> Assignee: Jordan West
> Priority: Normal
> Fix For: 4.0-beta
>
>
> * {{LongAdder}} introduced to trunk consumes 9MiB of heap without user
> schemas, and this will grow significantly under contention and user schemas
> with many tables. This is because {{LongAdder}} is a very heavy class
> designed for single contended values.
> ** This can likely be improved significantly, without significant loss of
> performance in the contended case, by simply increasing the size of our
> primitive backing array and providing multiple buckets, with each thread
> picking a bucket to increment, or simply multiple backing arrays. Probably a
> better way still to do this would be to introduce some competition detection
> to the update, much like {{LongAdder}} utilises, that increases the number of
> backing arrays under competition.
> ** To save memory this approach could partition the space into chunks that
> are likely to be updated together, so that we do not need to duplicate the
> entire array under competition.
> * Similarly, binary search is costly and a measurable cost as a share of the
> new networking work (without filtering it was > 10% of the CPU used overall).
> We can compute an approximation floor(log2 n / log2 1.2) extremely cheaply,
> to save the random memory access costs.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]