[
https://issues.apache.org/jira/browse/CASSANDRA-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402144#comment-13402144
]
Sam Overton commented on CASSANDRA-3881:
----------------------------------------
I tried out the different approaches for removing this sychronization
requirement on Topology:
* [clone TokenMetadata before passing to
calculateNaturalEndpoints|https://github.com/acunu/cassandra/compare/bb6f4b40c8a8276971e9e4c50e92e6801dba08bf...3881-review-clone-tmd]
([raw
diff|https://github.com/acunu/cassandra/compare/bb6f4b40c8a8276971e9e4c50e92e6801dba08bf...3881-review-clone-tmd.diff])
This looks much more hairy than having synchronization - the places where
tokenMetadata needs to be cloned look arbitrary and it's much less obvious what
the convention is for other people looking at this (eg. why do you need to
clone before passing into AbstractReplicationStrategy.getAddressRanges() but
not AbstractReplicationStrategy.getPendingAddressRanges() ? A: it's because
getPendingAddressRanges makes its own clone to update a token).
* [return copies from Topology.getDatacenterEndpoints() and
Topology.getDatacenterRacks()|https://github.com/acunu/cassandra/compare/bb6f4b40c8a8276971e9e4c50e92e6801dba08bf...3881-review-copy-topo]
([raw
diff|https://github.com/acunu/cassandra/compare/bb6f4b40c8a8276971e9e4c50e92e6801dba08bf...3881-review-copy-topo.diff])
This is less error-prone because Topology now handles all its own
synchonization. Unfortunately copying those maps adds a large overhead which
removes most of the benefit of this ticket. See the [updated graph of
calculatePendingRanges vs. cluster
size|https://github.com/acunu/cassandra/wiki/images/calculate_pending_ranges_topo_copy.png].
This is not surprising because these copies require O(N) time.
* [document the synchronization requirements of Topology so that it's clear
what is
necessary|https://github.com/acunu/cassandra/commit/6d25673c372587beea3971d8ce31b06372525861]
([raw
diff|https://github.com/acunu/cassandra/commit/6d25673c372587beea3971d8ce31b06372525861.diff])
This is my preferred solution. Currently only calculateNaturalEndpoints uses
the TMD.Topology, so requiring to synchronize around it seems like a reasonable
solution to me. The javadoc should make it obvious to any new uses of it that
they need to synchronize.
> reduce computational complexity of processing topology changes
> --------------------------------------------------------------
>
> Key: CASSANDRA-3881
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3881
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Peter Schuller
> Assignee: Sam Overton
> Labels: vnodes
>
> This constitutes follow-up work from CASSANDRA-3831 where a partial
> improvement was committed, but the fundamental issue was not fixed. The
> maximum "practical" cluster size was significantly improved, but further work
> is expected to be necessary as cluster sizes grow.
> _Edit0: Appended patch information._
> h3. Patches
> ||Compare||Raw diff||Description||
> |[00_snitch_topology|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology]|[00_snitch_topology.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology.diff]|Adds
> some functionality to TokenMetadata to track which endpoints and racks exist
> in a DC.|
> |[01_calc_natural_endpoints|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints]|[01_calc_natural_endpoints.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints.diff]|Rewritten
> O(logN) implementation of calculateNaturalEndpoints using the topology
> information from the tokenMetadata.|
> ----
> _Note: These are branches managed with TopGit. If you are applying the patch
> output manually, you will either need to filter the TopGit metadata files
> (i.e. {{wget -O - <url> | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}),
> or remove them afterward ({{rm .topmsg .topdeps}})._
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira