[
https://issues.apache.org/jira/browse/CASSANDRA-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406163#comment-15406163
]
Tyler Hobbs commented on CASSANDRA-12347:
-----------------------------------------
It's been a while since we discussed this, so I apologize if my understanding
of things is rusty. I definitely see the value in CASSANDRA-12346, and think
it's an excellent step forward. However, I'm not as convinced yet about this
ticket. Is it fair to say that these are the two main reasons for using
broadcast trees?
# Minimize the amount of time it takes for changes to be disseminated
throughout the cluster
# Minimize the amount of redundant messaging, thereby minimizing the amount of
gossip traffic
In the past, we discussed addressing #2 by reserving Gossip for infrequent
changes (schema, topology) and basing the failure detector on coordinated
requests instead of gossip heartbeats. That, combined with removing "severity"
from Gossip (which we recently started ignoring anyway), would make Gossip
traffic quite low already, leaving #2 as a pretty minor concern.
Regarding #1, assuming that we only use Gossip for relatively infrequent
changes, shouldn't broadcasting to every node in the "active view" (in
HyParView terms, CASSANDRA-12346) achieve rapid dissemination on its own?
Thicket is considerably more complex than just HyParView, so I want to make
sure that we're getting good enough benefits out of the additional complexity.
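To make the question about #1 concrete, here is a toy simulation (my own sketch, not Cassandra or HyParView code; all class and method names are invented) of plain eager-push flooding over symmetric active views: every node that receives a message forwards it to its entire active view. With an active view of roughly log(N) peers, this alone reaches the whole cluster in a handful of rounds, which is the basis for the question above.

```java
import java.util.*;

// Toy sketch: flood a message over HyParView-style symmetric active views
// and count the rounds needed to reach every node. This is NOT Cassandra
// code; the overlay is a random graph, whereas HyParView maintains its
// active views with join/shuffle/repair protocols.
public class ActiveViewFlood {
    public static int roundsToReachAll(int nodes, int fanout, long seed) {
        Random rng = new Random(seed);
        // Build a random symmetric overlay: each node links to at least
        // `fanout` random peers.
        List<Set<Integer>> activeView = new ArrayList<>();
        for (int i = 0; i < nodes; i++) activeView.add(new HashSet<>());
        for (int i = 0; i < nodes; i++) {
            while (activeView.get(i).size() < fanout) {
                int peer = rng.nextInt(nodes);
                if (peer == i) continue;
                activeView.get(i).add(peer);
                activeView.get(peer).add(i); // active-view links are symmetric
            }
        }
        // Eager push: each newly infected node forwards to its whole view.
        Set<Integer> infected = new HashSet<>(List.of(0));
        Set<Integer> frontier = new HashSet<>(List.of(0));
        int rounds = 0;
        while (infected.size() < nodes && !frontier.isEmpty()) {
            Set<Integer> next = new HashSet<>();
            for (int node : frontier)
                for (int peer : activeView.get(node))
                    if (infected.add(peer)) next.add(peer);
            frontier = next;
            rounds++;
        }
        // -1 means the random overlay happened to be disconnected.
        return infected.size() == nodes ? rounds : -1;
    }

    public static void main(String[] args) {
        System.out.println(roundsToReachAll(1000, 10, 42));
    }
}
```

The trade-off Thicket targets is the redundancy: with views of size f, flooding delivers each message up to f times per node, whereas a tree delivers it once.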
> Gossip 2.0 - broadcast tree for data dissemination
> --------------------------------------------------
>
> Key: CASSANDRA-12347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12347
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jason Brown
>
> Description: A broadcast tree (spanning tree) allows an originating node to
> efficiently send out updates to all of the peers in the cluster by
> constructing a balanced, self-healing tree based upon the view it gets from
> the peer sampling service (CASSANDRA-12346).
> I propose we use an algorithm based on the [Thicket
> paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which
> describes a dynamic, self-healing broadcast tree. When a given node needs to
> send out a message, it dynamically builds a tree rooted at itself; across the
> cluster, this gives us a distinct tree rooted at every node. The trees, of
> course, would be reusable until the cluster configuration changes or failures
> are detected (by the mechanism
> described in the paper). Additionally, Thicket includes a mechanism for
> load-balancing the trees such that nodes spread out the work amongst
> themselves.
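As a simplified illustration of the "one tree per root" idea in the description (my own sketch, not the Thicket algorithm: Thicket grows its trees lazily during the first broadcast and load-balances interior positions, while this just does a BFS over a fixed overlay; all names are invented):

```java
import java.util.*;

// Sketch: derive a broadcast (spanning) tree rooted at a given node by BFS
// over the peer-sampling overlay. The node that first delivers the message
// to a peer becomes that peer's parent, so different roots yield different
// trees over the same overlay.
public class BroadcastTreeSketch {
    /** parent[i] = parent of node i in the tree rooted at `root`;
     *  -1 for the root itself, -2 if unreachable. */
    public static int[] treeRootedAt(int root, List<Set<Integer>> overlay) {
        int n = overlay.size();
        int[] parent = new int[n];
        Arrays.fill(parent, -2);              // -2 = not yet reached
        parent[root] = -1;
        Deque<Integer> queue = new ArrayDeque<>(List.of(root));
        while (!queue.isEmpty()) {
            int node = queue.poll();
            for (int peer : overlay.get(node)) {
                if (parent[peer] == -2) {     // first sender becomes parent
                    parent[peer] = node;
                    queue.add(peer);
                }
            }
        }
        return parent;
    }

    public static void main(String[] args) {
        // Tiny 4-node ring overlay: 0-1, 1-2, 2-3, 3-0.
        List<Set<Integer>> ring = List.of(
            Set.of(1, 3), Set.of(0, 2), Set.of(1, 3), Set.of(2, 0));
        System.out.println(Arrays.toString(treeRootedAt(0, ring)));
        System.out.println(Arrays.toString(treeRootedAt(2, ring)));
    }
}
```

Each message then travels each tree edge exactly once, which is where the reduction in redundant traffic over plain flooding comes from.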
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)