[ 
https://issues.apache.org/jira/browse/CASSANDRA-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331190#comment-16331190
 ] 

Jason Brown commented on CASSANDRA-14174:
-----------------------------------------

bq. I'd want to double-check this before I'd call it a formal review

lol - I should probably post a patch, as well :D. I wanted to get your initial 
thoughts, to make sure they line up with mine. I'll get something together 
within the hour.

> Remove GossipDigestSynVerbHandler#doSort()
> ------------------------------------------
>
>                 Key: CASSANDRA-14174
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14174
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.x
>
>
> I have personally tripped up on this function a couple of times over the 
> years, believing that it contributes to bugs in some way or another. While I 
> have not found that (necessarily!) to be the case, I feel this function is 
> completely useless in the grand scope of things.
> Going back through the mists of time (that is, {{git log}}), it appears this 
> function was part of the original code drop from Facebook when they open 
> sourced Cassandra. Looking at the {{#doSort()}} method, all it does is sort 
> the incoming list of {{GossipDigest}}s by the difference between the remote 
> node's maxValue for a given peer and the local node's maxValue.
> The only universe where this is actually an optimization is if you go back and 
> read the [Scuttlebutt 
> paper|https://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf] (upon which 
> Cassandra's Gossip anti-entropy reconciliation is based). The end of section 3.2 
> describes ordering of the incoming digests such that, in the case where you 
> do not return all of the differences (because you are optimizing for the 
> return message size), you can gather the differences for the peers which are 
> most out of sync. The ordering implemented in Cassandra is the second 
> ordering described in the paper, called "scuttle depth".
> As we always send all differences between two nodes (message size be damned), 
> this optimization, borrowed from the paper, is largely irrelevant for 
> Cassandra's purposes.
> Thus, I propose we remove this method for the following gains:
>  - less garbage created
>  - less CPU (sure, it's mostly trivial; see next point)
>  - less time spent on unnecessary functionality on the *single threaded* 
> gossip stage.
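
For illustration only, here is a minimal sketch of the ordering the description attributes to {{#doSort()}}: digests sorted so that the peers with the largest gap between the remote and local max version come first. It is not the Cassandra code. The names {{Digest}}, {{sortByVersionGap}} and the plain map of local versions are hypothetical stand-ins, and treating an unknown peer as local version 0 is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class DigestSortSketch {
    /** Hypothetical stand-in for a gossip digest: a peer and the sender's max version for it. */
    static final class Digest {
        final String endpoint;
        final int maxVersion;

        Digest(String endpoint, int maxVersion) {
            this.endpoint = endpoint;
            this.maxVersion = maxVersion;
        }
    }

    /**
     * Returns a copy of the remote digests ordered by descending
     * |localMaxVersion - remoteMaxVersion|, i.e. most out-of-sync peers first.
     * Peers unknown locally are treated as version 0 (an assumption of this sketch).
     */
    static List<Digest> sortByVersionGap(List<Digest> remote, Map<String, Integer> localMaxVersions) {
        List<Digest> sorted = new ArrayList<>(remote);
        sorted.sort(Comparator.comparingInt(
                (Digest d) -> Math.abs(localMaxVersions.getOrDefault(d.endpoint, 0) - d.maxVersion))
                .reversed());
        return sorted;
    }
}
```

Such an ordering only pays off if the receiver then truncates its reply to the first few entries. If every difference is sent regardless, the order in which the digests are examined does not change what goes on the wire.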



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
