[
https://issues.apache.org/jira/browse/CASSANDRA-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162686#comment-17162686
]
Stefan Miklosovic commented on CASSANDRA-14192:
-----------------------------------------------
FYI [~Bereng], this should be resolved automatically; a fair amount of work
went into CASSANDRA-15406 to make that happen, e.g. CASSANDRA-15694. If you
ever test CASSANDRA-15406 as a reviewer, it would be awesome if this one were
taken into consideration so we may close it "for free".
> netstats information mismatch between senders and receivers
> -----------------------------------------------------------
>
> Key: CASSANDRA-14192
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14192
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Observability
> Reporter: Jonathan Ballet
> Assignee: Vincent White
> Priority: Low
>
> When adding a new node to an existing cluster, the {{netstats}} command
> called while the node is joining shows different statistics on the node
> receiving the data and on the nodes sending the data.
> Receiving node:
> {code}
> Mode: JOINING
> Bootstrap 0a599bf0-01c5-11e8-a256-8d847377f816
> /172.20.13.184
> /172.20.30.7
> Receiving 433 files, 36.64 GiB total. Already received 88 files, 4.6 GiB total
> [...]
> /172.20.40.128
> /172.20.16.45
> Receiving 405 files, 38.3 GiB total. Already received 86 files, 6.02 GiB total
> [...]
> /172.20.9.63
> Read Repair Statistics:
> Attempted: 0
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool Name             Active   Pending   Completed   Dropped
> Large messages           n/a         0           0         0
> Small messages           n/a         0       11121         0
> Gossip messages          n/a         0       32690         0
> {code}
> Sending node 1:
> {code}
> Mode: NORMAL
> Bootstrap 0a599bf0-01c5-11e8-a256-8d847377f816
> /172.20.21.19
> Sending 433 files, 36.64 GiB total. Already sent 433 files, 36.64 GiB total
> [...]
> Read Repair Statistics:
> Attempted: 680832
> Mismatch (Blocking): 716
> Mismatch (Background): 279
> Pool Name             Active   Pending   Completed   Dropped
> Large messages           n/a         2      123307         4
> Small messages           n/a         2   637010302       509
> Gossip messages          n/a        23      798851     11535
> {code}
> Sending node 2:
> {code}
> Mode: NORMAL
> Bootstrap 0a599bf0-01c5-11e8-a256-8d847377f816
> /172.20.21.19
> Sending 405 files, 38.3 GiB total. Already sent 405 files, 38.3 GiB total
> [...]
> Read Repair Statistics:
> Attempted: 84967
> Mismatch (Blocking): 17568
> Mismatch (Background): 3078
> Pool Name             Active   Pending   Completed   Dropped
> Large messages           n/a         2       17818         2
> Small messages           n/a         2   126082304       507
> Gossip messages          n/a        34      202810     11725
> {code}
> In this case, the join process has been running for a while and the sending
> nodes seem to say they have already sent everything. This output stays the
> same for a while though (maybe ~15% of the total joining time).
> However, the receiving node's values stay like this once the sending nodes
> have sent everything, until it goes from this state to the {{NORMAL}} state
> (so there's visually no catching up from ~86 files to ~405 files, for
> example; it goes directly from the state shown above to {{NORMAL}}).
> This makes tracking the progress of the join process a bit more difficult
> than needed, because we need to compare and deduce the actual state from both
> the receiving node's values and the sending nodes' values, which are both
> "not correct" (the sending nodes say everything has been sent but stay in
> this state for a long time, while the receiving node says it still needs to
> download a lot of files/data before finishing).
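
To make the mismatch described in the issue easier to see, here is a minimal sketch that turns a netstats summary line into completion percentages. It is an illustration only and not part of Cassandra: the function name `progress`, the regular expression, and the unit table are assumptions based solely on the line format quoted in the issue, and each summary is assumed to sit on a single unwrapped line.

```python
import re

# Pattern for the per-peer summary line, modelled only on the output quoted
# in this issue; other Cassandra versions may format the line differently.
SUMMARY = re.compile(
    r"(?P<direction>Receiving|Sending) (?P<total_files>\d+) files, "
    r"(?P<total_size>[\d.]+) (?P<total_unit>\w+) total\. "
    r"Already (?:received|sent) (?P<done_files>\d+) files, "
    r"(?P<done_size>[\d.]+) (?P<done_unit>\w+) total"
)

# Assumed unit suffixes; only GiB actually appears in the quoted output.
UNITS = {"bytes": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def progress(line):
    """Return direction and completion percentages, or None if not a summary line."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    if m["total_unit"] not in UNITS or m["done_unit"] not in UNITS:
        return None
    total_files = int(m["total_files"])
    total_bytes = float(m["total_size"]) * UNITS[m["total_unit"]]
    if total_files == 0 or total_bytes == 0:
        return None
    done_bytes = float(m["done_size"]) * UNITS[m["done_unit"]]
    return {
        "direction": m["direction"],
        "files_pct": 100.0 * int(m["done_files"]) / total_files,
        "bytes_pct": 100.0 * done_bytes / total_bytes,
    }


if __name__ == "__main__":
    receiver = "Receiving 433 files, 36.64 GiB total. Already received 88 files, 4.6 GiB total"
    sender = "Sending 433 files, 36.64 GiB total. Already sent 433 files, 36.64 GiB total"
    for line in (receiver, sender):
        print(progress(line))
```

Run against the first stream quoted above, the receiver reports roughly 20% of the files and 13% of the bytes, while the matching sender reports 100% for both, which is the gap this issue describes.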
--
This message was sent by Atlassian Jira
(v8.3.4#803005)