[
https://issues.apache.org/jira/browse/CASSANDRA-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeremy Hanna updated CASSANDRA-12791:
-------------------------------------
Component/s: (was: Materialized Views)
> MessageIn logic to determine if the message is cross-node is wrong
> ------------------------------------------------------------------
>
> Key: CASSANDRA-12791
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12791
> Project: Cassandra
> Issue Type: Bug
> Components: Observability, Streaming and Messaging
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Priority: Minor
> Fix For: 3.10
>
>
> {{MessageIn}} has the following code to read the 'creation time' of the
> message on the receiving side:
> {noformat}
> public static ConstructionTime readTimestamp(InetAddress from, DataInputPlus
> input, long timestamp) throws IOException
> {
> // make sure to readInt, even if cross_node_to is not enabled
> int partial = input.readInt();
> long crossNodeTimestamp = (timestamp & 0xFFFFFFFF00000000L) | (((partial
> & 0xFFFFFFFFL) << 2) >> 2);
> if (timestamp > crossNodeTimestamp)
> {
> MessagingService.instance().metrics.addTimeTaken(from, timestamp -
> crossNodeTimestamp);
> }
> if(DatabaseDescriptor.hasCrossNodeTimeout())
> {
> return new ConstructionTime(crossNodeTimestamp, timestamp !=
> crossNodeTimestamp);
> }
> else
> {
> return new ConstructionTime();
> }
> }
> {noformat}
> where {{timestamp}} is really the local time on the receiving node when
> calling that method.
> The incorrect part, I believe, is the {{timestamp != crossNodeTimestamp}}
> used to set the {{isCrossNode}} field of {{ConstructionTime}}. A first
> problem is that this will basically always be {{true}}: for it to be
> {{false}}, we'd need the low-bytes of the timestamp taken on the sending node
> to coincide exactly with the ones taken on the receiving side, which is
> _very_ unlikely. It is also a relatively meaningless test: having that test
> be {{false}} basically means the lack of clock sync between the 2 nodes is
> exactly the time the 2 calls to {{System.currentTimeMillis()}} (on sender and
> receiver), which is definitively not what we care about.
> What the result of this test is used for is to determine if the message was
> crossNode or local. It's used to increment different metrics (we separate
> metric local versus crossNode dropped messages) in {{MessagingService}} for
> instance. And that's where this is kind of a bug: not only the {{timestamp !=
> crossNodeTimestamp}}, but if {{DatabaseDescriptor.hasCrossNodeTimeout()}}, we
> *always* have this {{isCrossNode}} false, which means we'll never increment
> the "cross-node dropped messages" metric, which is imo unexpected.
> That is, it is true that if {{DatabaseDescriptor.hasCrossNodeTimeout() ==
> false}}, then we end using the receiver side timestamp to timeout messages,
> and so you end up only dropping messages that timeout locally. And _in that
> sense_, always incrementing the "locally" dropped messages metric is not
> completely illogical. But I doubt most users are aware of those pretty
> specific nuance when looking at the related metrics, and I'm relatively sure
> users expect a metrics named {{droppedCrossNodeTimeout}} to actually count
> cross-node messages by default (keep in mind that
> {{DatabaseDescriptor.hasCrossNodeTimeout()}} is actually false by default).
> Anyway, to sum it up I suggest that the following change should be done:
> # the {{timestamp != crossNodeTimestamp}} test is definitively not what we
> want. We should at a minimum just replace it to {{true}} as that's basically
> what it ends up being except for very rare and arguably random cases.
> # given how the {{ConstructionTime.isCrossNode}} is used, I suggest that we
> really want it to mean if the message has shipped cross-node, not just be a
> synonymous of {{DatabaseDescriptor.hasCrossNodeTimeout()}}. It should be
> whether the message shipped cross-node, i.e. whether {{from ==
> BroadcastAdress()}} or not.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]