[
https://issues.apache.org/jira/browse/CASSANDRA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905541#comment-15905541
]
Jason Brown commented on CASSANDRA-13289:
-----------------------------------------
Some thoughts:
- maybe only instantiate
{{AbstractWriteResponseHandler#responsesAndExpirations}} in
{{#setIdealCLResponseHandler()}}, and thus only create the {{AtomicInteger}}
when you know you are actually going to use it.
- if the ideal CL and the requested CL are the same, should we even bother
capturing metrics about it? I'm kinda mixed on it...
- what happens if the user mixes non-CAS consistency levels with CAS
consistency levels (or vice versa)? I think the behavior will be correct (we
won't inadvertently violate Paxos semantics), but the semantic difference
between CAS and non-CAS requests might not be meaningful. So perhaps ignore the
ideal CL if the CL types are different? wdyt?
- how will timed out message metrics be affected? We create an entry in
{{MessagingService#callbacks}} for each peer contacted for an operation (just
talking reads/mutations right now), and say the request CL is satisfied, but
the idealCL doesn't hear back from some nodes. In that case we'll increment the
timeouts, {{ConnectionMetrics.totalTimeouts.mark()}}, even though they weren't
explicitly part of the user's request. It might be confusing to users or
operators. I'm not sure how hard it is to code around that, or if it's
worthwhile. If we feel it's not, perhaps we just document it in the yaml that
"you may see higher than usual timeout counts". Thoughts?
- calling it "ideal consistency level" doesn't sound quite right. Maybe
something like "alternative" or "secondary" might work. It might be good to
point out that the emphasis here should be on discovering the latencies a
different CL would bring, and not necessarily the impact on data consistency
itself.
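
To make the first and third points concrete, here is a rough sketch, not the actual patch. The {{CL}} enum and {{Handler}} class are simplified, hypothetical stand-ins for {{ConsistencyLevel}} and {{AbstractWriteResponseHandler}}, and the rule of skipping tracking when the two CLs match or differ in type is only the suggestion from above, not settled behavior. The counter is allocated only inside {{setIdealCLResponseHandler()}}, so requests without ideal CL tracking never pay for it.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyIdealClSketch {
    // Hypothetical stand-in for Cassandra's ConsistencyLevel enum.
    public enum CL {
        LOCAL_QUORUM(false), EACH_QUORUM(false), SERIAL(true), LOCAL_SERIAL(true);

        public final boolean serial; // true for CAS (Paxos) consistency levels

        CL(boolean serial) { this.serial = serial; }
    }

    // Hypothetical, heavily simplified stand-in for AbstractWriteResponseHandler.
    public static class Handler {
        public final CL requested;
        public Handler idealHandler;                  // null unless ideal CL tracking is active
        public AtomicInteger responsesAndExpirations; // null until actually needed

        public Handler(CL requested) { this.requested = requested; }

        /** Enables ideal CL tracking if it is meaningful; returns whether it was enabled. */
        public boolean setIdealCLResponseHandler(CL ideal, int contactedReplicas) {
            // Skip tracking when there is no ideal CL, when it equals the requested CL,
            // or when one level is a CAS level and the other is not.
            if (ideal == null || ideal == requested || ideal.serial != requested.serial)
                return false;
            idealHandler = new Handler(ideal);
            // Allocate the counter only now that we know it will be used.
            idealHandler.responsesAndExpirations = new AtomicInteger(contactedReplicas);
            return true;
        }
    }
}
```

With this shape, a write at LOCAL_QUORUM with an ideal CL of EACH_QUORUM gets a delegate and a counter, while LOCAL_QUORUM paired with SERIAL (or with LOCAL_QUORUM itself) allocates nothing.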
> Make it possible to monitor an ideal consistency level separate from actual
> consistency level
> ---------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13289
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13289
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Fix For: 4.0
>
>
> As an operator there are several issues related to multi-datacenter
> replication and consistency you may want to have more information on from
> your production database.
> For instance, if your application writes at LOCAL_QUORUM, how often are those
> writes failing to achieve EACH_QUORUM at other data centers? If you failed
> your application over to one of those data centers roughly how inconsistent
> might it be given the number of writes that didn't propagate since the last
> incremental repair?
> You might also want to know roughly what the latency of writes would be if
> you switched to a different consistency level. For instance, you are writing
> at LOCAL_QUORUM and want to know what would happen if you switched to
> EACH_QUORUM.
> The proposed change is to allow an ideal_consistency_level to be specified in
> cassandra.yaml as well as get/set via JMX. If no ideal consistency level is
> specified, no additional tracking is done.
> If an ideal consistency level is specified then the
> {{AbstractWriteResponseHandler}} will contain a delegate WriteResponseHandler
> that tracks whether the ideal consistency level is met before a write times
> out. It also tracks the latency for achieving the ideal CL of successful
> writes.
> These two metrics would be reported on a per keyspace basis.
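
The delegate described in the quoted proposal could look roughly like the sketch below. Every name in it ({{KeyspaceMetrics}}, {{IdealClTracker}}, the counter fields) is hypothetical and not taken from the patch, and timestamps are passed in explicitly only to keep the example deterministic. It counts a write as achieving the ideal CL once enough acks arrive, records the latency at that moment, and counts a failure if every contacted replica has either responded or expired without reaching that threshold.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class IdealClTrackerSketch {
    /** Hypothetical per-keyspace counters; real metrics would likely be meters/timers. */
    public static class KeyspaceMetrics {
        public final AtomicLong idealCLAchieved = new AtomicLong();
        public final AtomicLong idealCLFailed = new AtomicLong();
        public final AtomicLong idealCLLatencyNanosTotal = new AtomicLong();
    }

    /** Hypothetical delegate that tracks one write against the ideal CL. */
    public static class IdealClTracker {
        private final KeyspaceMetrics metrics;
        private final long startNanos;
        private final AtomicInteger acksStillNeeded;         // acks required by the ideal CL
        private final AtomicInteger responsesAndExpirations; // replicas not yet settled

        public IdealClTracker(KeyspaceMetrics metrics, int blockFor, int contacted, long startNanos) {
            this.metrics = metrics;
            this.startNanos = startNanos;
            this.acksStillNeeded = new AtomicInteger(blockFor);
            this.responsesAndExpirations = new AtomicInteger(contacted);
        }

        /** A replica acknowledged the write. */
        public void onResponse(long nowNanos) {
            // Exactly one response brings the count to zero, so the latency is recorded once.
            if (acksStillNeeded.decrementAndGet() == 0) {
                metrics.idealCLAchieved.incrementAndGet();
                metrics.idealCLLatencyNanosTotal.addAndGet(nowNanos - startNanos);
            }
            settle();
        }

        /** A replica's callback expired without a response. */
        public void onExpiration() {
            settle();
        }

        private void settle() {
            // When the last replica settles and acks are still missing, the ideal CL was not met.
            if (responsesAndExpirations.decrementAndGet() == 0 && acksStillNeeded.get() > 0)
                metrics.idealCLFailed.incrementAndGet();
        }
    }
}
```

Because the tracker only updates its own counters, the user-visible outcome of the write still depends solely on the requested CL.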