On 2017-05-16 08:53 (-0700), Nitan Kainth <ni...@bamlabs.com> wrote: 
> Hi,
> 
> We see read timeouts intermittently. Mostly after they have occurred. 
> Timeouts are not consistent and do not occur in the hundreds at any one moment. 
> 
> 1. Is a read timeout considered a Dropped Mutation?

No, a dropped mutation is a failed write, not a failed read.

> 2. What is best way to nail down exact issue of scattered timeouts?
> 

First, be aware that tombstone overwhelming exceptions also get propagated as 
read timeouts - you should check your logs for warnings about tombstone 
problems.
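
If grepping by hand gets tedious, something like the following rough sketch can 
pull the relevant lines out of a log. It assumes each log line carries a WARN 
or ERROR level token and mentions "tombstone" somewhere in the message; the 
exact wording differs between versions, and the function name and sample lines 
are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumption: a tombstone problem shows up as a WARN or ERROR line whose
# message contains the word "tombstone". Adjust the pattern to your version.
TOMBSTONE_LINE = re.compile(r"\b(WARN|ERROR)\b.*tombstone", re.IGNORECASE)


def find_tombstone_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like tombstone warnings or errors."""
    return [line.rstrip("\n") for line in lines if TOMBSTONE_LINE.search(line)]


# Made-up sample lines; in practice, pass an open log file instead, since a
# file object is also an iterable of lines.
sample_log = [
    "WARN  [ReadStage-2] Read 10 live rows and 5000 tombstone cells\n",
    "INFO  [CompactionExecutor-1] Compacted 4 sstables\n",
    "ERROR [ReadStage-1] Scanned over 100000 tombstones; query aborted\n",
]

for hit in find_tombstone_lines(sample_log):
    print(hit)
```

Lining up the timestamps of the matching lines with the times your application 
saw timeouts should tell you quickly whether tombstones are involved.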

Second, you need to identify the slow queries somehow. You have a few options:

1) If you happen to be running 3.10 or newer, turn on the slow query log 
(https://issues.apache.org/jira/browse/CASSANDRA-12403). 3.10 is the newest 
release, and may not be fully stable, so you probably don't want to upgrade to 
3.10 JUST to get this feature. But if you're already on that version, 
definitely use that tool.

2) Some drivers have a log-slow-queries feature. Consider turning that on, and 
let the application side log the slow queries. It's possible that you have a 
bad partition or two, and you may see patterns there.

3) Probabilistic tracing - you can tell Cassandra to trace 1% of your queries, 
and hope you catch a timeout. It'll be unpleasant to track down - this is 
really a last-resort type option, because you'll need to dig through the trace 
tables to find the outliers after the fact.


