Re: Unexplainable spikes of requests latency

2018-12-13 Thread Nitan Kainth
Latency spikes and digest mismatches may not line up in time, but a mismatch
means the data is out of sync, and that can cause read latency.
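
To illustrate why a mismatch can add read latency, here is a toy Python model
of a digest read. It is a sketch, not Cassandra code: the function names and
the (value, timestamp) representation are made up for illustration. It assumes
the coordinator asks one replica for full data and the others for a digest,
and needs a second round of full reads when the digests disagree.

```python
import hashlib


def digest(value):
    """Hash of a replica's data; stands in for the digest a replica returns."""
    return hashlib.md5(repr(value).encode()).hexdigest()


def quorum_read(replicas):
    """Toy model of a digest read (hypothetical helper, not a Cassandra API).

    replicas: list of (value, write_timestamp) tuples, one per contacted replica.
    Returns (result, round_trips).
    """
    data = replicas[0]                           # full data from one replica
    digests = [digest(r) for r in replicas[1:]]  # digests from the others
    if all(d == digest(data) for d in digests):
        return data, 1                           # replicas agree: one round trip
    # Mismatch: fetch full data from every contacted replica, keep the newest.
    newest = max(replicas, key=lambda r: r[1])
    return newest, 2
```

In this model a read over in-sync replicas finishes in one round trip, while a
read over diverged replicas pays for a second one.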



Re: Unexplainable spikes of requests latency

2018-12-13 Thread Виталий Савкин
Good catch. We have run repairs a few times but don't do so on a regular basis.
However, I found no relationship between the number of
DigestMismatchExceptions and the latency spikes (see the attached graphs for
an example).
One important point I didn't mention in the original mail is that all
requests (both reads and writes) use CL=LOCAL_QUORUM.
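
A check like the one described above can also be done numerically rather than
by eye. The sketch below computes a Pearson correlation coefficient between
two equally long series, for example per-minute DigestMismatchException counts
and per-minute 99th-percentile latency. The function name and the choice of
Pearson correlation are assumptions for illustration; the thread does not say
how the graphs were compared.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information about the other
    return cov / sqrt(var_x * var_y)
```

A value close to 0 would be consistent with the observation that the two
series do not move together; values near 1 or -1 would suggest otherwise.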


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org

Re: Unexplainable spikes of requests latency

2018-12-12 Thread Nitan Kainth
DigestMismatchExceptions could be due to data being out of sync. Are you
running repairs?



Unexplainable spikes of requests latency

2018-12-12 Thread Виталий Савкин
Hi everyone!

A few times a day I see spikes in request latency on my Cassandra clients.
Usually 99thPercentile is below 100 ms, but during these spikes it grows
above 1 second.
The type of request doesn't matter: different services are affected, and I
found that three absolutely identical requests (to the same partition key,
issued within a three-second interval) completed in 1 ms, 30 ms and 1100 ms.
I also found no correlation between the spikes and load patterns. G1 GC does
not report any significant (>50 ms) pauses.
A few suspicious things:

   - nodetool shows that there are dropped READs
   - there are DigestMismatchExceptions in logs
   - in tracing events I see that the event "Executing single-partition query
   on *" sometimes happens right after "READ message received from /*.*.*.*"
   (in less than 100 microseconds) and sometimes after hundreds of
   milliseconds
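
The gap described in the last item can be measured per replica from the trace
events. The sketch below assumes the events have been exported as a list of
dicts with `activity`, `source` and `source_elapsed` (microseconds) keys,
mirroring the columns shown in tracing output; the function name and the
export step are hypothetical.

```python
def read_pickup_delays(events):
    """Microseconds between 'READ message received' and
    'Executing single-partition query' on each source node.

    events: list of dicts with 'activity', 'source' and 'source_elapsed'
    keys, in the order they were recorded (assumed input format).
    """
    received = {}  # source -> elapsed time when the READ message arrived
    delays = {}    # source -> gap until query execution started
    for ev in events:
        src = ev["source"]
        activity = ev["activity"]
        if activity.startswith("READ message received"):
            received[src] = ev["source_elapsed"]
        elif activity.startswith("Executing single-partition query") and src in received:
            delays[src] = ev["source_elapsed"] - received[src]
    return delays
```

Running this over many traces and comparing the delays for fast and slow
requests would show whether the extra time is concentrated in this step.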

My cluster runs on six c5.2xlarge Amazon instances, and data is stored on
EBS. The Cassandra version is 3.10.
Any help in explaining this behavior is appreciated. I'm glad to share more
details if needed.

Thanks,
Vitaliy Savkin.