Re: High CPU after read timeout

2017-07-14 Thread Vladimir Yudovin
I've created a JIRA ticket: https://issues.apache.org/jira/browse/CASSANDRA-13695



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Fri, 14 Jul 2017 07:23:57 -0400 Vladimir Yudovin vla...@winguzone.com wrote:

> If a client disconnects from a coordinator there is also no way for
> the replicas to know that the client was disconnected.

Got it.

> There are internal mechanisms that don't really have a concept of
> a timeout and where we would want it to never time out

Can such a timeout be passed to the executing thread? For read requests it
can be taken from the xxx_request_timeout_in_ms parameters.

Because right now one bad SELECT can put nodes under high load for a very
long time and, in certain situations, actually paralyze the cluster.

Best regards, Vladimir Yudovin,

Winguzone - Cloud Cassandra Hosting

On Fri, 14 Jul 2017 00:57:14 -0400 Chris Lohfink <clohfin...@gmail.com> wrote:

There is no mechanism for reads to time out once they have started. The
messaging service will drop a request that has already expired when it is
received on the ReadStage or RequestResponseStage. This is how it has always
operated, so it is not unique to 3.9. If a client disconnects from a
coordinator, there is also no way for the replicas that received a read
request from the coordinator to know that the client was disconnected.

It would be an interesting JIRA, but as a note, it will likely not be a quick
fix. There are internal mechanisms that don't really have a concept of a
timeout and where we would want it to never time out (e.g. a compaction,
reading system tables to fill metadata, repairs, etc.), and currently there
is no way of differentiating between them.

Chris
 

On Thu, Jul 13, 2017 at 10:53 PM, Vladimir Yudovin <vla...@winguzone.com> wrote:

> Hi,
>
> On Cassandra 3.9, I found that after an ALLOW FILTERING request running
> on a huge partition fails with "Cassandra timeout during read query at
> consistency ONE (1 responses were required but only 0 replica responded)",
> the nodes continue to consume CPU in ReadStage-N threads, as if they were
> still performing the search despite the failed request and even a
> disconnected client.
>
> Is this a known issue, or is it worth filing a JIRA?
>
> Best regards, Vladimir Yudovin,
>
> Winguzone - Cloud Cassandra Hosting

Re: High CPU after read timeout

2017-07-14 Thread Vladimir Yudovin
> If a client disconnects from a coordinator there is also no way for the
> replicas to know that the client was disconnected.

Got it.

> There are internal mechanisms that don't really have a concept of a timeout
> and where we would want it to never time out

Can such a timeout be passed to the executing thread? For read requests it
can be taken from the xxx_request_timeout_in_ms parameters.

Because right now one bad SELECT can put nodes under high load for a very
long time and, in certain situations, actually paralyze the cluster.
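
To make the suggestion concrete, here is a minimal sketch of what "passing the timeout to the executing thread" could look like: the scan loop receives a deadline and checks it cooperatively between rows. This is not Cassandra code. The names `DeadlineScan`, `countMatching`, and `ReadDeadlineExceeded` are hypothetical, and the deadline is assumed to be derived by the caller from the relevant request timeout.

```java
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical exception type for this sketch; not part of Cassandra. */
final class ReadDeadlineExceeded extends RuntimeException {
    ReadDeadlineExceeded(String message) {
        super(message);
    }
}

/** Hypothetical helper showing a cooperative deadline check inside a scan. */
final class DeadlineScan {
    /**
     * Counts rows matching the filter and aborts once the deadline has passed.
     * The clock is injected so the behaviour can be exercised deterministically.
     */
    static <T> int countMatching(Iterable<T> rows, Predicate<T> filter,
                                 long deadlineNanos, LongSupplier nanoClock) {
        int matches = 0;
        for (T row : rows) {
            // Cooperative check: the scanning thread itself must look at the clock.
            // Subtracting before comparing keeps this safe for System.nanoTime values.
            if (nanoClock.getAsLong() - deadlineNanos > 0) {
                throw new ReadDeadlineExceeded(
                        "scan abandoned after " + matches + " matches");
            }
            if (filter.test(row)) {
                matches++;
            }
        }
        return matches;
    }
}
```

A caller would compute the deadline once, for example as `System.nanoTime()` plus the timeout converted to nanoseconds, and pass `System::nanoTime` as the clock. Work with no meaningful timeout (compaction, repair) could pass `Long.MAX_VALUE` as the deadline paired with a clock fixed at zero, so the check never fires.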





Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Fri, 14 Jul 2017 00:57:14 -0400 Chris Lohfink clohfin...@gmail.com wrote:

There is no mechanism for reads to time out once they have started. The
messaging service will drop a request that has already expired when it is
received on the ReadStage or RequestResponseStage. This is how it has always
operated, so it is not unique to 3.9. If a client disconnects from a
coordinator, there is also no way for the replicas that received a read
request from the coordinator to know that the client was disconnected.

It would be an interesting JIRA, but as a note, it will likely not be a quick
fix. There are internal mechanisms that don't really have a concept of a
timeout and where we would want it to never time out (e.g. a compaction,
reading system tables to fill metadata, repairs, etc.), and currently there
is no way of differentiating between them.

Chris

 

On Thu, Jul 13, 2017 at 10:53 PM, Vladimir Yudovin vla...@winguzone.com wrote:

> Hi,
>
> On Cassandra 3.9, I found that after an ALLOW FILTERING request running
> on a huge partition fails with "Cassandra timeout during read query at
> consistency ONE (1 responses were required but only 0 replica responded)",
> the nodes continue to consume CPU in ReadStage-N threads, as if they were
> still performing the search despite the failed request and even a
> disconnected client.
>
> Is this a known issue, or is it worth filing a JIRA?
>
> Best regards, Vladimir Yudovin,
>
> Winguzone - Cloud Cassandra Hosting







Re: High CPU after read timeout

2017-07-13 Thread Chris Lohfink
There is no mechanism for reads to time out once they have started. The
messaging service will drop a request that has already expired when it is
received on the ReadStage or RequestResponseStage. This is how it has always
operated, so it is not unique to 3.9. If a client disconnects from a
coordinator, there is also no way for the replicas that received a read
request from the coordinator to know that the client was disconnected.

It would be an interesting JIRA, but as a note, it will likely not be a quick
fix. There are internal mechanisms that don't really have a concept of a
timeout and where we would want it to never time out (e.g. a compaction,
reading system tables to fill metadata, repairs, etc.), and currently there
is no way of differentiating between them.
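
The behaviour described above can be pictured with a toy model: the timeout is consulted only when a request is taken off the stage's queue, and never again once execution has begun. This is a simplified sketch, not Cassandra's actual code. `QueuedRead` and `DropOnDequeueStage` are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for a queued read request; not a Cassandra class. */
final class QueuedRead {
    final String query;
    final long enqueuedAtMillis;

    QueuedRead(String query, long enqueuedAtMillis) {
        this.query = query;
        this.enqueuedAtMillis = enqueuedAtMillis;
    }
}

/** Toy model of a stage that checks the timeout only at dequeue time. */
final class DropOnDequeueStage {
    private final Queue<QueuedRead> queue = new ArrayDeque<>();
    private final long timeoutMillis;
    int dropped = 0;
    int executed = 0;

    DropOnDequeueStage(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void submit(QueuedRead read) {
        queue.add(read);
    }

    /** Processes everything currently queued, as seen at the given clock value. */
    void drain(long nowMillis) {
        QueuedRead read;
        while ((read = queue.poll()) != null) {
            if (nowMillis - read.enqueuedAtMillis > timeoutMillis) {
                dropped++;   // expired while waiting in the queue: never starts
            } else {
                executed++;  // started: in this model nothing checks the clock again
            }
        }
    }
}
```

In this model a request that waited too long is dropped cheaply, while a request that has started runs to completion no matter how long it takes, which matches the symptom reported in this thread.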

Chris

On Thu, Jul 13, 2017 at 10:53 PM, Vladimir Yudovin wrote:

> Hi,
>
> On Cassandra 3.9, I found that after an ALLOW FILTERING request running
> on a huge partition fails with "Cassandra timeout during read query at
> consistency ONE (1 responses were required but only 0 replica responded)",
> the nodes continue to consume CPU in ReadStage-N threads, as if they were
> still performing the search despite the failed request and even a
> disconnected client.
>
> Is this a known issue, or is it worth filing a JIRA?
>
> Best regards, Vladimir Yudovin,
>
> Winguzone - Cloud Cassandra Hosting


High CPU after read timeout

2017-07-13 Thread Vladimir Yudovin
Hi,

On Cassandra 3.9, I found that after an ALLOW FILTERING request running on a
huge partition fails with "Cassandra timeout during read query at consistency
ONE (1 responses were required but only 0 replica responded)", the nodes
continue to consume CPU in ReadStage-N threads, as if they were still
performing the search despite the failed request and even a disconnected
client.

Is this a known issue, or is it worth filing a JIRA?
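
For anyone wanting to confirm the same symptom, the sketch below shows one way to read per-thread CPU time for threads whose names start with a given prefix, using the standard `java.lang.management` API. It is only an illustration: `StageCpu` and `cpuNanosByThread` are hypothetical names, and as written the code inspects the JVM it runs in, so observing a Cassandra node this way would additionally require a remote JMX connection, which is not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; reports CPU time for threads of the current JVM only. */
final class StageCpu {
    /** Returns CPU time in nanoseconds for each live thread whose name starts with the prefix. */
    static Map<String, Long> cpuNanosByThread(String namePrefix) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return result; // this JVM cannot measure per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            if (info == null) {
                continue; // thread exited between the two calls
            }
            if (info.getThreadName().startsWith(namePrefix)) {
                // Returns -1 if the thread has died or measurement is disabled.
                long cpu = threads.getThreadCpuTime(id);
                if (cpu >= 0) {
                    result.put(info.getThreadName(), cpu);
                }
            }
        }
        return result;
    }
}
```

Calling `StageCpu.cpuNanosByThread("ReadStage-")` twice, a few seconds apart, and comparing the values would show which threads are still burning CPU after the client has received its timeout.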





Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting