Re: All time blocked in nodetool tpstats

2019-04-11 Thread Paul Chandler
Hi Abdul,

That all depends on the cluster, so it really is best to experiment.

By adding more threads you will use more of the system resources, so before you 
start you need to know whether there is spare capacity in CPU usage and disk 
throughput. If there is spare capacity, then increase the threads in steps; I 
normally go in steps of 32, but that is based on the size of the machines I 
normally work with. 
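
As an illustration only, one such step might look like this in cassandra.yaml 
(the baseline values here are assumptions, so substitute whatever your nodes 
currently run, and restart each node for the change to take effect):

    # cassandra.yaml - one step of 32 from an assumed baseline of 64 / 128
    concurrent_reads: 96      # raise only while CPU and disk have headroom
    concurrent_writes: 160    # step writes the same way if writes are dropped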

But as Anthony said, if it is a read-heavy system, then it could easily be 
tombstones or garbage collection. 

Thanks 

Paul Chandler

> On 11 Apr 2019, at 03:57, Abdul Patel  wrote:
> 
> Do we have any recommendations on concurrent reads and writes settings?
> Mine is an 18-node, 3-DC cluster with 20-core CPUs.
> 
> On Wednesday, April 10, 2019, Anthony Grasso wrote:
> Hi Abdul,
> 
> Usually we see no noticeable improvement from tuning concurrent_reads and 
> concurrent_writes above 128. I generally try to keep concurrent_reads no 
> higher than 64 and concurrent_writes no higher than 128. Increasing the 
> values beyond that, you might start running into issues where the kernel IO 
> scheduler and/or the disk become saturated. As Paul mentioned, it will depend 
> on the size of your nodes though.
> 
> If the client is timing out, it is likely that the node selected as the 
> coordinator for the read has resource contention somewhere. The root cause 
> is usually a combination of things, though. As Paul mentioned, one of the 
> issues could be the query design. It is worth investigating whether a 
> particular read query is timing out.
> 
> I would also inspect the Cassandra logs and garbage collection logs on the 
> node where you are seeing the timeouts. The things to look out for are high 
> garbage collection frequency, long garbage collection pauses, and high 
> tombstone read warnings.
> 
> Regards,
> Anthony
> 
> On Thu, 11 Apr 2019 at 06:01, Abdul Patel wrote:
> Yes, the queries are all select queries, as it is more of a read-intensive app.
> Last night I rebooted the cluster and today they are fine (I know it is 
> temporary), as I still see All time blocked values.
> I am thinking of increasing concurrent reads and writes to 256 and native 
> transport threads to 256 and seeing how it performs.
> 
> On Wednesday, April 10, 2019, Paul Chandler wrote:
> Hi Abdul,
> 
> When I have seen dropped messages, I normally double check to ensure the node 
> is not CPU bound. 
> 
> If you have a high CPU idle value, then it is likely that tuning the thread 
> counts will help.
> 
> I normally start with concurrent_reads and concurrent_writes, so in your case, 
> as reads are being dropped, increase concurrent_reads. I normally change it 
> to 96 to start with, but it will depend on the size of your nodes.
> 
> Otherwise it might be badly designed queries; have you investigated which 
> queries are producing the client timeouts?
> 
> Regards 
> 
> Paul Chandler 
> 
> 
> 
> > On 9 Apr 2019, at 18:58, Abdul Patel wrote:
> > 
> > Hi,
> > 
> > My nodetool tpstats is showing high All time blocked numbers and also read 
> > dropped messages at 400.
> > Clients are experiencing high timeouts.
> > I checked a few online forums; they recommend increasing 
> > native_transport_max_threads.
> > As of now it is commented out with a value of 128.
> > Is it advisable to increase this, and can this fix the timeout issue?
> > 
> 
> 



Re: All time blocked in nodetool tpstats

2019-04-10 Thread Abdul Patel
Do we have any recommendations on concurrent reads and writes settings?
Mine is an 18-node, 3-DC cluster with 20-core CPUs.

On Wednesday, April 10, 2019, Anthony Grasso 
wrote:

> Hi Abdul,
>
> Usually we see no noticeable improvement from tuning concurrent_reads and
> concurrent_writes above 128. I generally try to keep concurrent_reads no
> higher than 64 and concurrent_writes no higher than 128. Increasing the
> values beyond that, you might start running into issues where the kernel
> IO scheduler and/or the disk become saturated. As Paul mentioned, it will
> depend on the size of your nodes though.
>
> If the client is timing out, it is likely that the node selected as the
> coordinator for the read has resource contention somewhere. The root cause
> is usually a combination of things, though. As Paul mentioned, one of the
> issues could be the query design. It is worth investigating whether a
> particular read query is timing out.
>
> I would also inspect the Cassandra logs and garbage collection logs on the
> node where you are seeing the timeouts. The things to look out for are high
> garbage collection frequency, long garbage collection pauses, and high
> tombstone read warnings.
>
> Regards,
> Anthony
>
> On Thu, 11 Apr 2019 at 06:01, Abdul Patel  wrote:
>
>> Yes, the queries are all select queries, as it is more of a read-intensive
>> app.
>> Last night I rebooted the cluster and today they are fine (I know it is
>> temporary), as I still see All time blocked values.
>> I am thinking of increasing concurrent reads and writes to 256 and native
>> transport threads to 256 and seeing how it performs.
>>
>> On Wednesday, April 10, 2019, Paul Chandler  wrote:
>>
>>> Hi Abdul,
>>>
>>> When I have seen dropped messages, I normally double check to ensure the
>>> node is not CPU bound.
>>>
>>> If you have a high CPU idle value, then it is likely that tuning the
>>> thread counts will help.
>>>
>>> I normally start with concurrent_reads and concurrent_writes, so in your
>>> case, as reads are being dropped, increase concurrent_reads. I normally
>>> change it to 96 to start with, but it will depend on the size of your nodes.
>>>
>>> Otherwise it might be badly designed queries; have you investigated
>>> which queries are producing the client timeouts?
>>>
>>> Regards
>>>
>>> Paul Chandler
>>>
>>>
>>>
>>> > On 9 Apr 2019, at 18:58, Abdul Patel  wrote:
>>> >
>>> > Hi,
>>> >
>>> > My nodetool tpstats is showing high All time blocked numbers and also
>>> > read dropped messages at 400.
>>> > Clients are experiencing high timeouts.
>>> > I checked a few online forums; they recommend increasing
>>> > native_transport_max_threads.
>>> > As of now it is commented out with a value of 128.
>>> > Is it advisable to increase this, and can this fix the timeout issue?
>>> >
>>>
>>>


Re: All time blocked in nodetool tpstats

2019-04-10 Thread Anthony Grasso
Hi Abdul,

Usually we see no noticeable improvement from tuning concurrent_reads and
concurrent_writes above 128. I generally try to keep concurrent_reads no
higher than 64 and concurrent_writes no higher than 128. Increasing the
values beyond that, you might start running into issues where the kernel IO
scheduler and/or the disk become saturated. As Paul mentioned, it will
depend on the size of your nodes though.
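
A quick way to check for that kind of saturation is iostat on the data disks
(a sketch only; device names and the exact columns vary with your sysstat
version):

    # extended device stats, 3 samples at 5 second intervals; %util sitting
    # near 100, or steadily climbing await / r_await, suggests the disk is
    # the bottleneck
    iostat -x 5 3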

If the client is timing out, it is likely that the node selected as the
coordinator for the read has resource contention somewhere. The root cause
is usually a combination of things, though. As Paul mentioned, one of the
issues could be the query design. It is worth investigating whether a
particular read query is timing out.

I would also inspect the Cassandra logs and garbage collection logs on the
node where you are seeing the timeouts. The things to look out for are high
garbage collection frequency, long garbage collection pauses, and high
tombstone read warnings.
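
For example, on a typical package install (adjust the log path to your
environment):

    # GC pauses Cassandra noticed itself - check frequency and duration
    grep GCInspector /var/log/cassandra/system.log | tail -20

    # tombstone warnings emitted while serving reads
    grep -i tombstone /var/log/cassandra/system.log | tail -20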

Regards,
Anthony

On Thu, 11 Apr 2019 at 06:01, Abdul Patel  wrote:

> Yes, the queries are all select queries, as it is more of a read-intensive
> app.
> Last night I rebooted the cluster and today they are fine (I know it is
> temporary), as I still see All time blocked values.
> I am thinking of increasing concurrent reads and writes to 256 and native
> transport threads to 256 and seeing how it performs.
>
> On Wednesday, April 10, 2019, Paul Chandler  wrote:
>
>> Hi Abdul,
>>
>> When I have seen dropped messages, I normally double check to ensure the
>> node is not CPU bound.
>>
>> If you have a high CPU idle value, then it is likely that tuning the
>> thread counts will help.
>>
>> I normally start with concurrent_reads and concurrent_writes, so in your
>> case, as reads are being dropped, increase concurrent_reads. I normally
>> change it to 96 to start with, but it will depend on the size of your nodes.
>>
>> Otherwise it might be badly designed queries; have you investigated which
>> queries are producing the client timeouts?
>>
>> Regards
>>
>> Paul Chandler
>>
>>
>>
>> > On 9 Apr 2019, at 18:58, Abdul Patel  wrote:
>> >
>> > Hi,
>> >
>> > My nodetool tpstats is showing high All time blocked numbers and also
>> > read dropped messages at 400.
>> > Clients are experiencing high timeouts.
>> > I checked a few online forums; they recommend increasing
>> > native_transport_max_threads.
>> > As of now it is commented out with a value of 128.
>> > Is it advisable to increase this, and can this fix the timeout issue?
>> >
>>
>>


Re: All time blocked in nodetool tpstats

2019-04-10 Thread Abdul Patel
Yes, the queries are all select queries, as it is more of a read-intensive
app.
Last night I rebooted the cluster and today they are fine (I know it is
temporary), as I still see All time blocked values.
I am thinking of increasing concurrent reads and writes to 256 and native
transport threads to 256 and seeing how it performs.
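
I will recheck the pools afterwards with something like this (illustrative;
the columns of interest are Blocked and All time blocked):

    nodetool tpstats | grep -E 'Pool Name|ReadStage|Native-Transport-Requests'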

On Wednesday, April 10, 2019, Paul Chandler  wrote:

> Hi Abdul,
>
> When I have seen dropped messages, I normally double check to ensure the
> node is not CPU bound.
>
> If you have a high CPU idle value, then it is likely that tuning the
> thread counts will help.
>
> I normally start with concurrent_reads and concurrent_writes, so in your
> case, as reads are being dropped, increase concurrent_reads. I normally
> change it to 96 to start with, but it will depend on the size of your nodes.
>
> Otherwise it might be badly designed queries; have you investigated which
> queries are producing the client timeouts?
>
> Regards
>
> Paul Chandler
>
>
>
> > On 9 Apr 2019, at 18:58, Abdul Patel  wrote:
> >
> > Hi,
> >
> > My nodetool tpstats is showing high All time blocked numbers and also
> > read dropped messages at 400.
> > Clients are experiencing high timeouts.
> > I checked a few online forums; they recommend increasing
> > native_transport_max_threads.
> > As of now it is commented out with a value of 128.
> > Is it advisable to increase this, and can this fix the timeout issue?
> >
>
>


Re: All time blocked in nodetool tpstats

2019-04-10 Thread Paul Chandler
Hi Abdul,

When I have seen dropped messages, I normally double check to ensure the node 
is not CPU bound. 

If you have a high CPU idle value, then it is likely that tuning the thread 
counts will help.
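
For example (the "id" column in vmstat output is the idle percentage; a
consistently high value there means spare CPU):

    # 3 samples at 5 second intervals
    vmstat 5 3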

I normally start with concurrent_reads and concurrent_writes, so in your case, 
as reads are being dropped, increase concurrent_reads. I normally change it 
to 96 to start with, but it will depend on the size of your nodes.

Otherwise it might be badly designed queries; have you investigated which 
queries are producing the client timeouts?

Regards 

Paul Chandler 



> On 9 Apr 2019, at 18:58, Abdul Patel  wrote:
> 
> Hi,
> 
> My nodetool tpstats is showing high All time blocked numbers and also read 
> dropped messages at 400.
> Clients are experiencing high timeouts.
> I checked a few online forums; they recommend increasing 
> native_transport_max_threads.
> As of now it is commented out with a value of 128.
> Is it advisable to increase this, and can this fix the timeout issue?
> 





Re: All time blocked in nodetool tpstats

2019-04-10 Thread Jean Carlo
In my cluster, I have it at 4096. I think you can start with 1024 and check
that you have no blocked native transport requests.
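
In cassandra.yaml that would look like this (the line ships commented out, so
uncomment it; 1024 is just the suggested starting point):

    native_transport_max_threads: 1024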

I believe this parameter depends on the cluster traffic.

Cheers

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


On Tue, Apr 9, 2019 at 7:59 PM Abdul Patel  wrote:

> Hi,
>
> My nodetool tpstats is showing high All time blocked numbers and also
> read dropped messages at 400.
> Clients are experiencing high timeouts.
> I checked a few online forums; they recommend increasing
> native_transport_max_threads.
> As of now it is commented out with a value of 128.
> Is it advisable to increase this, and can this fix the timeout issue?
>
>