Pluggable throttling of read and write queries

2017-02-17 Thread Abhishek Verma
Cassandra is being used on a large scale at Uber. We usually create
dedicated clusters for each of our internal use cases; however, that is
difficult to scale and manage.

We are investigating the approach of using a single shared cluster with
hundreds of nodes handling tens to hundreds of different use cases for
different products. We can define different keyspaces for each of them, but
that does not help in the case of noisy neighbors.

Does anybody in the community have similar large shared clusters and/or
face noisy neighbor issues?

Is there a way to throttle read and write queries in Cassandra currently?
If not, what would be the right place in the code to implement a pluggable
interface for doing so? I have briefly considered using triggers, but those
are invoked only in the write path. The initial goal is to have a custom
pluggable class which would be a no-op.

We would like to enforce these rate limits per table and for different
query types (point or range queries, or LWT) separately.
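
To make the idea concrete, here is a rough sketch of the shape such an
interface could take (purely illustrative - QueryThrottler and friends are
hypothetical names, not an existing Cassandra API):

    public interface QueryThrottler {
        enum QueryType { POINT_READ, RANGE_READ, WRITE, LWT }

        // Called before a query executes; implementations may block or reject.
        void acquire(String keyspace, String table, QueryType type);
    }

    // Default implementation: a no-op, so behaviour is unchanged out of the box.
    class NoOpThrottler implements QueryThrottler {
        public void acquire(String keyspace, String table, QueryType type) {}
    }

    // Example rate-limiting implementation using Guava, which Cassandra
    // already ships; 1000 permits/sec is an arbitrary example value.
    class RateLimitingThrottler implements QueryThrottler {
        private final com.google.common.util.concurrent.RateLimiter limiter =
                com.google.common.util.concurrent.RateLimiter.create(1000.0);
        public void acquire(String keyspace, String table, QueryType type) {
            limiter.acquire(); // blocks until a permit is available
        }
    }

A real implementation would presumably keep one RateLimiter per (table,
query type) pair to get the per-table, per-query-type limits described above.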

Thank you in advance.

-Abhishek.


Re: Count(*) is not working

2017-02-17 Thread kurt greaves
Really? Well, that's good to know. It still almost never works though; I
guess every time I've seen it, it must have timed out due to tombstones.

On 17 Feb. 2017 22:06, "Sylvain Lebresne"  wrote:

On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves  wrote:

> if you want a reliable count, you should use spark. performing a count (*)
> will inevitably fail unless you make your server read timeouts and
> tombstone fail thresholds ridiculous
>

That's just not true. count(*) is paged internally, so while it is not
particularly fast, it shouldn't require bumping either the read timeout or
the tombstone fail threshold in any way to work.

In that case, it seems the partition does have many tombstones (more than
live rows) and so the tombstone threshold is doing its job of warning about
it.


>
> On 17 Feb. 2017 04:34, "Jan"  wrote:
>
>> Hi,
>>
>> could you post the output of nodetool cfstats for the table?
>>
>> Cheers,
>>
>> Jan
>>
>> On 16.02.2017 at 17:00, Selvam Raman wrote:
>>
>> I am not getting the count as a result. Instead, I keep getting output
>> like the lines below.
>>
>> Read 100 live rows and 1423 tombstone cells for query SELECT * FROM
>> keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-052)
>> LIMIT 100 (see tombstone_warn_threshold)
>>
>> On Thu, Feb 16, 2017 at 12:37 PM, Jan Kesten  wrote:
>>
>>> Hi,
>>>
>>> did you finally get a result?
>>>
>>> Those messages are simply warnings telling you that C* had to read many
>>> tombstones while processing your query - rows that are deleted but not
>>> yet garbage collected/compacted. This explains why things might be much
>>> slower than expected: for every 100 live rows counted, C* had to read
>>> roughly 14 times as many rows that were already deleted (1,423
>>> tombstones per 100 live rows in your output).
>>>
>>> Apart from that, count(*) is almost always slow - and there is a default
>>> limit of 10,000 rows in a result.
>>>
>>> Do you really need the actual live count? To get an idea you can always
>>> look at nodetool cfstats (but those numbers also contain deleted rows).
>>>
>>>
>>> On 16.02.2017 at 13:18, Selvam Raman wrote:
>>>
>>> Hi,
>>>
>>> I want to know the total record count of the table.
>>>
>>> I ran the query below:
>>>select count(*) from tablename;
>>>
>>> and I got the following output:
>>>
>>> Read 100 live rows and 1423 tombstone cells for query SELECT * FROM
>>> keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-052)
>>> LIMIT 100 (see tombstone_warn_threshold)
>>>
>>> Read 100 live rows and 1435 tombstone cells for query SELECT * FROM
>>> keysace.table WHERE token(id) > token(test:2565-AMK-2) LIMIT 100 (see
>>> tombstone_warn_threshold)
>>>
>>> Read 96 live rows and 1385 tombstone cells for query SELECT * FROM
>>> keysace.table WHERE token(id) > token(test:-2220-UV033/04) LIMIT 100 (see
>>> tombstone_warn_threshold).
>>>
>>>
>>>
>>>
>>> Can you please help me get the total count of the table?
>>>
>>> --
>>> Selvam Raman
>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>
>>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>>
>>


Re: lots of connection timeouts around same time every day

2017-02-17 Thread kurt greaves
Typically when I've seen that gossip issue, it requires more than just
restarting the affected node to fix. If you're not getting query-related
errors in the server log, you should start looking at what is being
queried. Are the queries that time out each day the same?


Re: High disk io read load

2017-02-17 Thread kurt greaves
What's the Owns % for the relevant keyspace from nodetool status?


Re: lots of connection timeouts around same time every day

2017-02-17 Thread Mike Torra
I can't say that I have tried that while the issue is going on, but I have
done such rolling restarts for sure, and the timeouts still occur every
day. What would a rolling restart do to fix the issue?

In fact, as I write this, I am restarting each node one by one in the
eu-west-1 datacenter, and in us-east-1 I am seeing lots of timeouts - both
the metrics 'Connection.TotalTimeouts.m1_rate' and
'ClientRequest.Latency.Read.p999' flatlining at ~6s. Why would restarting
in one datacenter impact reads in another?

Any suggestions on what to investigate next, or what changes to try in the
cluster? Happy to provide any more info as well :)

On Fri, Feb 17, 2017 at 6:05 AM, kurt greaves  wrote:

> Have you tried a rolling restart of the entire DC?
>


Re: Count(*) is not working

2017-02-17 Thread Sagar Jambhulkar
+1 for using Spark for counts.

On Feb 17, 2017 4:25 PM, "kurt greaves"  wrote:

> If you want a reliable count, you should use Spark. Performing a count(*)
> will inevitably fail unless you make your server read timeouts and
> tombstone fail thresholds ridiculous.


Re: Count(*) is not working

2017-02-17 Thread siddharth verma
Hi,
We faced this issue too.
You could try a reduced page size so that the tombstone threshold isn't
breached.

try using "paging 500" in cqlsh
[ https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshPaging.html ]

Similarly, the page size can be set in the Java driver as well.
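
A sketch with the 3.x Java driver (keyspace/table names are illustrative,
and "session" is assumed to be an open com.datastax.driver.core.Session):

    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    Statement stmt = new SimpleStatement("SELECT * FROM ks.tbl");
    stmt.setFetchSize(500); // same effect as cqlsh's PAGING 500
    session.execute(stmt);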

This is a workaround, though; the warning is also a hint to review your
data model.

Regards


On Fri, Feb 17, 2017 at 4:36 PM, Sylvain Lebresne 
wrote:

> On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves 
> wrote:
>
>> If you want a reliable count, you should use Spark. Performing a
>> count(*) will inevitably fail unless you make your server read timeouts
>> and tombstone fail thresholds ridiculous.
>>
>
> That's just not true. count(*) is paged internally, so while it is not
> particularly fast, it shouldn't require bumping either the read timeout
> or the tombstone fail threshold in any way to work.
>
> In that case, it seems the partition does have many tombstones (more than
> live rows) and so the tombstone threshold is doing its job of warning about
> it.


-- 
Siddharth Verma
(Visit https://github.com/siddv29/cfs for a high-speed Cassandra full-table
scan)


Re: Count(*) is not working

2017-02-17 Thread Sylvain Lebresne
On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves  wrote:

> If you want a reliable count, you should use Spark. Performing a count(*)
> will inevitably fail unless you make your server read timeouts and
> tombstone fail thresholds ridiculous.
>

That's just not true. count(*) is paged internally, so while it is not
particularly fast, it shouldn't require bumping either the read timeout or
the tombstone fail threshold in any way to work.

In that case, it seems the partition does have many tombstones (more than
live rows) and so the tombstone threshold is doing its job of warning about
it.
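
To illustrate what "paged internally" means in practice, here is roughly
the client-side equivalent with the 3.x Java driver (just a sketch;
"session" is assumed to be an open Session and names are illustrative):

    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    Statement stmt = new SimpleStatement("SELECT id FROM ks.tbl");
    stmt.setFetchSize(1000); // rows per page
    long count = 0;
    for (Row ignored : session.execute(stmt)) { // pages fetched transparently
        count++;
    }

Each page is a separate request that only has to complete within one read
timeout, which is why the internal paging doesn't need a bumped timeout
either.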




Re: lots of connection timeouts around same time every day

2017-02-17 Thread kurt greaves
Have you tried a rolling restart of the entire DC?


Re: sasi index question (read timeout on many selects)

2017-02-17 Thread Benjamin Roth
Btw:

They break incremental repair if you use CDC:
https://issues.apache.org/jira/browse/CASSANDRA-12888


Not only when using CDC! You shouldn't use incremental repairs with MVs.
Never (right now).
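
In practice that just means passing the full-repair flag to nodetool, e.g.
(the keyspace name is a placeholder):

    nodetool repair --full my_keyspace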

2017-02-16 17:42 GMT+01:00 Jonathan Haddad :

> My advice to avoid them is based on the issues that have been filed in
> Jira.  Benjamin Roth is one of the only people talking about his MV usage,
> and has filed a few JIRAs discussing their problems when bootstrapping new
> nodes, as well as issues repairing.
>
> https://issues.apache.org/jira/browse/CASSANDRA-12730?jql=project%20%3D%20CASSANDRA%20and%20reporter%20%3D%20brstgt%20and%20text%20~%20%22materialized%22
>
> They also can't be altered:
> https://issues.apache.org/jira/browse/CASSANDRA-9736
>
> They may be less performant than managing the data yourself:
> https://issues.apache.org/jira/browse/CASSANDRA-10295,
> https://issues.apache.org/jira/browse/CASSANDRA-10307
>
> They're not as flexible as your own tables:
> https://issues.apache.org/jira/browse/CASSANDRA-9928,
> https://issues.apache.org/jira/browse/CASSANDRA-11194,
> https://issues.apache.org/jira/browse/CASSANDRA-12463
>
> They break incremental repair if you use CDC:
> https://issues.apache.org/jira/browse/CASSANDRA-12888
>
> I don't know why DataStax advises using them.  Perhaps ask them?
>
> Jon
>
> On Thu, Feb 16, 2017 at 7:57 AM Micha  wrote:
>
>>
>>
>> On 16.02.2017 16:33, Jonathan Haddad wrote:
>> >
>> > Regarding MVs, do not use the ones that shipped with 3.x.  They're not
>> > ready for production.  Manage it yourself by using a second table and
>> > inserting a second record there.
>> >
>>
>> Out of interest... there is a slight discrepancy between the advice not
>> to use MVs and the documentation of the feature on the DataStax site. Or
>> do I have to use another Cassandra version (instead of 3.9)?
>>
>>
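
For anyone wondering what "manage it yourself by using a second table"
from the quote above looks like: a minimal sketch with the 3.x Java driver
(table and column names are hypothetical; "session" is an open Session):

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.SimpleStatement;
    import java.util.UUID;

    UUID id = UUID.randomUUID();
    String email = "user@example.com";

    BatchStatement batch = new BatchStatement(); // LOGGED by default
    batch.add(new SimpleStatement(
            "INSERT INTO ks.users (id, email) VALUES (?, ?)", id, email));
    batch.add(new SimpleStatement(
            "INSERT INTO ks.users_by_email (email, id) VALUES (?, ?)", email, id));
    session.execute(batch);

A logged batch spanning two partitions costs some write latency, but it
keeps both copies in sync - roughly the bookkeeping an MV would otherwise
do for you server-side.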


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: High disk io read load

2017-02-17 Thread Benjamin Roth
Hi Nate,

See here dstat results:
https://gist.github.com/brstgt/216c662b525a9c5b653bbcd8da5b3fcb
Network volume does not correspond to Disk IO, not even close.

@heterogeneous vnode count:
I did this to test how load behaves on a new server class we ordered for
CS. The new nodes had much faster CPUs than our older nodes. If not by
assigning more tokens to the new nodes, what else would you recommend to
give more weight and load to newer, usually faster, servers?
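
For context, the weighting itself is just num_tokens in cassandra.yaml on
the new node class, e.g.:

    # cassandra.yaml on the faster servers
    num_tokens: 512

versus num_tokens: 256 on the older ones.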

2017-02-16 23:21 GMT+01:00 Nate McCall :

>
> - Node A has 512 tokens and Node B 256. So it has double the load (data).
>> - Node A also has 2 SSDs, Node B only 1 SSD (according to load)
>>
>
> I very rarely see heterogeneous vnode counts in the same cluster. I would
> almost guarantee you are the only one doing this with MVs as well.
>
> That said, since you have different IO hardware, are you sure the system
> configurations (e.g. block size, read ahead, etc.) are the same on both
> machines? Is dstat showing network traffic in and disk IO of a similar
> order of magnitude to what you would expect?
>
>
> --
> -
> Nate McCall
> Wellington, NZ
> @zznate
>
> CTO
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Count(*) is not working

2017-02-17 Thread kurt greaves
If you want a reliable count, you should use Spark. Performing a count(*)
will inevitably fail unless you make your server read timeouts and
tombstone fail thresholds ridiculous.
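
For reference, a count via the Spark Cassandra connector might look like
this (a sketch using the connector's Java API; "sc" is assumed to be a
live JavaSparkContext and the keyspace/table names are illustrative):

    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

    // Distributed full-table count; each executor scans its own token
    // ranges, so no single query has to fit within one read timeout.
    long total = javaFunctions(sc).cassandraTable("ks", "tbl").cassandraCount();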

On 17 Feb. 2017 04:34, "Jan"  wrote:

> Hi,
>
> could you post the output of nodetool cfstats for the table?
>
> Cheers,
>
> Jan