[
https://issues.apache.org/jira/browse/CASSANDRA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951313#comment-14951313
]
Dodong Juan edited comment on CASSANDRA-10502 at 10/9/15 10:50 PM:
-------------------------------------------------------------------
The following traces of the were done on cqlsh .
So the trace is varying a lot. And does not seem to correlate with the data
return from the client ? Maybe datastax java driver related. ..? (not
likely).. Just checkout the results.
Below is the one that I took when from the client (java application)
perspective it was returning data in about 1100 ms.
{noformat}
tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
15:31:28.700000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |
101
Preparing statement [SharedPool-Worker-1] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 334
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 692
Acquiring sstable references [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 713
Merging memtable tombstones [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 726
Key cache hit for sstable 209 [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3143
Seeking to
partition beginning in data file [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3169
Key cache hit for sstable 208 [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3691
Seeking to
partition beginning in data file [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3713
Skipped 0/2 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3807
Merging
data from memtables and 2 sstables [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3818
Read 462 live and 0 tombstone cells [SharedPool-Worker-3] | 2015-10-09
15:31:29.611000 | 172.31.17.129 | 910723
Request complete | 2015-10-09
15:31:29.649251 | 172.31.17.129 | 949251
{noformat}
Below when this is around 1400 ms . But the trace data seems to look faster ..?
{noformat}
tracing session: 7c591550-6ebf-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
15:54:00.869000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |
122
Preparing statement [SharedPool-Worker-133] | 2015-10-09
15:54:00.869000 | 172.31.17.129 | 265
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 673
Acquiring sstable references [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 695
Merging memtable tombstones [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 709
Key cache hit for sstable 242 [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3134
Seeking
to partition beginning in data file [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3155
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3259
Merging
data from memtables and 1 sstables [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3270
Read 462 live and 0 tombstone cells [SharedPool-Worker-9] | 2015-10-09
15:54:02.070000 | 172.31.17.129 | 201640
Request complete | 2015-10-09
15:54:02.111294 | 172.31.17.129 | 242294
{noformat}
This is when it was 1600 ms.
{noformat}
Tracing session: 9c0ea900-6ec4-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |
99
Preparing statement [SharedPool-Worker-149] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 215
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 507
Acquiring sstable references [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 533
Merging memtable tombstones [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 551
Key cache hit for sstable 302 [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 3875
Seeking to
partition beginning in data file [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 3902
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 4050
Merging
data from memtables and 1 sstables [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 4068
Read 468 live and 0 tombstone cells [SharedPool-Worker-164] | 2015-10-09
16:30:43.269000 | 172.31.17.129 | 717277
Request complete | 2015-10-09
16:30:43.307559 | 172.31.17.129 | 755559
{noformat}
The last one is now at 2300 ms.
{noformat}
racing session: 7d89a1e0-6ec6-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
16:44:09.342000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |
87
Preparing statement [SharedPool-Worker-1] | 2015-10-09
16:44:09.342000 | 172.31.17.129 | 277
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 456
Acquiring sstable references [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 473
Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 485
Key cache hit for sstable 328 [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 2851
Seeking to
partition beginning in data file [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 2869
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 3005
Merging
data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-09
16:44:09.345001 | 172.31.17.129 | 3017
Read 468 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-09
16:44:11.329000 | 172.31.17.129 | 987011
Request complete | 2015-10-09
16:44:11.388637 | 172.31.17.129 | 46637
{noformat}
was (Author: dodongjuan):
The following traces of the were done on cqlsh .
So the trace is varying a lot. And does not seem to correlate with the data
return from the client ? Maybe datastax java driver related. ..? (not
likely).. Just checkout the results.
Below is the one that I took when from the client (java application)
perspective it was returning data in about 1100 ms.
{noformat}
tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
15:31:28.700000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |
101
Preparing statement [SharedPool-Worker-1] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 334
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 692
Acquiring sstable references [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 713
Merging memtable tombstones [SharedPool-Worker-3] | 2015-10-09
15:31:28.701000 | 172.31.17.129 | 726
Key cache hit for sstable 209 [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3143
Seeking to
partition beginning in data file [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3169
Key cache hit for sstable 208 [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3691
Seeking to
partition beginning in data file [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3713
Skipped 0/2 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3807
Merging
data from memtables and 2 sstables [SharedPool-Worker-3] | 2015-10-09
15:31:28.704000 | 172.31.17.129 | 3818
Read 462 live and 0 tombstone cells [SharedPool-Worker-3] | 2015-10-09
15:31:29.611000 | 172.31.17.129 | 910723
Request complete | 2015-10-09
15:31:29.649251 | 172.31.17.129 | 949251
{noformat}
Below when this is around 1400 ms . But the trace data seems to look faster ..?
racing session: 7c591550-6ebf-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
15:54:00.869000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |
122
Preparing statement [SharedPool-Worker-133] | 2015-10-09
15:54:00.869000 | 172.31.17.129 | 265
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 673
Acquiring sstable references [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 695
Merging memtable tombstones [SharedPool-Worker-9] | 2015-10-09
15:54:00.870000 | 172.31.17.129 | 709
Key cache hit for sstable 242 [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3134
Seeking
to partition beginning in data file [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3155
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3259
Merging
data from memtables and 1 sstables [SharedPool-Worker-9] | 2015-10-09
15:54:00.872000 | 172.31.17.129 | 3270
Read 462 live and 0 tombstone cells [SharedPool-Worker-9] | 2015-10-09
15:54:02.070000 | 172.31.17.129 | 201640
Request complete | 2015-10-09
15:54:02.111294 | 172.31.17.129 | 242294
This is when it was 1600 ms.
Tracing session: 9c0ea900-6ec4-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |
99
Preparing statement [SharedPool-Worker-149] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 215
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 507
Acquiring sstable references [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 533
Merging memtable tombstones [SharedPool-Worker-164] | 2015-10-09
16:30:41.552000 | 172.31.17.129 | 551
Key cache hit for sstable 302 [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 3875
Seeking to
partition beginning in data file [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 3902
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 4050
Merging
data from memtables and 1 sstables [SharedPool-Worker-164] | 2015-10-09
16:30:41.556000 | 172.31.17.129 | 4068
Read 468 live and 0 tombstone cells [SharedPool-Worker-164] | 2015-10-09
16:30:43.269000 | 172.31.17.129 | 717277
Request complete | 2015-10-09
16:30:43.307559 | 172.31.17.129 | 755559
The last one is now at 2300 ms.
racing session: 7d89a1e0-6ec6-11e5-9493-9131aba66d63
activity
| timestamp
| source | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
Execute CQL3 query | 2015-10-09
16:44:09.342000 | 172.31.17.129 | 0
Parsing select * from processinfometric_profile where profilecontext='GENERIC'
and id=‘1' and month='Oct' and day='' and hour='' and minute='';
[SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |
87
Preparing statement [SharedPool-Worker-1] | 2015-10-09
16:44:09.342000 | 172.31.17.129 | 277
Executing single-partition
query on processinfometric_profile [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 456
Acquiring sstable references [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 473
Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-09
16:44:09.343000 | 172.31.17.129 | 485
Key cache hit for sstable 328 [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 2851
Seeking to
partition beginning in data file [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 2869
Skipped 0/1 non-slice-intersecting
sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-09
16:44:09.345000 | 172.31.17.129 | 3005
Merging
data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-09
16:44:09.345001 | 172.31.17.129 | 3017
Read 468 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-09
16:44:11.329000 | 172.31.17.129 | 987011
Request complete | 2015-10-09
16:44:11.388637 | 172.31.17.129 | 46637
> Cassandra query degradation with high frequency updated tables
> --------------------------------------------------------------
>
> Key: CASSANDRA-10502
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10502
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Dodong Juan
>
> Hi,
> So we are developing a system that computes profile of things that it
> observes. The observation comes in form of events. Each thing that it
> observe has an id and each thing has a set of subthings in it which has
> measurement of some kind. Roughly there are about 500 subthings within each
> thing. We receive events containing measurements of these 500 subthings every
> 10 seconds or so.
> So as we receive events, we read the old profile value, calculate the new
> profile based on the new value and save it back.
> One of the things we observe are the processes running on the server.
> We use the following schema to hold the profile.
> {noformat}
> CREATE TABLE processinfometric_profile (
> profilecontext text,
> id text,
> month text,
> day text,
> hour text,
> minute text,
> command text,
> cpu map<text, double>,
> majorfaults map<text, double>,
> minorfaults map<text, double>,
> nice map<text, double>,
> pagefaults map<text, double>,
> pid map<text, double>,
> ppid map<text, double>,
> priority map<text, double>,
> resident map<text, double>,
> rss map<text, double>,
> sharesize map<text, double>,
> size map<text, double>,
> starttime map<text, double>,
> state map<text, double>,
> threads map<text, double>,
> user map<text, double>,
> vsize map<text, double>,
> PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
> ) WITH CLUSTERING ORDER BY (command ASC)
> AND bloom_filter_fp_chance = 0.1
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {noformat}
> This profile will then be use for certain analytics that can use in the
> context of the ‘thing’ or in the context of specific thing and subthing.
> A profile can be defined as monthly, daily, hourly. So in case of monthly the
> month will be set to the current month (i.e. ‘Oct’) and the day and hour will
> be set to empty ‘’ string.
> The problem that we have observed is that over time (actually in just a
> matter of hours) we will see a huge degradation of query response for the
> monthly profile. At the start it will be respinding in 10-100 ms and after a
> couple of hours it will go to 2000-3000 ms . If you leave it for a couple of
> days you will start experiencing readtimeouts . The query is basically just :
> {noformat}
> select * from myprofile where id=‘1’ and month=‘Oct’ and day=‘’ and hour=‘'
> and minute=''
> {noformat}
> This will have only about 500 rows or so.
> We were using Cassandra 2.2.1 , but upgraded to 2.2.2 to see if it fixed the
> issue to no avail. And since this is a test, we are running on a single node.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)