[ https://issues.apache.org/jira/browse/CASSANDRA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Hobbs updated CASSANDRA-10502:
------------------------------------
    Description: 
Hi,

We are developing a system that computes a profile of the things it observes.
Observations arrive as events. Each thing it observes has an id and a set of
subthings, each of which carries a measurement of some kind. There are roughly
500 subthings within each thing. We receive events containing measurements of
these 500 subthings every 10 seconds or so.

As we receive events, we read the old profile value, calculate the new
profile based on the new value, and save it back.
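
The read-modify-write cycle described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the in-memory
`store` dict stands in for the Cassandra table, the function names are
hypothetical, and the statistics kept in each map (`count`, `sum`, `min`,
`max`) are a guess, since the report does not say what a profile contains.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the table: (month, command) -> one map column.
Store = Dict[Tuple[str, str], Dict[str, float]]


def update_profile(old: Dict[str, float], value: float) -> Dict[str, float]:
    """Fold one new measurement into a profile map (assumed statistics)."""
    if not old:
        return {"count": 1.0, "sum": value, "min": value, "max": value}
    return {
        "count": old["count"] + 1.0,
        "sum": old["sum"] + value,
        "min": min(old["min"], value),
        "max": max(old["max"], value),
    }


def on_event(store: Store, key: Tuple[str, str], value: float) -> None:
    """Read the old profile, compute the new one, and save it back."""
    old = store.get(key, {})                 # read
    store[key] = update_profile(old, value)  # modify, then write


store: Store = {}
for cpu in (1.5, 0.5, 4.0):
    on_event(store, ("Oct", "java"), cpu)
```

If every event touches every subthing, one event every 10 seconds means each
of the roughly 500 rows is rewritten about 360 times per hour.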

One of the things we observe is the set of processes running on the server.

We use the following schema to hold the profile. 

{noformat}
CREATE TABLE processinfometric_profile (
    profilecontext text,
    id text,
    month text,
    day text,
    hour text,
    minute text,
    command text,
    cpu map<text, double>,
    majorfaults map<text, double>,
    minorfaults map<text, double>,
    nice map<text, double>,
    pagefaults map<text, double>,
    pid map<text, double>,
    ppid map<text, double>,
    priority map<text, double>,
    resident map<text, double>,
    rss map<text, double>,
    sharesize map<text, double>,
    size map<text, double>,
    starttime map<text, double>,
    state map<text, double>,
    threads map<text, double>,
    user map<text, double>,
    vsize map<text, double>,
    PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
) WITH CLUSTERING ORDER BY (command ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{noformat}

This profile is then used for certain analytics, either in the context of the
‘thing’ as a whole or in the context of a specific thing and subthing.

A profile can be defined as monthly, daily, or hourly. For a monthly profile,
the month is set to the current month (e.g. ‘Oct’) and the day and hour are
set to the empty string ‘’.
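
As a rough sketch of this convention, the time-related key fields for each
granularity could be built like this. The function name `profile_key` and the
formatting of day, hour, and minute as plain decimal strings are assumptions
for illustration; the report only gives ‘Oct’ and the empty string as examples.

```python
from datetime import datetime
from typing import Tuple

# Month abbreviations spelled out to avoid locale-dependent formatting.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def profile_key(granularity: str, ts: datetime) -> Tuple[str, str, str, str]:
    """Return (month, day, hour, minute) for the given profile granularity.

    Fields finer than the granularity are left as the empty string.
    """
    fields = [MONTHS[ts.month - 1], str(ts.day), str(ts.hour), str(ts.minute)]
    depth = {"monthly": 1, "daily": 2, "hourly": 3}[granularity]
    month, day, hour, minute = fields[:depth] + [""] * (4 - depth)
    return month, day, hour, minute


ts = datetime(2015, 10, 12, 9, 30)
monthly = profile_key("monthly", ts)
hourly = profile_key("hourly", ts)
```

Under this convention, every update to a given monthly profile during October
carries the same month, day, hour, and minute values, so those updates all
target the same partition for the whole month.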


The problem we have observed is that over time (actually in just a matter of
hours) query response for the monthly profile degrades badly. At the start it
responds in 10-100 ms; after a couple of hours it takes 2000-3000 ms. If you
leave it for a couple of days you start getting read timeouts. The query is
basically just:

{noformat}
select * from myprofile where id='1' and month='Oct' and day='' and hour='' and minute=''
{noformat}

This returns only about 500 rows.

We were using Cassandra 2.2.1 but upgraded to 2.2.2 to see if it fixed the
issue, to no avail. Since this is a test, we are running on a single node.


  was:
Hi,

We are developing a system that computes a profile of the things it observes.
Observations arrive as events. Each thing it observes has an id and a set of
subthings, each of which carries a measurement of some kind. There are roughly
500 subthings within each thing. We receive events containing measurements of
these 500 subthings every 10 seconds or so.

As we receive events, we read the old profile value, calculate the new
profile based on the new value, and save it back.

One of the things we observe is the set of processes running on the server.

We use the following schema to hold the profile. 


CREATE TABLE processinfometric_profile (
    profilecontext text,
    id text,
    month text,
    day text,
    hour text,
    minute text,
    command text,
    cpu map<text, double>,
    majorfaults map<text, double>,
    minorfaults map<text, double>,
    nice map<text, double>,
    pagefaults map<text, double>,
    pid map<text, double>,
    ppid map<text, double>,
    priority map<text, double>,
    resident map<text, double>,
    rss map<text, double>,
    sharesize map<text, double>,
    size map<text, double>,
    starttime map<text, double>,
    state map<text, double>,
    threads map<text, double>,
    user map<text, double>,
    vsize map<text, double>,
    PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
) WITH CLUSTERING ORDER BY (command ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';


This profile is then used for certain analytics, either in the context of the
‘thing’ as a whole or in the context of a specific thing and subthing.

A profile can be defined as monthly, daily, or hourly. For a monthly profile,
the month is set to the current month (e.g. ‘Oct’) and the day and hour are
set to the empty string ‘’.


The problem we have observed is that over time (actually in just a matter of
hours) query response for the monthly profile degrades badly. At the start it
responds in 10-100 ms; after a couple of hours it takes 2000-3000 ms. If you
leave it for a couple of days you start getting read timeouts. The query is
basically just:

select * from myprofile where id='1' and month='Oct' and day='' and hour='' and minute=''

This returns only about 500 rows.

We were using Cassandra 2.2.1 but upgraded to 2.2.2 to see if it fixed the
issue, to no avail. Since this is a test, we are running on a single node.



> Cassandra query degradation with high frequency updated tables
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-10502
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10502
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Dodong Juan
>
> Hi,
> We are developing a system that computes a profile of the things it
> observes. Observations arrive as events. Each thing it observes has an id
> and a set of subthings, each of which carries a measurement of some kind.
> There are roughly 500 subthings within each thing. We receive events
> containing measurements of these 500 subthings every 10 seconds or so.
> As we receive events, we read the old profile value, calculate the new
> profile based on the new value, and save it back.
> One of the things we observe is the set of processes running on the server.
> We use the following schema to hold the profile. 
> {noformat}
> CREATE TABLE processinfometric_profile (
>     profilecontext text,
>     id text,
>     month text,
>     day text,
>     hour text,
>     minute text,
>     command text,
>     cpu map<text, double>,
>     majorfaults map<text, double>,
>     minorfaults map<text, double>,
>     nice map<text, double>,
>     pagefaults map<text, double>,
>     pid map<text, double>,
>     ppid map<text, double>,
>     priority map<text, double>,
>     resident map<text, double>,
>     rss map<text, double>,
>     sharesize map<text, double>,
>     size map<text, double>,
>     starttime map<text, double>,
>     state map<text, double>,
>     threads map<text, double>,
>     user map<text, double>,
>     vsize map<text, double>,
>     PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command)
> ) WITH CLUSTERING ORDER BY (command ASC)
>     AND bloom_filter_fp_chance = 0.1
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> {noformat}
> This profile is then used for certain analytics, either in the context of
> the ‘thing’ as a whole or in the context of a specific thing and subthing.
> A profile can be defined as monthly, daily, or hourly. For a monthly profile,
> the month is set to the current month (e.g. ‘Oct’) and the day and hour are
> set to the empty string ‘’.
> The problem we have observed is that over time (actually in just a matter of
> hours) query response for the monthly profile degrades badly. At the start
> it responds in 10-100 ms; after a couple of hours it takes 2000-3000 ms. If
> you leave it for a couple of days you start getting read timeouts. The query
> is basically just:
> {noformat}
> select * from myprofile where id='1' and month='Oct' and day='' and hour='' and minute=''
> {noformat}
> This returns only about 500 rows.
> We were using Cassandra 2.2.1 but upgraded to 2.2.2 to see if it fixed the
> issue, to no avail. Since this is a test, we are running on a single node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
