Re: Is it a memory issue?

2016-11-06 Thread Ben Slater
Yes, it does mean you’re getting ahead of Cassandra’s ability to keep up,
although I would have expected a higher number of pending compactions
before you hit serious issues (I’ve seen numbers in the thousands).

I notice from the screenshot you provided that you are using secondary
indexes. There are a lot of ways to misuse secondary indexes (versus not
very many ways to use them well). I think it’s possible that what you are
seeing is the result of the secondary index on event_time (I assume a very
high cardinality column). This is a good blog post on secondary indexes:
http://www.wentnet.com/blog/?p=77
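
Since event_time is already the clustering column in your table, a
time-range read that supplies the full partition key doesn’t need the
index at all. A sketch, with made-up deviceId/date values and epoch-ms
bounds:

SELECT * FROM cargts.eventdata
WHERE deviceId = 1234            -- hypothetical device
  AND date = 20161107            -- hypothetical day bucket
  AND event_time >= 1478476800000 AND event_time < 1478480400000;

The index only comes into play for queries that omit deviceId and date,
and those fan out to every node in the cluster, which is the pattern the
blog post warns about.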

Cheers
Ben



Re: Is it a memory issue?

2016-11-06 Thread wxn...@zjqunshuo.com
Thanks Ben. I stopped inserting and checked the compaction status as you
suggested. It seems there is a lot of compaction work waiting to be done;
please see below. In this case, is it a sign that we are writing faster
than C* can process?

One node,
[root@iZbp11zpafrqfsiys90kzoZ bin]# ./nodetool compactionstats
pending tasks: 195
   id                                     compaction type   keyspace   table                                  completed     total         unit    progress
   5da60b10-a4a9-11e6-88e9-755b5673a02a   Compaction        cargts     eventdata.eventdata_event_time_idx     1699866872    26536427792   bytes   6.41%
                                          Compaction        system     hints                                  10354379      5172210360    bytes   0.20%
Active compaction remaining time :   0h29m48s

Another node,
[root@iZbp1iqnrpsdhoodwii32bZ bin]# ./nodetool compactionstats
pending tasks: 84
   id                                     compaction type   keyspace   table                                  completed     total         unit    progress
   28a9d010-a4a7-11e6-b985-979fea8d6099   Compaction        cargts     eventdata                              656141400     1424412420    bytes   46.06%
   7c034840-a48e-11e6-b985-979fea8d6099   Compaction        cargts     eventdata.eventdata_event_time_idx     32098562606   42616107664   bytes   75.32%
Active compaction remaining time :   0h11m12s
 



Re: Is it a memory issue?

2016-11-06 Thread Ben Slater
This sounds to me like your writes are getting ahead of the compactions
trying to keep up, which can eventually cause issues. Keep an eye on
nodetool compactionstats: if the number of pending compactions continually
climbs, then you are writing faster than Cassandra can actually process.
If this is happening, then you need to either add more processing capacity
(nodes) to your cluster or throttle writes on the client side.
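
A quick way to watch it, e.g. (assuming nodetool is on your PATH; the
grep string matches the header line of the compactionstats output):

# re-check the pending compaction count every 30 seconds
watch -n 30 "nodetool compactionstats | grep 'pending tasks'"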

It could also be related to conditions like an individual partition growing
too big, but I’d check for backed-up compactions first.
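
If you want to check for that, nodetool cfstats reports the largest
compacted partition per table, e.g.:

nodetool cfstats cargts.eventdata
# look for "Compacted partition maximum bytes" in the output
# (cfstats is the 2.x name; newer versions call it tablestats)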

Cheers
Ben



Is it a memory issue?

2016-11-06 Thread wxn...@zjqunshuo.com
Hi All,
We have an issue in our C* testing. At first inserting was very fast and the
TPS was about 30K/s, but when the number of rows reached 2 billion, the
insertion rate dropped badly and the TPS fell to 20K/s. When the number of
rows reached 2.3 billion, the TPS dropped to 0.5K/s and write timeouts
appeared. Finally, OOM errors occurred on some nodes and the C* daemon on
those nodes crashed. In production we have about 8 billion rows. My testing
cluster settings are below. My question is whether memory is the main issue.
Do I need to increase the memory, and what are the right settings for
MAX_HEAP_SIZE and HEAP_NEWSIZE?

My cluster settings:
C* cluster with 3 nodes in Aliyun Cloud
CPU: 4 cores
Memory: 8G
Disk: 500G
MAX_HEAP_SIZE=2G
HEAP_NEWSIZE=500M
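
These two are set in conf/cassandra-env.sh (stock file layout assumed):

# conf/cassandra-env.sh
MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="500M"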

My table schema:
CREATE KEYSPACE IF NOT EXISTS cargts
WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 2};
USE cargts;
CREATE TABLE eventdata (
    deviceId int,
    date int,
    event_time bigint,
    lat decimal,
    lon decimal,
    speed int,
    heading int,
    PRIMARY KEY ((deviceId, date), event_time)
) WITH CLUSTERING ORDER BY (event_time ASC);
CREATE INDEX ON eventdata (event_time);

Best Regards,
-Simon Wu